CPS: Synergy: Doing More With Less: Cost-Effective Infrastructure for Automotive Vision Capabilities
Many safety-critical cyber-physical systems rely on advanced sensing capabilities to react to changing environmental conditions. However, cost-effective deployments of such capabilities have remained elusive. Such deployments will require software infrastructure that enables multiple sensor-processing streams to be multiplexed onto a common hardware platform at reasonable cost, as well as tools and methods for validating that required processing rates can be maintained.
The choice of hardware platform to use in autonomous vehicles is not straightforward. One choice receiving considerable attention today is energy-efficient multicore platforms equipped with graphics processing units (GPUs), which can speed up mathematical computations inherent to signal processing, image processing, motion planning, etc. One of the prominent platforms today is NVIDIA's Jetson TX2. However, managing an embedded GPU requires an accurate model of GPU behavior, which is especially difficult to obtain for closed-source hardware and software like the Jetson TX2. Therefore, our work requires first developing such a model, and then applying the model to develop new management techniques.
We began our effort to design a model of GPU behavior by using black-box experiments to infer scheduling rules for the Jetson TX2. These experiments made use of a new experimental and visualization framework, through which we were able to specify a set of rules governing how the closed-source GPU scheduler internally prioritizes and orders work from multiple tasks. Using these rules, we were able to prove that unbounded response times are possible in an unmanaged shared-GPU system. Furthermore, our experimental framework and preliminary rules allowed us to predict and quantify harmful blocking behavior due to implicit GPU synchronization. Implicit synchronization can be triggered by common functions in NVIDIA's CUDA API, and can cause unrelated GPU workloads to block one another, regardless of the GPU's available computing capacity.
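As an illustrative sketch (not code from the project itself), the following CUDA fragment shows one well-known source of implicit synchronization: cudaFree() blocks until all previously launched GPU work completes. Here, a kernel launched by an unrelated task on a separate stream is delayed behind another task's long-running kernel, even though the two tasks share no data and the GPU may have spare capacity. The kernel and variable names are hypothetical.

```cuda
// Sketch: two independent tasks share a GPU via separate CUDA streams.
// The cudaFree() call between the launches triggers device-wide implicit
// synchronization, so task B's kernel cannot start until task A's finishes.
#include <cuda_runtime.h>

__global__ void busy_kernel(int iters) {
    // Spin to keep the GPU occupied for a measurable interval.
    volatile int x = 0;
    for (int i = 0; i < iters; i++) x += i;
}

int main(void) {
    cudaStream_t stream_a, stream_b;
    cudaStreamCreate(&stream_a);
    cudaStreamCreate(&stream_b);

    int *scratch;
    cudaMalloc(&scratch, 1 << 20);

    // Task A: long-running kernel on its own stream.
    busy_kernel<<<1, 32, 0, stream_a>>>(1 << 24);

    // Implicit synchronization point: cudaFree() does not return until all
    // prior GPU work (including task A's kernel) has completed.
    cudaFree(scratch);

    // Task B is now blocked behind task A despite using a different stream.
    busy_kernel<<<1, 32, 0, stream_b>>>(1 << 10);

    cudaDeviceSynchronize();
    cudaStreamDestroy(stream_a);
    cudaStreamDestroy(stream_b);
    return 0;
}
```

In an unmanaged shared-GPU deployment, such calls can occur anywhere in third-party library code, which is why a behavioral model of the scheduler is needed to predict and bound the resulting blocking.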