Bringing the Multicore Revolution to Safety-Critical Cyber-Physical Systems
Abstract:
Shared hardware resources such as caches and memory introduce timing unpredictability for real-time systems. Worst-case execution time (WCET) analysis in the presence of shared hardware resources is often so pessimistic that the extra processing capacity of multicore systems is negated. We propose techniques to improve performance and schedulability for multicore systems.

Modern non-uniform memory access (NUMA) multicore CPUs partition cores into "nodes", each with a local memory controller, such that a memory access is resolved either locally or, via the on-chip interconnect, by a remote node and its memory. Each controller governs multiple banks to increase memory parallelism. On such platforms, memory access latencies vary significantly depending on which node the data resides on and how banks are shared. Data allocations without locality awareness may therefore experience high memory latencies, and execution times may become highly unpredictable in a multicore real-time system, resulting in loose WCET bounds for tasks and overly conservative scheduling with low utilization. This work contributes a memory controller/node-aware heap allocator that comprehensively considers memory node and bank locality to color the main memory space without requiring hardware modifications. The new allocator dynamically assigns memory transparently to each task such that remote-node accesses are avoided. This reduces memory access conflicts and latency and effectively isolates a task's timing behavior from that of other tasks via software partitioning of controller and bank accesses. Experiments on a multicore platform with the NAS and Parsec benchmarks indicate that controller-aware memory coloring improves performance by reducing memory latency, avoids inter-task conflicts, and increases timing predictability, which makes it suitable for mixed-criticality (MC), weakly hard, and soft real-time systems. Our coloring approach further outperforms the standard buddy allocator as well as prior coloring methods on an x86 platform, and it is the only policy that provides single-core equivalence when just one core per memory controller is used.

When validating real-time constraints on an m-core platform, excessive analysis pessimism can negate the processing capacity of the additional m-1 cores. Two approaches have previously been investigated to address this problem: mixed-criticality allocation techniques and hardware-management techniques. In this work, we integrate these approaches and assess the tradeoffs of criticality-cognizant hardware management. We propose optimized last-level cache (LLC) allocation techniques based on linear programming, combined with hardware management in the form of LLC and DRAM bank isolation, for mixed-criticality systems. We conduct a large-scale overhead-aware schedulability study to provide evidence in favor of combining MC analysis with hardware management, and we conduct run-time experiments to investigate the validity of our provisioning. Experiments with synthetic workloads and the Data Intensive Systems (DIS) benchmark suite show that LLC and bank isolation reduces WCETs by up to 242%, and the overhead-aware schedulability study shows that MC provisioning improves schedulability in 68% of the considered scenarios. Combining both approaches improves schedulability by one to two cores' worth of additional utilization in most cases.
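To illustrate the controller- and bank-aware coloring described above, the following C sketch maps a physical address to a (node, bank) color and checks whether a free page matches a task's assigned color. The bit positions and masks are hypothetical placeholders, not the address mapping of any particular platform or of the evaluated allocator.

    /* Illustrative sketch only: BANK_SHIFT, NODE_SHIFT, and the masks below are
     * assumed placeholders; real platforms require documented or
     * reverse-engineered physical-address-to-DRAM mappings. */
    #include <stdint.h>
    #include <stdbool.h>

    #define BANK_SHIFT 13   /* assumed: bits 13..15 select one of 8 DRAM banks */
    #define BANK_MASK  0x7u
    #define NODE_SHIFT 16   /* assumed: bit 16 selects the memory controller/node */
    #define NODE_MASK  0x1u

    typedef struct {
        unsigned node;      /* memory controller / NUMA node color */
        unsigned bank;      /* DRAM bank color */
    } mem_color_t;

    static inline unsigned phys_bank(uint64_t paddr) {
        return (unsigned)((paddr >> BANK_SHIFT) & BANK_MASK);
    }

    static inline unsigned phys_node(uint64_t paddr) {
        return (unsigned)((paddr >> NODE_SHIFT) & NODE_MASK);
    }

    /* A coloring allocator only hands a task pages whose color matches the
     * task's assignment, so its heap accesses stay on the local controller
     * and within its private banks. */
    static bool page_matches_color(uint64_t page_paddr, mem_color_t c) {
        return phys_node(page_paddr) == c.node && phys_bank(page_paddr) == c.bank;
    }

Because the coloring is enforced purely by which physical pages the allocator returns, no hardware modification is needed; the tradeoff is that each task's color restricts it to a subset of physical memory.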
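The criticality-cognizant LLC provisioning can likewise be viewed as a small optimization problem. The linear program below is only an illustrative sketch under assumed notation (w_i is the LLC share allocated to task group i, W is the total LLC capacity, and u_i(w_i) is a piecewise-linear estimate of that group's utilization as a function of its allocation); it is not the exact formulation used in this work.

    \[
      \min \sum_{i} u_i(w_i)
      \quad \text{subject to} \quad
      \sum_{i} w_i \le W, \qquad w_i \ge 0 \ \text{for all } i.
    \]

Solving such a program yields an LLC partitioning whose impact on schedulability can then be checked by the overhead-aware MC analysis.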