A hierarchical approach to self-test, fault-tolerance and routing security in a Network-on-Chip

Title: A hierarchical approach to self-test, fault-tolerance and routing security in a Network-on-Chip
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Ravikumar, C.P., Swamy, S. Kendaganna, Uma, B.V.
Conference Name: 2019 IEEE International Test Conference India (ITC India)
Keywords: associated physical channels, bus interconnects, chip multiprocessors, communication efficiency, computer network reliability, computer network security, deadlock situation, deadlock-free properties, denial-of-service attacks, external source, fault data, fault tolerant computing, fault-information, fault-tolerance aspects, fault-tolerant routing, flit-switching, hierarchical approach, Internet, local processing element, local router, local self-test manager, malformed packets, malicious denial-of-service attack, malicious external agent, Metrics, microprocessor chips, multiprocessing systems, network bandwidth, network on chip security, network-on-chip, NoC, on-chip networks, packet switching, packet-switching, power virus, resilience, Resiliency, routing agent, routing security, Scalability, security concerns, sorting-based algorithm, telecommunication network routing, test algorithms, two-tier approach, two-tier solution, virtual channel flow control, virtual channels
Abstract: Since the performance of bus interconnects does not scale with the number of processors connected to the bus, chip multiprocessors make use of on-chip networks that implement packet switching and virtual channel flow control to efficiently transport data. In this paper, we consider the test and fault-tolerance aspects of such a network-on-chip (NoC). Past work in this area has addressed the communication efficiency and deadlock-free properties of NoCs, but when routing externally received data, security must also be addressed: a malicious external agent can launch a denial-of-service attack or a power virus. We propose a two-tier solution to this problem. At the local level, a self-test manager in each processing element runs test algorithms to detect faults in the local processing element and its associated physical and virtual channels. At the global level, the health of the NoC is tested using a sorting-based algorithm proposed in this paper. Similarly, we propose to handle fault-tolerance and security concerns in routing at two levels. At the local level, each node is capable of fault-tolerant routing by deflecting packets to an alternate path; since deflection can create a chance of deadlock, the local router must be capable of estimating when a deadlock situation is likely, switching from flit-switching to packet-switching, and attempting to reroute the packet. At the global level, a routing agent gathers fault data and provides this fault information to nodes that periodically request it. The agent is also capable of detecting malformed packets coming from an external source and preventing their injection into the network, thereby conserving network bandwidth. It further attempts to detect denial-of-service attacks and power viruses and rejects the offending packets. The two-tier approach helps keep the IP blocks modular and reduces their complexity, thereby making them easier to verify.
Citation Key: ravikumar_hierarchical_2019
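
Illustrative sketch: The abstract describes local fault-tolerant routing in which a node deflects a packet to an alternate path when a link is faulty. The abstract does not give the routing policy, so the Python sketch below is only one possible reading under stated assumptions: a 2D mesh topology, a known set of faulty links, a greedy "closest to destination" preference with X moves tried first, and a hop budget that stands in for the router's deadlock estimate. All names (`route`, `neighbors`, `link`, `max_hops`) are hypothetical and do not come from the paper.

```python
def neighbors(node, width, height):
    """Yield the mesh neighbors of node that lie inside the width x height grid."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def link(a, b):
    """Undirected link identifier between two adjacent nodes."""
    return frozenset((a, b))


def rank(cur, nxt, dst):
    """Lower is better: remaining Manhattan distance, then prefer X moves."""
    dist = abs(dst[0] - nxt[0]) + abs(dst[1] - nxt[1])
    is_y_move = nxt[0] == cur[0]
    return (dist, is_y_move)


def route(src, dst, faulty_links, width, height, max_hops=None):
    """Route hop by hop, deflecting around faulty links.

    Returns the list of visited nodes, or None when the node is isolated
    or the hop budget is exhausted. The hop budget is an assumed stand-in
    for the deadlock estimate mentioned in the abstract; a real router
    would then change its switching mode and retry rather than give up.
    """
    if max_hops is None:
        max_hops = 4 * (width + height)
    path = [src]
    prev, cur = None, src
    while cur != dst:
        if len(path) - 1 >= max_hops:
            return None  # suspected deadlock or livelock
        options = [n for n in neighbors(cur, width, height)
                   if link(cur, n) not in faulty_links]
        if not options:
            return None  # every link of this node is faulty
        # Avoid bouncing straight back unless there is no other choice.
        forward = [n for n in options if n != prev] or options
        nxt = min(forward, key=lambda n: rank(cur, n, dst))
        prev, cur = cur, nxt
        path.append(cur)
    return path


if __name__ == "__main__":
    faults = {link((1, 0), (2, 0))}
    print(route((0, 0), (3, 0), set(), 4, 4))
    print(route((0, 0), (3, 0), faults, 4, 4))
```

In this sketch the faulty-link set plays the role of the fault information that, per the abstract, nodes obtain periodically from the global routing agent. The greedy policy is not guaranteed to find a path whenever one exists; it only illustrates the deflection step.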