Biblio

Filters: Keyword is Neural Network Resilience

Whatmough, P. N., Lee, S. K., Lee, H., Rama, S., Brooks, D., Wei, G. Y..  2017.  14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications. 2017 IEEE International Solid-State Circuits Conference (ISSCC). :242–243.

This paper presents a 28nm SoC with a programmable FC-DNN accelerator design that demonstrates: (1) HW support to exploit data sparsity by eliding unnecessary computations (4× energy reduction); (2) improved algorithmic error tolerance using sign-magnitude number format for weights and datapath computation; (3) improved circuit-level timing violation tolerance in datapath logic via time-borrowing; (4) combined circuit and algorithmic resilience with Razor timing violation detection to reduce energy via VDD scaling or increase throughput via FCLK scaling; and (5) high classification accuracy (98.36% for MNIST test set) while tolerating aggregate timing violation rates >10⁻¹. The accelerator achieves a minimum energy of 0.36μJ/pred at 667MHz, maximum throughput at 1.2GHz and 0.57μJ/pred, or a 10%-margined operating point at 1GHz and 0.58μJ/pred.
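
As a rough illustration of item (2), the sketch below counts how many bits differ between +1 and -1 in 8-bit two's complement versus sign-magnitude encodings. The helper names and the 8-bit width are assumptions for illustration, not details from the paper; the intuition is that values clustered near zero toggle fewer high-order bits in sign-magnitude form, which may leave fewer opportunities for large-magnitude timing errors.

```python
def twos_complement(value, bits=8):
    """Encode a signed integer as an unsigned two's complement bit pattern."""
    return value & ((1 << bits) - 1)


def sign_magnitude(value, bits=8):
    """Encode a signed integer as a sign bit (MSB) plus magnitude.

    Assumes abs(value) fits in bits - 1 bits (hypothetical helper).
    """
    sign = 1 << (bits - 1) if value < 0 else 0
    return sign | abs(value)


def hamming(a, b):
    """Number of bit positions in which two patterns differ."""
    return bin(a ^ b).count("1")


# Bits that toggle when a small value changes sign.
tc_flips = hamming(twos_complement(1), twos_complement(-1))
sm_flips = hamming(sign_magnitude(1), sign_magnitude(-1))
print(tc_flips, sm_flips)  # 7 1
```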

Sim, H., Nguyen, D., Lee, J., Choi, K..  2017.  Scalable stochastic-computing accelerator for convolutional neural networks. 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC). :696–701.

Stochastic Computing (SC) is an alternative design paradigm particularly useful for applications where cost is critical. SC has been applied to neural networks, as neural networks are known for their high computational complexity. However, previous work in this area has critical limitations, such as the fully-parallel architecture assumption, which prevent it from being applicable to recent networks such as convolutional neural networks, or ConvNets. This paper presents the first SC architecture for ConvNets and shows its feasibility with detailed analyses of implementation overheads. Our SC-ConvNet is a hybrid between SC and conventional binary design, which is a marked difference from earlier SC-based neural networks. Though this might seem like a compromise, it is a novel feature driven by the need to support modern ConvNets at scale, which commonly have many large layers. Our proposed architecture also features hybrid layer composition, which helps achieve very high recognition accuracy. Our detailed evaluation results involving functional simulation and RTL synthesis suggest that SC-ConvNets are indeed competitive with conventional binary designs, even without considering the inherent error resilience of SC.
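
For readers unfamiliar with SC, here is a minimal sketch of unipolar stochastic multiplication: a value in [0, 1] is encoded as the fraction of 1s in a random bitstream, and a bitwise AND of two independent streams approximates the product. This is a textbook SC primitive, not the SC-ConvNet architecture itself; the function names, stream length, and seed are assumptions.

```python
import random


def to_stream(p, length, rng):
    """Encode probability p as a random bitstream with P(bit = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_stream(stream):
    """Decode a bitstream back to a value: the fraction of 1 bits."""
    return sum(stream) / len(stream)


def sc_multiply(a, b, length=4096, seed=0):
    """Approximate a * b by ANDing two independently generated bitstreams."""
    rng = random.Random(seed)
    stream_a = to_stream(a, length, rng)
    stream_b = to_stream(b, length, rng)
    product = [x & y for x, y in zip(stream_a, stream_b)]
    return from_stream(product)


estimate = sc_multiply(0.5, 0.8)
print(estimate)  # an approximation of 0.4; the error shrinks as length grows
```

The multiplier is a single AND gate per bit, which hints at why SC is attractive when hardware cost matters, at the price of long bitstreams and random error.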

Yang, L., Murmann, B..  2017.  SRAM voltage scaling for energy-efficient convolutional neural networks. 2017 18th International Symposium on Quality Electronic Design (ISQED). :7–12.

State-of-the-art convolutional neural networks (ConvNets) are now able to achieve near human performance on a wide range of classification tasks. Unfortunately, current hardware implementations of ConvNets are memory power intensive, prohibiting deployment in low-power embedded systems and IoE platforms. One method of reducing memory power is to exploit the error resilience of ConvNets and accept bit errors under reduced supply voltages. In this paper, we extensively study the effectiveness of this idea and show that further savings are possible by injecting bit errors during ConvNet training. Measurements on an 8KB SRAM in 28nm UTBB FD-SOI CMOS demonstrate supply voltage reduction of 310mV, which results in up to 5.4× leakage power reduction and up to 2.9× memory access power reduction at 99% of floating-point classification accuracy, with no additional hardware cost. To our knowledge, this is the first silicon-validated study on the effect of bit errors in ConvNets.

Marques, J., Andrade, J., Falcao, G..  2017.  Unreliable memory operation on a convolutional neural network processor. 2017 IEEE International Workshop on Signal Processing Systems (SiPS). :1–6.

The evolution of convolutional neural networks (CNNs) into more complex forms of organization, with additional layers, larger convolutions and increasing connections, established the state-of-the-art in terms of accuracy errors for detection and classification challenges in images. Moreover, as they evolved to a point where Gigabytes of memory are required for their operation, we have reached a stage where it becomes fundamental to understand how their inference capabilities can be impaired if data elements somehow become corrupted in memory. This paper introduces fault-injection in these systems by simulating failing bit-cells in hardware memories brought on by relaxing the 100% reliable operation assumption. We analyze the behavior of these networks calculating inference under severe fault-injection rates and apply fault mitigation strategies to improve the CNNs' resilience. For the MNIST dataset, we show that 8× less memory is required for the feature maps memory space, and that in sub-100% reliable operation, fault-injection rates up to 10⁻¹ (with most significant bit protection) can be withstood with only a 1% error probability degradation. Furthermore, considering the offload of the feature maps memory to an embedded dynamic RAM (eDRAM) system, using technology nodes from 65 down to 28 nm, up to 73–80% improved power efficiency can be obtained.

Jiao, X., Luo, M., Lin, J. H., Gupta, R. K..  2017.  An assessment of vulnerability of hardware neural networks to dynamic voltage and temperature variations. 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). :945–950.

As a problem-solving method, neural networks have shown broad applicability, from medical applications to speech recognition and natural language processing. This success has even led to implementation of neural network algorithms into hardware. In this paper, we explore two questions: (a) to what extent microelectronic variations affect the quality of results by neural networks; and (b) if the answer to the first question represents an opportunity to optimize the implementation of neural network algorithms. Regarding the first question, variations are now increasingly common in aggressive process nodes and typically manifest as an increased frequency of timing errors. Combating variations - due to process and/or operating conditions - usually results in increased guardbands in circuit and architectural design, thus reducing the gains from process technology advances. Given the inherent resilience of neural networks due to adaptation of their learning parameters, one would expect the quality of results produced by neural networks to be relatively insensitive to the rising timing error rates caused by increased variations. On the contrary, using two frequently used neural networks (MLP and CNN), our results show that variations can significantly affect the inference accuracy. This paper outlines our assessment methodology and use of a cross-layer evaluation approach that extracts hardware-level errors from twenty different operating conditions and then injects such errors back into the software layer in an attempt to answer the second question posed above.

Araújo, D. R. B., Barros, G. H. P. S. de, Bastos-Filho, C. J. A., Martins-Filho, J. F..  2017.  Surrogate models assisted by neural networks to assess the resilience of networks. 2017 IEEE Latin American Conference on Computational Intelligence (LA-CCI). :1–6.

The assessment of networks is frequently accomplished by using time-consuming analysis tools based on simulations. For example, the blocking probability of networks can be estimated by Monte Carlo simulations and the network resilience can be assessed by link or node failure simulations. We propose in this paper to use Artificial Neural Networks (ANN) to predict the robustness of networks based on simple topological metrics to avoid time-consuming failure simulations. We accomplish the training process using supervised learning based on a historical database of networks. We compare the results of our proposal with the outcome provided by targeted and random failure simulations. We show that our approach is faster than failure simulators and that the ANN can mimic the same robustness evaluation provided by these simulators. We obtained an average speedup of 300 times.

Alazzawe, A., Kant, K..  2017.  Slice Swarms for HPC Application Resilience. 2017 Fifth International Symposium on Computing and Networking (CANDAR). :1–10.

Resilience in High Performance Computing (HPC) is a constraining factor for bringing applications to the upcoming exascale systems. Resilience techniques must be able to scale to handle the increasing number of expected errors in an energy-efficient manner. Since the purpose of running applications on HPC systems is to perform large-scale computations as quickly as possible, resilience methods should not add a large delay to the time to completion of the application. In this paper we introduce a novel technique to detect and recover from transient errors in HPC applications. One of the features of our technique is that the energy budget allocated to resilience can be adjusted depending on the operator's resilience needs. For example, on synthetic data, the technique can detect about 50% of transient errors while only using 20% of the dynamic energy required for running the application. For a 60% energy budget, an application that uses 10k cores and takes 128 hours to run will only require 10% longer to complete.

Wang, Ying, Li, Huawei, Li, Xiaowei.  2017.  Real-Time Meets Approximate Computing: An Elastic CNN Inference Accelerator with Adaptive Trade-off Between QoS and QoR. Proceedings of the 54th Annual Design Automation Conference 2017. :33:1–33:6.

Due to the recent progress in deep learning and neural acceleration architectures, specialized deep neural network or convolutional neural network (CNN) accelerators are expected to provide an energy-efficient solution for real-time vision/speech processing, recognition, and a wide spectrum of approximate computing applications. In addition to their wide applicability scope, we also found that the fascinating features of deterministic performance and high energy-efficiency make such deep learning (DL) accelerators ideal candidates as application-processor IPs in embedded SoCs concerned with real-time processing. However, unlike traditional accelerator designs, DL accelerators introduce a new aspect of design trade-off between real-time processing (QoS) and computation approximation (QoR) into embedded systems. This work proposes an elastic CNN acceleration architecture that automatically adapts to the hard QoS constraint by exploiting the error-resilience in typical approximate computing workloads. For the first time, the proposed design, including network tuning-and-mapping software and reconfigurable accelerator hardware, aims to reconcile the design constraints of QoS and Quality of Result (QoR), which are respectively the key concerns in real-time and approximate computing. It is shown in experiments that the proposed architecture enables the embedded system to work flexibly in an expanded operating space, significantly enhances its real-time ability, and maximizes the energy-efficiency of the system within the user-specified QoS-QoR constraint through self-reconfiguration.

Li, Guanpeng, Hari, Siva Kumar Sastry, Sullivan, Michael, Tsai, Timothy, Pattabiraman, Karthik, Emer, Joel, Keckler, Stephen W..  2017.  Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. :8:1–8:12.

Deep learning neural networks (DNNs) have been successful in solving a wide range of machine learning problems. Specialized hardware accelerators have been proposed to accelerate the execution of DNN algorithms for high performance and energy efficiency. Recently, they have been deployed in datacenters (potentially for business-critical or industrial applications) and safety-critical systems such as self-driving cars. Soft errors caused by high-energy particles have been increasing in hardware systems, and these can lead to catastrophic failures in DNN systems. Traditional methods for building resilient systems, e.g., Triple Modular Redundancy (TMR), are agnostic of the DNN algorithm and the DNN accelerator's architecture. Hence, these traditional resilience approaches incur high overheads, which makes them challenging to deploy. In this paper, we experimentally evaluate the resilience characteristics of DNN systems (i.e., DNN software running on specialized accelerators). We find that the error resilience of a DNN system depends on the data types, values, data reuses, and types of layers in the design. Based on our observations, we propose two efficient protection techniques for DNN systems.
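
For context, TMR runs three copies of a computation and takes a majority vote, so a single faulty copy is outvoted. A minimal bitwise voter sketch follows; the function name and example values are illustrative assumptions, not taken from the paper.

```python
def tmr_vote(a, b, c):
    """Bitwise majority of three redundant integer results."""
    return (a & b) | (a & c) | (b & c)


golden = 0b10110010
faulty = golden ^ 0b00001000  # one copy suffers a single-bit soft error

# The two fault-free copies outvote the corrupted one.
voted = tmr_vote(golden, faulty, golden)
print(voted == golden)  # True
```

The voter itself is cheap; the overhead the abstract refers to comes from replicating the computation three times regardless of which data actually need protection.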

Jiang, Jun, Zhao, Xinghui, Wallace, Scott, Cotilla-Sanchez, Eduardo, Bass, Robert.  2017.  Mining PMU Data Streams to Improve Electric Power System Resilience. Proceedings of the Fourth IEEE/ACM International Conference on Big Data Computing, Applications and Technologies. :95–102.

Phasor measurement units (PMUs) provide high-fidelity situational awareness of electric power grid operations. PMU data are used in real time for wide-area state estimation, area control error monitoring, and event detection. As PMU data become more reliable, these devices are finding roles within control systems such as demand response programs and early fault detection systems. As with other cyber-physical systems, maintaining data integrity and security are significant challenges for power system operators. In this paper, we present a comprehensive study of multiple machine learning techniques for detecting malicious data injection within PMU data streams. The two datasets used in this study are from the Bonneville Power Administration's PMU network and an inter-university PMU network among three universities, located in the U.S. Pacific Northwest. These datasets contain data from both the transmission level and the distribution level. Our results show that both SVM and ANN are generally effective in detecting spoofed data, and TensorFlow, the newly released tool, demonstrates potential for distributing the training workload and achieving higher performance. We expect these results to shed light on future work of adopting machine learning and data analytics techniques in the electric power industry.

Chen, Yuanchang, Zhu, Yizhe, Qiao, Fei, Han, Jie, Liu, Yuansheng, Yang, Huazhong.  2017.  Evaluating Data Resilience in CNNs from an Approximate Memory Perspective. Proceedings of the Great Lakes Symposium on VLSI 2017. :89–94.

Due to the large volumes of data that need to be processed, efficient memory access and data transmission are crucial for high-performance implementations of convolutional neural networks (CNNs). Approximate memory is a promising technique to achieve efficient memory access and data transmission in CNN hardware implementations. To assess the feasibility of applying approximate memory techniques, we propose a framework for the data resilience evaluation (DRE) of CNNs and verify its effectiveness on a suite of prevalent CNNs. Simulation results show that a high degree of data resilience exists in these networks. By scaling the bit-width of the first five dominant data subsets, the data volume can be reduced by 80.38% on average with a 2.69% loss in relative prediction accuracy. For approximate memory with random errors, all the synaptic weights can be stored in the approximate part when the error rate is less than 10⁻⁴, while 3 MSBs must be protected if the error rate is fixed at 10⁻³. These results indicate a great potential for exploiting approximate memory techniques in CNN hardware design.
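
To make the MSB-protection idea concrete, the sketch below flips bits of 8-bit words at a given error rate while leaving the top few bits untouched, as if they were stored in reliable memory. It is a toy model under assumed names and parameters (`inject_errors`, unsigned 8-bit words, independent bit flips, a deliberately exaggerated error rate), not the DRE framework from the paper.

```python
import random


def inject_errors(words, error_rate, protected_msbs=0, bits=8, seed=0):
    """Flip each unprotected bit independently with probability error_rate.

    The top `protected_msbs` bits of every word are never flipped,
    modelling storage in a reliable (non-approximate) memory region.
    """
    rng = random.Random(seed)
    unprotected = bits - protected_msbs
    noisy = []
    for word in words:
        for pos in range(unprotected):  # only the low-order bits are exposed
            if rng.random() < error_rate:
                word ^= 1 << pos
        noisy.append(word)
    return noisy


weights = [200, 17, 96, 255, 3, 128]  # hypothetical quantized weights
noisy = inject_errors(weights, error_rate=0.5, protected_msbs=3)
max_dev = max(abs(a - b) for a, b in zip(weights, noisy))
print(noisy, max_dev)
```

With three MSBs protected, the deviation of any 8-bit word is bounded by 2^5 - 1 = 31 regardless of the error rate, which gives some intuition for why protecting a few high-order bits can go a long way.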