Biblio

Filters: Author is Fei, Yunsi
Luo, Yukui, Gongye, Cheng, Ren, Shaolei, Fei, Yunsi, Xu, Xiaolin.  2020.  Stealthy-Shutdown: Practical Remote Power Attacks in Multi-Tenant FPGAs. 2020 IEEE 38th International Conference on Computer Design (ICCD). :545–552.
With the deployment of artificial intelligence (AI) algorithms in a large variety of applications, there is an increasing need for high-performance computing capabilities. As a result, different hardware platforms have been utilized for acceleration purposes. Among these hardware-based accelerators, field-programmable gate arrays (FPGAs) have gained a lot of attention due to their re-programmable characteristics, which provide customized control logic and computing operators. For example, FPGAs have recently been adopted for on-demand cloud services by leading cloud providers like Amazon and Microsoft, providing acceleration for various compute-intensive tasks. While the co-residency of multiple tenants on a cloud FPGA chip increases the efficiency of resource utilization, it also creates unique attack surfaces that are under-explored. In this paper, we exploit the vulnerability associated with the shared power distribution network on cloud FPGAs. We present a stealthy power attack that can be remotely launched by a malicious tenant, shutting down the entire chip and resulting in denial-of-service for other co-located benign tenants. Specifically, we propose stealthy-shutdown: a well-timed power attack that can be implemented in two steps: (1) an attacker monitors the real-time FPGA power consumption detected by ring-oscillator-based voltage sensors, and (2) when capturing high power-consuming moments, i.e., when the power consumption of other tenants is above a certain threshold, the attacker injects a well-timed power load to shut down the FPGA system. Note that in the proposed attack strategy, the power load injected by the attacker only accounts for a small portion of the overall power consumption; therefore, the attack remains stealthy to the cloud FPGA operator. We successfully implement and validate the proposed attack on three FPGA evaluation kits running real-world applications. The proposed attack results in a stealthy-shutdown, demonstrating severe security concerns of co-tenancy on cloud FPGAs. We also offer two countermeasures that can mitigate such power attacks.
Sabbagh, Majid, Gongye, Cheng, Fei, Yunsi, Wang, Yanzhi.  2019.  Evaluating Fault Resiliency of Compressed Deep Neural Networks. 2019 IEEE International Conference on Embedded Software and Systems (ICESS). :1–7.

Model compression is considered to be an effective way to reduce the implementation cost of deep neural networks (DNNs) while maintaining the inference accuracy. Many recent studies have developed efficient model compression algorithms and implementations in accelerators on various devices. Protecting the integrity of DNN inference against fault attacks is important for diverse deep-learning-enabled applications. However, there has been little research investigating the fault resiliency of DNNs and the impact of model compression on fault tolerance. In this work, we consider faults on different data types and develop a simulation framework for understanding the fault resiliency of compressed DNN models as compared to uncompressed models. We perform our experiments on two common DNNs, LeNet-5 and VGG16, and evaluate their fault resiliency with different types of compression. The results show that binary quantization can effectively increase the fault resiliency of DNN models by 10000x for both LeNet-5 and VGG16. Finally, we propose software and hardware mitigation techniques to increase the fault resiliency of DNN models.
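As a rough illustration of why the data type of a weight matters for fault resiliency, the sketch below flips a single bit in a float32 value and in a binarized weight. It is not the simulation framework from the paper; the function names `flip_bit_float32` and `flip_binary_weight` and the +1/-1 encoding of binary weights are assumptions made for this example.

```python
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Flip one bit (0 = LSB, 31 = sign) of the IEEE-754 float32 encoding."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return faulty


def flip_binary_weight(weight: int) -> int:
    """Assumed encoding: a binarized weight is +1 or -1, so a one-bit fault negates it."""
    return -weight


weight = 0.5
# Largest deviation caused by any single-bit fault in the float32 weight.
worst_float_error = max(abs(flip_bit_float32(weight, b) - weight) for b in range(32))
# Deviation caused by a single-bit fault in a binarized weight.
binary_error = abs(flip_binary_weight(1) - 1)

print("worst float32 error:", worst_float_error)
print("binary weight error:", binary_error)
```

In this toy case, a fault in the top exponent bit turns 0.5 into 2^127, while a fault in a binarized weight changes it by at most 2. That bounded deviation is one plausible intuition for the resiliency gain reported above, though the abstract itself does not state the mechanism.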

Luo, Chao, Fei, Yunsi, Kaeli, David.  2019.  Side-Channel Timing Attack of RSA on a GPU. ACM Transactions on Architecture and Code Optimization (TACO). 16:32:1-32:18.
To increase computation throughput, general-purpose Graphics Processing Units (GPUs) have been leveraged to accelerate computationally intensive workloads. GPUs have been used as cryptographic engines, improving encryption/decryption throughput and leveraging the GPU's Single Instruction Multiple Thread (SIMT) model. RSA is a widely used public-key cipher and has been ported onto GPUs for signing and decrypting large files. Although performance has been significantly improved, RSA on GPUs is vulnerable to side-channel timing attacks, an exposure overlooked in previous studies. GPUs tend to be naturally resilient to side-channel attacks, given that they execute a large number of concurrent threads, performing many RSA operations on different data in parallel. With thousands of threads executing concurrently, a significant amount of noise is introduced into the timing channel. In this work, we build a timing model to capture the parallel characteristics of an RSA public-key cipher implemented on a GPU. We consider optimizations that include using Montgomery multiplication and sliding-window exponentiation to implement cryptographic operations. Our timing model considers the challenges of parallel execution, complications that do not occur in single-threaded computing platforms. Based on our timing model, we launch successful timing attacks on RSA running on a GPU, extracting the private key of RSA. We also present an effective error detection and correction mechanism. Our results demonstrate that GPU acceleration of RSA is vulnerable to side-channel timing attacks. We propose several countermeasures to defend against this class of attacks.
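The sliding-window exponentiation mentioned above can be sketched as follows. This is a minimal single-threaded version using plain modular arithmetic rather than Montgomery multiplication, and it is not the GPU implementation or timing model from the paper; the function name `sliding_window_pow` and the returned multiplication counter are assumptions added for illustration.

```python
def sliding_window_pow(base: int, exponent: int, modulus: int, window: int = 4):
    """Return (base**exponent % modulus, number of window multiplications)."""
    if exponent == 0:
        return 1 % modulus, 0
    base %= modulus
    # Precompute the odd powers base^1, base^3, ..., base^(2^window - 1).
    base_squared = base * base % modulus
    odd_powers = {1: base}
    for k in range(3, 1 << window, 2):
        odd_powers[k] = odd_powers[k - 2] * base_squared % modulus

    result = 1
    multiplies = 0
    i = exponent.bit_length() - 1
    while i >= 0:
        if not (exponent >> i) & 1:
            # A zero bit costs only a squaring.
            result = result * result % modulus
            i -= 1
            continue
        # Take the longest window of at most `window` bits that ends in a 1.
        low = max(i - window + 1, 0)
        while not (exponent >> low) & 1:
            low += 1
        width = i - low + 1
        chunk = (exponent >> low) & ((1 << width) - 1)
        for _ in range(width):
            result = result * result % modulus
        result = result * odd_powers[chunk] % modulus
        multiplies += 1
        i = low - 1
    return result, multiplies


# Two 8-bit exponents need a different number of multiplications.
print(sliding_window_pow(3, 0b11111111, 1009)[1])  # 2
print(sliding_window_pow(3, 0b10000000, 1009)[1])  # 1
```

The number of multiplications depends on the bit pattern of the exponent, which is the kind of key-dependent behavior a timing attack tries to observe. How that dependence shows up on a GPU with many concurrent threads is what the paper's timing model addresses and is not captured here.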
Luo, Chao, Fei, Yunsi, Kaeli, David.  2018.  Effective Simple-power Analysis Attacks of Elliptic Curve Cryptography on Embedded Systems. Proceedings of the International Conference on Computer-Aided Design. :115:1–115:7.
Elliptic Curve Cryptography (ECC), initially proposed by Koblitz [17] and Miller [20], is a public-key cipher. Compared with other popular public-key ciphers (e.g., RSA), ECC features a shorter key length for the same level of security. For example, a 256-bit ECC cipher provides 128-bit security, equivalent to a 2048-bit RSA cipher [4]. Using smaller keys, ECC requires less memory for performing cryptographic operations. Embedded systems, especially given the proliferation of Internet-of-Things (IoT) devices and platforms, require efficient and low-power secure communications between edge devices and gateways/clouds. ECC has been widely adopted in IoT systems for authentication of communications, while RSA, which is much more costly to compute, remains the standard for desktops and servers.
Luo, Pei, Li, Cheng, Fei, Yunsi.  2016.  Concurrent Error Detection for Reliable SHA-3 Design. Proceedings of the 26th Edition on Great Lakes Symposium on VLSI. :39–44.

Cryptographic systems are vulnerable to random errors and injected faults. Soft errors can inadvertently happen in critical cryptographic modules, and attackers can inject faults into systems to retrieve the embedded secret. Different schemes have been developed to improve the security and reliability of cryptographic systems. As the new SHA-3 standard, the Keccak algorithm will be widely used in various cryptographic applications, and its implementation should be protected against random errors and injected faults. In this paper, we devise different parity checking methods to protect the operations of Keccak. Results show that our schemes can be easily implemented and can effectively protect a Keccak system against random errors and fault attacks.