Biblio

Filters: Author is Jha, Somesh
Wu, Xi, Li, Fengan, Kumar, Arun, Chaudhuri, Kamalika, Jha, Somesh, Naughton, Jeffrey.  2017.  Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics. Proceedings of the 2017 ACM International Conference on Management of Data. :1307–1322.

While significant progress has been made separately on analytics systems for scalable stochastic gradient descent (SGD) and private SGD, none of the major scalable analytics frameworks have incorporated differentially private SGD. Two interrelated issues account for this disconnect between research and practice: (1) low model accuracy due to the noise added to guarantee privacy, and (2) high development and runtime overhead of the private algorithms. This paper takes a first step to remedy this disconnect and proposes a private SGD algorithm to address both issues in an integrated manner. In contrast to the white-box approach adopted by previous work, we revisit and use the classical technique of output perturbation to devise a novel “bolt-on” approach to private SGD. While our approach trivially addresses (2), it makes (1) even more challenging. We address this challenge by providing a novel analysis of the L2-sensitivity of SGD, which allows, under the same privacy guarantees, better convergence of SGD when only a constant number of passes can be made over the data. We integrate our algorithm, as well as other state-of-the-art differentially private SGD algorithms, into Bismarck, a popular scalable SGD-based analytics system on top of an RDBMS. Extensive experiments show that our algorithm can be easily integrated, incurs virtually no overhead, scales well, and most importantly, yields substantially better (up to 4X) test accuracy than the state-of-the-art algorithms on many real datasets.
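
The output-perturbation idea in this abstract can be sketched in a few lines of Python: SGD runs unmodified as a black box, and noise is added once to the final model. This is only an illustration under stated assumptions, not the paper's implementation. The function names, the logistic loss, and the `sensitivity` argument are hypothetical; the paper's L2-sensitivity bounds are not reproduced here, so the caller is assumed to supply a valid bound. The noise density used (proportional to exp(-epsilon * ||z|| / sensitivity)) is the standard choice for output perturbation under pure epsilon-differential privacy.

```python
import math
import random


def sgd_logistic(data, passes, step, rng):
    """Plain, non-private SGD for logistic regression (labels in {-1, +1})."""
    dim = len(data[0][0])
    w = [0.0] * dim
    for _ in range(passes):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Gradient of log(1 + exp(-margin)) with respect to w is coeff * x.
            coeff = -y / (1.0 + math.exp(margin))
            w = [wj - step * coeff * xj for wj, xj in zip(w, x)]
    return w


def sample_noise(dim, sensitivity, epsilon, rng):
    """Draw z with density proportional to exp(-epsilon * ||z|| / sensitivity)."""
    # Uniform random direction: normalize a standard Gaussian vector.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in direction))
    # The norm of z follows Gamma(shape=dim, scale=sensitivity / epsilon).
    radius = rng.gammavariate(dim, sensitivity / epsilon)
    return [radius * v / norm for v in direction]


def private_sgd(data, passes, step, sensitivity, epsilon, seed=0):
    """Bolt-on privacy: train first, then perturb only the final model.

    `sensitivity` is an assumed, caller-supplied L2-sensitivity bound.
    """
    rng = random.Random(seed)
    w = sgd_logistic(data, passes, step, rng)
    noise = sample_noise(len(w), sensitivity, epsilon, rng)
    return [wj + nj for wj, nj in zip(w, noise)]
```

Because the training loop is untouched, a sketch like this could wrap any existing SGD routine; the privacy guarantee depends entirely on the supplied sensitivity bound being correct for that routine.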

Sun, Zhichuang, Feng, Bo, Lu, Long, Jha, Somesh.  2020.  OAT: Attesting Operation Integrity of Embedded Devices. 2020 IEEE Symposium on Security and Privacy (SP). :1433–1449.

Due to the wide adoption of IoT/CPS systems, embedded devices (IoT frontends) have become increasingly connected and mission-critical, which in turn has attracted advanced attacks (e.g., control-flow hijacks and data-only attacks). Unfortunately, IoT backends (e.g., remote controllers or in-cloud services) are unable to detect if such attacks have happened while receiving data, service requests, or operation status from IoT devices (remotely deployed embedded devices). As a result, currently, IoT backends are forced to blindly trust the IoT devices that they interact with. To fill this void, we first formulate a new security property for embedded devices, called "Operation Execution Integrity" or OEI. We then design and build a system, OAT, that enables remote OEI attestation for ARM-based bare-metal embedded devices. Our formulation of OEI captures the integrity of both control flow and critical data involved in an operation execution. Therefore, satisfying OEI entails that an operation execution is free of unexpected control and data manipulations, which existing attestation methods cannot check. Our design of OAT strikes a balance between prover's constraints (embedded devices' limited computing power and storage) and verifier's requirements (complete verifiability and forensic assistance). OAT uses a new control-flow measurement scheme, which enables lightweight and space-efficient collection of measurements (97% space reduction from the trace-based approach). OAT performs the remote control-flow verification through abstract execution, which is fast and deterministic. OAT also features lightweight integrity checking for critical data (74% less instrumentation needed than previous work). Our security analysis shows that OAT allows remote verifiers or IoT backends to detect both control-flow hijacks and data-only attacks that affect the execution of operations on IoT devices. In our evaluation using real embedded programs, OAT incurs a runtime overhead of 2.7%.

Jang, Uyeong, Wu, Xi, Jha, Somesh.  2017.  Objective Metrics and Gradient Descent Algorithms for Adversarial Examples in Machine Learning. Proceedings of the 33rd Annual Computer Security Applications Conference. :262–277.

Fueled by massive amounts of data, models produced by machine-learning (ML) algorithms are being used in diverse domains where security is a concern, such as automotive systems, finance, health-care, computer vision, speech recognition, natural-language processing, and malware detection. Of particular concern is the use of ML in cyberphysical systems, such as driver-less cars and aviation, where the presence of an adversary can cause serious consequences. In this paper we focus on attacks caused by adversarial samples, which are inputs crafted by adding small, often imperceptible, perturbations to force an ML model to misclassify. We present a simple gradient-descent-based algorithm for finding adversarial samples, which performs well in comparison to existing algorithms. The second issue that this paper tackles is that of metrics. We present a novel metric based on a few computer-vision algorithms for measuring the quality of adversarial samples.

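
The abstract does not spell out the paper's algorithm, so the following Python sketch only illustrates the general idea of searching for an adversarial sample by gradient descent on the input. It assumes a hypothetical linear classifier whose decision score is w·x + b, for which the gradient of the score with respect to the input is simply w; all names and parameters are illustrative.

```python
import math


def score(w, b, x):
    """Decision score of an assumed linear classifier; positive means class 1."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def find_adversarial(w, b, x, step=0.1, max_steps=100):
    """Move x along the score gradient until the predicted class flips.

    Returns the perturbed input, or None if no flip occurs within max_steps.
    """
    original = score(w, b, x) > 0
    # Descend the score if the input is class 1, ascend it otherwise.
    direction = -1.0 if original else 1.0
    adv = list(x)
    for _ in range(max_steps):
        adv = [ai + direction * step * wi for ai, wi in zip(adv, w)]
        if (score(w, b, adv) > 0) != original:
            return adv
    return None


w, b = [2.0, -1.0], 0.0
x = [1.0, 0.5]
adv = find_adversarial(w, b, x)
# L2 size of the perturbation, one simple way to judge how "small" it is.
distance = math.sqrt(sum((ai - xi) ** 2 for ai, xi in zip(adv, x)))
```

Stopping at the first step where the prediction flips keeps the perturbation small; the L2 distance computed at the end is a crude stand-in for the richer quality metrics the paper proposes.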
Papernot, Nicolas, McDaniel, Patrick, Goodfellow, Ian, Jha, Somesh, Celik, Z. Berkay, Swami, Ananthram.  2017.  Practical Black-Box Attacks Against Machine Learning. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. :506–519.

Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly-blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.

Cormode, Graham, Jha, Somesh, Kulkarni, Tejas, Li, Ninghui, Srivastava, Divesh, Wang, Tianhao.  2018.  Privacy at Scale: Local Differential Privacy in Practice. Proceedings of the 2018 International Conference on Management of Data. :1655–1658.

Local differential privacy (LDP), where users randomly perturb their inputs to provide plausible deniability of their data without the need for a trusted party, has been adopted recently by several major technology organizations, including Google, Apple and Microsoft. This tutorial aims to introduce the key technical underpinnings of these deployed systems, to survey current research that addresses related problems within the LDP model, and to identify relevant open problems and research directions for the community.

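
Randomized response is the classic mechanism behind the "users randomly perturb their inputs" idea, and a short Python sketch may help make it concrete. It is a textbook illustration, not a description of any of the deployed systems named in the abstract; the function names are hypothetical. Each user reports their true bit with probability e^ε / (1 + e^ε) and the flipped bit otherwise, which satisfies ε-LDP, and the aggregator corrects for the known flip rate.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep else 1 - bit


def estimate_fraction(reports, epsilon):
    """Unbiased estimate of the true fraction of 1s from the noisy reports."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = f * keep + (1 - f) * (1 - keep); solve for f.
    return (observed - (1.0 - keep)) / (2.0 * keep - 1.0)


rng = random.Random(42)
true_bits = [1] * 6000 + [0] * 14000  # 30% of users hold a 1
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
estimate = estimate_fraction(reports, 1.0)
```

No individual report reveals a user's bit with certainty, yet the aggregate estimate approaches the true fraction as the number of users grows.
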
Davidson, Drew, Chen, Yaohui, George, Franklin, Lu, Long, Jha, Somesh.  2017.  Secure Integration of Web Content and Applications on Commodity Mobile Operating Systems. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. :652–665.

A majority of today's mobile apps integrate web content of various kinds. Unfortunately, the interactions between app code and web content expose new attack vectors: a malicious app can subvert its embedded web content to steal user secrets; on the other hand, malicious web content can use the privileges of its embedding app to exfiltrate sensitive information such as the user's location and contacts. In this paper, we discuss security weaknesses of the interface between app code and web content through attacks, then introduce defenses that can be deployed without modifying the OS. Our defenses feature WIREframe, a service that securely embeds and renders external web content in Android apps, and in turn, prevents attacks between embedded web and host apps. WIREframe fully mediates the interface between app code and embedded web content. Unlike the existing web-embedding mechanisms, WIREframe allows both apps and embedded web content to define simple access policies to protect their own resources. These policies recognize fine-grained security principals, such as origins, and control all interactions between apps and the web. We also introduce WIRE (Web Isolation Rewriting Engine), an offline app rewriting tool that allows app users to inject WIREframe protections into existing apps. Our evaluation, based on 7166 popular apps and 20 specially selected apps, shows these techniques work on complex apps and incur acceptable end-to-end performance overhead.

Chen, Jiefeng, Wu, Xi, Rastogi, Vaibhav, Liang, Yingyu, Jha, Somesh.  2019.  Towards Understanding Limitations of Pixel Discretization Against Adversarial Attacks. 2019 IEEE European Symposium on Security and Privacy (EuroS&P). :480–495.

Wide adoption of artificial neural networks in various domains has led to an increasing interest in defending them against adversarial attacks. Preprocessing defense methods such as pixel discretization are particularly attractive in practice due to their simplicity, low computational overhead, and applicability to various systems. It is observed that such methods work well on simple datasets like MNIST, but break on more complicated ones like ImageNet under recently proposed strong white-box attacks. To understand the conditions for success and the potential for improvement, we study the pixel discretization defense method, including more sophisticated variants that take into account the properties of the dataset being discretized. Our results again show poor resistance against the strong attacks. We analyze our results in a theoretical framework and offer strong evidence that pixel discretization is unlikely to work on all but the simplest of the datasets. Furthermore, our arguments present insights into why some other preprocessing defenses may be insecure.
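
As a rough illustration of what pixel discretization does, the Python sketch below snaps every pixel to the nearest value in a codebook before the image reaches the model. The function name, the two-value codebook, and the toy grayscale images are assumptions made for the example, and the example only hints at the intuition rather than reproducing the paper's analysis.

```python
def discretize(image, codebook):
    """Replace every pixel with the nearest value in the codebook."""
    return [[min(codebook, key=lambda c: abs(c - p)) for p in row] for row in image]


codebook = [0, 255]  # binarization, as one might try on MNIST-like images

# Pixels far from the decision boundary: a small perturbation is erased.
clean = [[10, 250]]
perturbed = [[18, 242]]
erased = discretize(clean, codebook) == discretize(perturbed, codebook)

# A pixel near the boundary: an even smaller perturbation changes the codeword.
boundary_clean = [[126]]
boundary_perturbed = [[130]]
flipped = discretize(boundary_clean, codebook) != discretize(boundary_perturbed, codebook)
```

When most pixels sit close to a codeword, as in near-binary images, small perturbations are removed; when pixel values are spread across the range, many lie near a boundary, and a small change can move them to a different codeword.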