# Biblio

Stealthy attackers often disable or tamper with system monitors to hide their tracks and evade detection. In this poster, we present a data-driven technique to detect such monitor compromise using evidential reasoning. Leveraging the fact that hiding from multiple, redundant monitors is difficult for an attacker, to identify potential monitor compromise, we combine alerts from different sets of monitors by using Dempster-Shafer theory, and compare the results to find outliers. We describe our ongoing work in this area.

Presented at the NSA Science of Security Quarterly Meeting, July 2016.

The concept of differential privacy stems from the study of private query of datasets. In this work, we apply this concept to discrete-time, linear distributed control systems in which agents need to maintain privacy of certain preferences, while sharing information for better system-level performance. The system has N agents operating in a shared environment that couples their dynamics. We show that for stable systems the performance grows as O(T3/Nε2), where T is the time horizon and ε is the differential privacy parameter. Next, we study lower-bounds in terms of the Shannon entropy of the minimal mean square estimate of the system’s private initial state from noisy communications between an agent and the server. We show that for any of noise-adding differentially private mechanism, then the Shannon entropy is at least nN(1−ln(ε/2)), where n is the dimension of the system, and t he lower bound is achieved by a Laplace-noise-adding mechanism. Finally, we study the problem of keeping the objective functions of individual agents differentially private in the context of cloud-based distributed optimization. The result shows a trade-off between the privacy of objective functions and the performance of the distributed optimization algorithm with noise.

Presented at the Joint Trust and Security/Science of Security Seminar, April 26, 2016.

Presented at the NSA Science of Security Quarterly Meeting, July 2016.

Best Poster Award, Illinois Institute of Technology Research Day, April 11, 2016.

Presented at the 11th European Dependable Computing Conference-Dependabiltiy in Practice (EDCC 2015), September 2015.

Datacenter-based Cloud computing has induced new disruptive trends in networking, key among which is network virtualization. Software-Defined Networking overlays aim to improve the efficiency of the next generation multitenant datacenters. While early overlay prototypes are already available, they focus mainly on core functionality, with little being known yet about their impact on the system level performance. Using query completion time as our primary performance metric, we evaluate the overlay network impact on two representative datacenter workloads, Partition/Aggregate and 3-Tier. We measure how much performance is traded for overlay's benefits in manageability, security and policing. Finally, we aim to assist the datacenter architects by providing a detailed evaluation of the key overlay choices, all made possible by our accurate cross-layer hybrid/mesoscale simulation platform.

Darknets, membership-concealing peer-to-peer networks, suffer from high message delivery delays due to insufficient routing strategies. They form topologies restricted to a subgraph of the social network of their users by limiting connections to peers with a mutual trust relationship in real life. Whereas centralized, highly successful social networking services entail a privacy loss of their users, Darknets at higher performance represent an optimal private and censorship-resistant communication substrate for social applications. Decentralized routing so far has been analyzed under the assumption that the network resembles a perfect lattice structure. Freenet, currently the only widely used Darknet, attempts to approximate this structure by embedding the social graph into a metric space. Considering the resulting distortion, the common greedy routing algorithm is adapted to account for local optima. Yet the impact of the adaptation has not been adequately analyzed. We thus suggest a model integrating inaccuracies in the embedding. In the context of this model, we show that the Freenet routing algorithm cannot achieve polylog performance. Consequently, we design NextBestOnce, a provable poylog algorithm based only on information about neighbors. Furthermore, we show that the routing length of NextBestOnce is further decreased by more than a constant factor if neighbor-of-neighbor information is included in the decision process.

Deadlock freedom is a key challenge in the design of communication networks. Wormhole switching is a popular switching technique, which is also prone to deadlocks. Deadlock analysis of routing functions is a manual and complex task. We propose an algorithm that automatically proves routing functions deadlock-free or outputs a minimal counter-example explaining the source of the deadlock. Our algorithm is the first to automatically check a necessary and sufficient condition for deadlock-free routing. We illustrate its efficiency in a complex adaptive routing function for torus topologies. Results are encouraging. Deciding deadlock freedom is co-NP-Complete for wormhole networks. Nevertheless, our tool proves a 13 × 13 torus deadlock-free within seconds. Finding minimal deadlocks is more difficult. Our tool needs four minutes to find a minimal deadlock in a 11 × 11 torus while it needs nine hours for a 12 × 12 network.

DeepQA is a large-scale natural language processing (NLP) question-and-answer system that responds across a breadth of structured and unstructured data, from hundreds of analytics that are combined with over 50 models, trained through machine learning. After the 2011 historic milestone of defeating the two best human players in the Jeopardy! game show, the technology behind IBM Watson, DeepQA, is undergoing gamification into real-world business problems. Gamifying a business domain for Watson is a composite of functional, content, and training adaptation for nongame play. During domain gamification for medical, financial, government, or any other business, each system change affects the machine-learning process. As opposed to the original Watson Jeopardy!, whose class distribution of positive-to-negative labels is 1:100, in adaptation the computed training instances, question-and-answer pairs transformed into true-false labels, result in a very low positive-to-negative ratio of 1:100 000. Such initial extreme class imbalance during domain gamification poses a big challenge for the Watson machine-learning pipelines. The combination of ingested corpus sets, question-and-answer pairs, configuration settings, and NLP algorithms contribute toward the challenging data state. We propose several data engineering techniques, such as answer key vetting and expansion, source ingestion, oversampling classes, and question set modifications to increase the computed true labels. In addition, algorithm engineering, such as an implementation of the Newton-Raphson logistic regression with a regularization term, relaxes the constraints of class imbalance during training adaptation. We conclude by empirically demonstrating that data and algorithm engineering are complementary and indispensable to overcome the challenges in this first Watson gamification for real-world business problems.

As wireless networks become more pervasive, the amount of the wireless data is rapidly increasing. One of the biggest challenges of wide adoption of distributed data storage is how to store these data securely. In this work, we study the frequency-based attack, a type of attack that is different from previously well-studied ones, that exploits additional adversary knowledge of domain values and/or their exact/approximate frequencies to crack the encrypted data. To cope with frequency-based attacks, the straightforward 1-to-1 substitution encryption functions are not sufficient. We propose a data encryption strategy based on 1-to-n substitution via dividing and emulating techniques to defend against the frequency-based attack, while enabling efficient query evaluation over encrypted data. We further develop two frameworks, incremental collection and clustered collection, which are used to defend against the global frequency-based attack when the knowledge of the global frequency in the network is not available. Built upon our basic encryption schemes, we derive two mechanisms, direct emulating and dual encryption, to handle updates on the data storage for energy-constrained sensor nodes and wireless devices. Our preliminary experiments with sensor nodes and extensive simulation results show that our data encryption strategy can achieve high security guarantee with low overhead.

Many common cyberdefenses (like firewalls and intrusion-detection systems) are static, giving attackers the freedom to probe them at will. Moving-target defense (MTD) adds dynamism, putting the systems to be defended in motion, potentially at great cost to the defender. An alternative approach is a mobile resilient defense that removes attackers' ability to rely on prior experience without requiring motion in the protected infrastructure. The defensive technology absorbs most of the cost of motion, is resilient to attack, and is unpredictable to attackers. The authors' mobile resilient defense, Ant-Based Cyber Defense (ABCD), is a set of roaming, bio-inspired, digital-ant agents working with stationary agents in a hierarchy headed by a human supervisor. ABCD provides a resilient, extensible, and flexible defense that can scale to large, multi-enterprise infrastructures such as the smart electric grid.

A significant advance in magnetic field management in a fully assembled superconducting radiofrequency cryomodule has been achieved and is reported here. Demagnetization of the entire cryomodule after assembly is a crucial step toward the goal of average magnetic flux density less than 0.5 μT at the location of the superconducting radio frequency cavities. An explanation of the physics of demagnetization and experimental results are presented.

Nearest neighbor search is a fundamental problem in various research fields like machine learning, data mining and pattern recognition. Recently, hashing-based approaches, for example, locality sensitive hashing (LSH), are proved to be effective for scalable high dimensional nearest neighbor search. Many hashing algorithms found their theoretic root in random projection. Since these algorithms generate the hash tables (projections) randomly, a large number of hash tables (i.e., long codewords) are required in order to achieve both high precision and recall. To address this limitation, we propose a novel hashing algorithm called density sensitive hashing (DSH) in this paper. DSH can be regarded as an extension of LSH. By exploring the geometric structure of the data, DSH avoids the purely random projections selection and uses those projective functions which best agree with the distribution of the data. Extensive experimental results on real-world data sets have shown that the proposed method achieves better performance compared to the state-of-the-art hashing approaches.

Engineering complex distributed systems is challenging. Recent solutions for the development of cyber-physical systems (CPS) in industry tend to rely on architectural designs based on service orientation, where the constituent components are deployed according to their service behavior and are to be understood as loosely coupled and mostly independent. In this paper, we develop a workflow that combines contract-based and CPS model-based specifications with service orientation, and analyze the resulting model using fault injection to assess the dependability of the systems. Compositionality principles based on the contract specification help us to make the analysis practical. The presented techniques are evaluated on two case studies.

Demand ResponseManagement (DRM) is a key component in the smart grid to effectively reduce power generation costs and user bills. However, it has been an open issue to address the DRM problem in a network of multiple utility companies and consumers where every entity is concerned about maximizing its own benefit. In this paper, we propose a Stackelberg game between utility companies and end-users to maximize the revenue of each utility company and the payoff of each user. We derive analytical results for the Stackelberg equilibrium of the game and prove that a unique solution exists.We develop a distributed algorithm which converges to the equilibrium with only local information available for both utility companies and end-users. Though DRM helps to facilitate the reliability of power supply, the smart grid can be succeptible to privacy and security issues because of communication links between the utility companies and the consumers. We study the impact of an attacker who can manipulate the price information from the utility companies.We also propose a scheme based on the concept of shared reserve power to improve the grid reliability and ensure its dependability.

This paper is to design substitution boxes (S-Boxes) using innovative I-Ching operators (ICOs) that have evolved from ancient Chinese I-Ching philosophy. These three operators-intrication, turnover, and mutual- inherited from I-Ching are specifically designed to generate S-Boxes in cryptography. In order to analyze these three operators, identity, compositionality, and periodicity measures are developed. All three operators are only applied to change the output positions of Boolean functions. Therefore, the bijection property of S-Box is satisfied automatically. It means that our approach can avoid singular values, which is very important to generate S-Boxes. Based on the periodicity property of the ICOs, a new network is constructed, thus to be applied in the algorithm for designing S-Boxes. To examine the efficiency of our proposed approach, some commonly used criteria are adopted, such as nonlinearity, strict avalanche criterion, differential approximation probability, and linear approximation probability. The comparison results show that S-Boxes designed by applying ICOs have a higher security and better performance compared with other schemes. Furthermore, the proposed approach can also be used to other practice problems in a similar way.

The optimal design of a fault-tolerant quantum computer involves finding an appropriate balance between the burden of large-scale integration of noisy components and the load of improving the reliability of hardware technology. This balance can be evaluated by quantitatively modeling the execution of quantum logic operations on a realistic quantum hardware containing limited computational resources. In this work, we report a complete performance simulation software tool capable of (1) searching the hardware design space by varying resource architecture and technology parameters, (2) synthesizing and scheduling a fault-tolerant quantum algorithm within the hardware constraints, (3) quantifying the performance metrics such as the execution time and the failure probability of the algorithm, and (4) analyzing the breakdown of these metrics to highlight the performance bottlenecks and visualizing resource utilization to evaluate the adequacy of the chosen design. Using this tool, we investigate a vast design space for implementing key building blocks of Shor’s algorithm to factor a 1,024-bit number with a baseline budget of 1.5 million qubits. We show that a trapped-ion quantum computer designed with twice as many qubits and one-tenth of the baseline infidelity of the communication channel can factor a 2,048-bit integer in less than 5 months.

By exploiting the communication infrastructure among the sensors, actuators, and control systems, attackers may compromise the security of smart-grid systems, with techniques such as denial-of-service (DoS) attack, random attack, and data-injection attack. In this paper, we present a mathematical model of the system to study these pitfalls and propose a robust security framework for the smart grid. Our framework adopts the Kalman filter to estimate the variables of a wide range of state processes in the model. The estimates from the Kalman filter and the system readings are then fed into the χ2-detector or the proposed Euclidean detector. The χ2-detector is a proven effective exploratory method used with the Kalman filter for the measurement of the relationship between dependent variables and a series of predictor variables. The χ2-detector can detect system faults/attacks, such as DoS attack, short-term, and long-term random attacks. However, the studies show that the χ2-detector is unable to detect the statistically derived false data-injection attack. To overcome this limitation, we prove that the Euclidean detector can effectively detect such a sophisticated injection attack.