Biblio

Found 15375 results

Christopher Theisen, Brendan Murphy, Kim Herzig, Laurie Williams.  Submitted.  Risk-Based Attack Surface Approximation: How Much Data is Enough? International Conference on Software Engineering (ICSE) Software Engineering in Practice (SEIP) 2017.

Proactive security reviews and test efforts are a necessary component of the software development lifecycle. Resource limitations often preclude reviewing the entire code base. Making informed decisions on what code to review can improve a team’s ability to find and remove vulnerabilities. Risk-based attack surface approximation (RASA) is a technique that uses crash dump stack traces to predict what code may contain exploitable vulnerabilities. The goal of this research is to help software development teams prioritize security efforts by the efficient development of a risk-based attack surface approximation. We explore the use of RASA using Mozilla Firefox and Microsoft Windows stack traces from crash dumps. We create RASA at the file level for Firefox, in which the 15.8% of the files that were part of the approximation contained 73.6% of the vulnerabilities seen for the product. We also explore the effect of random sampling of crashes on the approximation, as it may be impractical for organizations to store and process every crash received. We find that 10-fold random sampling of crashes at a rate of 10% resulted in 3% fewer vulnerabilities identified than using the entire set of stack traces for Mozilla Firefox. Sampling crashes in Windows 8.1 at a rate of 40% resulted in insignificant differences in vulnerability and file coverage as compared to a rate of 100%.
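
As a rough illustration of the file-level approximation described above, the sketch below treats every file that appears in at least one crash stack trace as part of the attack surface, then measures what share of the files and of the known vulnerabilities that set covers. The trace format, file names, and vulnerability counts are hypothetical toy data, not the Firefox or Windows datasets, and `sample_traces` is only one plausible reading of "random sampling of crashes".

```python
import random


def approximate_attack_surface(traces):
    """Return the set of files seen in any crash stack trace."""
    surface = set()
    for trace in traces:
        surface.update(trace)
    return surface


def coverage(surface, all_files, vulns_per_file):
    """Return (share of files on the surface, share of vulnerabilities covered)."""
    total_vulns = sum(vulns_per_file.values())
    covered = sum(n for f, n in vulns_per_file.items() if f in surface)
    file_share = len(surface & set(all_files)) / len(all_files)
    vuln_share = covered / total_vulns if total_vulns else 0.0
    return file_share, vuln_share


def sample_traces(traces, rate, seed=0):
    """Keep a random fraction `rate` of the traces (at least one)."""
    rng = random.Random(seed)
    k = max(1, round(len(traces) * rate))
    return rng.sample(traces, k)


# Hypothetical toy data: each trace lists the files on one crash stack.
ALL_FILES = ["a.c", "b.c", "c.c", "d.c", "e.c"]
TRACES = [["a.c", "b.c"], ["b.c"], ["a.c", "c.c"], ["b.c", "a.c"]]
VULNS = {"a.c": 3, "c.c": 1, "e.c": 1}

full_surface = approximate_attack_surface(TRACES)
file_share, vuln_share = coverage(full_surface, ALL_FILES, VULNS)

sampled_surface = approximate_attack_surface(sample_traces(TRACES, rate=0.5))
sampled_file_share, sampled_vuln_share = coverage(sampled_surface, ALL_FILES, VULNS)
```

Comparing `vuln_share` with `sampled_vuln_share` over several seeds gives a feel for how much coverage is lost at a given sampling rate.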

Ashiq Rahman, Ehab Al-Shaer.  Submitted.  Automated Synthesis of Resilient Network Access Controls: A Formal Framework with Refinement. IEEE Transactions of Parallel and Distributed Computing (TPDC).

Due to the extensive use of network services and emerging security threats, enterprise networks deploy varieties of security devices for controlling resource access based on organizational security requirements. These requirements need fine-grained access control rules based on heterogeneous isolation patterns like access denial, trusted communication, and payload inspection. Organizations are also seeking usable and optimal security configurations that can harden the network security within enterprise budget constraints. In order to design a security architecture, i.e., the distribution of security devices along with their security policies, that satisfies the organizational security requirements as well as the business constraints, it is required to analyze various alternative security architectures considering placements of network security devices in the network and the corresponding access controls. In this paper, we present an automated formal framework for synthesizing network security configurations. The main design alternatives include different kinds of isolation patterns for network traffic flows. The framework takes security requirements and business constraints along with the network topology as inputs. Then, it synthesizes cost-effective security configurations satisfying the constraints and provides placements of different security devices, optimally distributed in the network, according to the given network topology. In addition, we provide a hypothesis testing-based security architecture refinement mechanism that explores various security design alternatives using ConfigSynth and improves the security architecture by systematically increasing the security requirements. We demonstrate the execution of ConfigSynth and the refinement mechanism using case studies. Finally, we evaluate their scalability using simulated experiments.

[Anonymous].  Submitted.  Natural Language Processing Characterization of Recurring Calls in Public Security Services.
Extracting knowledge from unstructured data silos, a legacy of old applications, is mandatory for improving the governance of today's cities and fostering the creation of smart cities. Texts in natural language often compose such data. Nevertheless, the inference of useful information from a linguistic-computational analysis of natural language data is an open challenge. In this paper, we propose a clustering method to analyze textual data employing the unsupervised machine learning algorithms k-means and hierarchical clustering. We assess different vector representation methods for text, similarity metrics, and the number of clusters that best matches the data. We evaluate the methods using a real database of a public record service of security occurrences. The results show that the k-means algorithm using Euclidean distance extracts non-trivial knowledge, reaching up to 93% accuracy in a set of test samples while identifying the 12 most prevalent occurrence patterns.
In Press
Ignacio X. Dominguez, Jayant Dhawan, Robert St. Amant, David L. Roberts.  In Press.  Exploring the Effects of Different Text Stimuli on Typing Behavior. International Conference on Cognitive Modeling.

In this work we explore how different cognitive processes affected typing patterns through a computer game we call The Typing Game. By manipulating the players’ familiarity with the words in our game through their similarity to dictionary words, and by allowing some players to replay rounds, we found that typing speed improves with familiarity with words, and also with practice, but that these are independent of the number of mistakes that are made when typing. We also found that users who had the opportunity to replay rounds exhibited different typing patterns even before replaying the rounds.

Welk, A., Zielinska, O., Tembe, R., Xe, G., Hong, K. W., Murphy-Hill, E., Mayhorn, C. B..  In Press.  Will the “Phisher-men” Reel you in? Assessing Individual Differences in a Phishing Detection Task. International Journal of Cyber Behavior, Psychology, and Learning.

Phishing is an act of technology-based deception that targets individuals to obtain information. To minimize the number of phishing attacks, factors that influence the ability to identify phishing attempts must be examined. The present study aimed to determine how individual differences relate to performance on a phishing task. Undergraduate students completed a questionnaire designed to assess impulsivity, trust, personality characteristics, and Internet/security habits. Participants performed an email task where they had to discriminate between legitimate emails and phishing attempts. Researchers assessed performance in terms of correctly identifying all email types (overall accuracy) as well as accuracy in identifying phishing emails (phishing accuracy). Results indicated that overall and phishing accuracy each possessed unique trust, personality, and impulsivity predictors, but shared one significant behavioral predictor. These results present distinct predictors of phishing susceptibility that should be incorporated in the development of anti-phishing technology and training.

Choucri, Nazli, Agarwal, Gaurav.  2022.  International Law for Cyber Operations: Networks, Complexity, Transparency. MIT Political Science Network. :1-38.
Policy documents are usually written in text form—word after word, sentence after sentence, page after page, section after section, chapter after chapter—which often masks some of their most critical features. The text form cannot easily show interconnections among elements, identify the relative salience of issues, or represent feedback dynamics, for example. These are “hidden” features that are difficult to situate. This paper presents a computational analysis of Tallinn Manual 2.0 on the International Law Applicable to Cyber Operations, a seminal work in International Law. Tallinn Manual 2.0 is a seminal document for many reasons, including but not limited to, its (a) authoritative focus on cyber operations, (b) foundation in the fundamental legal principles of the international order and (c) direct relevance to theory, practice, and policy in international relations. The results identify the overwhelming dominance of specific Rules, the centrality of select Rules, the Rules with autonomous standing (that is, not connected to the rest of the corpus), and highlight different aspects of Tallinn Manual 2.0, notably situating authority, security of information -- the feedback structure that keeps the pieces together. This study serves as a “proof of concept” for the use of computational logics to enhance our understanding of policy documents.
Nazli Choucri, Agarwal Gaurav.  2022.  CyberIR@MIT: Knowledge for Science Policy & Practice.
CyberIR@MIT is a dynamic, interactive ontology-based knowledge system focused on the evolving, diverse & complex interconnections of cyberspace & international relations.
Sardar, Muhammad, Fetzer, Christof.  2022.  Formal Foundations for SCONE attestation and Intel SGX Data Center Attestation Primitives.
One of the essential features of confidential computing is the ability to attest to an application remotely. Remote attestation ensures that the right code is running in the correct environment. We need to ensure that all components that an adversary might use to impact the integrity, confidentiality, and consistency of an application are attested. Which components need to be attested is defined with the help of a policy. Verification of the policy is performed with the help of an attestation engine. Since remote attestation bootstraps the trust in remote applications, any vulnerability in the attestation mechanism can therefore impact the security of an application. Moreover, mistakes in the attestation policy can result in data, code, and secrets being vulnerable. Our work focuses on 1) how we can verify the attestation mechanisms and 2) how to verify the policy to ensure that data, code, and secrets are always protected.
Kara, Mustafa, Şanlıöz, Şevki Gani, Merzeh, Hisham R. J., Aydın, Muhammed Ali, Balık, Hasan Hüseyin.  2021.  Blockchain Based Mutual Authentication for VoIP Applications with Biometric Signatures. 2021 6th International Conference on Computer Science and Engineering (UBMK). :133–138.

In this study, a novel decentralized authentication model is proposed for establishing a secure communications structure in VoIP applications. The proposed scheme uses a distributed architecture, the blockchain. With this scheme, we highlight that multimedia data is more resistant to some potential attacks than under a centralized architecture. Our scheme presents the overall system authentication architecture, and it is suitable for mutual authentication in terms of privacy and anonymity. We construct an ECC-based model in the encryption infrastructure because our structure is time-constrained during communications. This study differs from prior work in that it combines blockchain platforms with an ECC-based biometric signature. We generate a biometric key for creating a unique ID value with ECC to verify the caller and device authentication together in the blockchain. We validated the proposed model by comparing it with the existing method used in VoIP applications with a centralized architecture.

Feng, Ling, Feng, Bin, Zhang, Lei, Duan, XiQiang.  2021.  Design of an Authorized Digital Signature Scheme for Sensor Network Communication in Secure Internet of Things. 2021 3rd International Symposium on Robotics Intelligent Manufacturing Technology (ISRIMT). :496–500.

With the rapid development of Internet of Things technology and sensor networks, large amounts of data face security challenges in the transmission process. In the process of data transmission, the standardization and authentication of data sources are very important. A digital signature scheme based on the bilinear pairing problem is designed. In this scheme, by signing the authorization mechanism, the management node can control the signature process and distribute data. The use of a private key segmentation mechanism can reduce the performance requirements of sensor nodes. The reasonable combination of a timestamp mechanism can ensure the time limit of the signature, which can be verified after the data is sent. It is hoped that the implementation of this scheme can improve the security of data transmission in the Internet of Things environment.

Mohan, K. Madan, Yadav, B V Ram Naresh.  2021.  Dynamic Graph Based Encryption Scheme for Cloud Based Services and Storage. 2021 9th International Conference on Cyber and IT Service Management (CITSM). :1–4.

Cloud security includes the strategies which work together to guard data and infrastructure with a set of policies, procedures, controls and technologies. These security events are arranged to protect cloud data, support supervisory obedience and protect customers' privacy as well as setting endorsement rules for individual users and devices. The partition-based handling and encryption mechanism provides fine-grained admittance control and protected data sharing to the data users in cloud computing. Graph partition problems fall under the category of NP-hard problems. Resolutions to these problems are generally imitative using heuristics and approximation algorithms. The partition problem strategy is used in bi-criteria approximation or resource augmentation approaches with a common extension of hyper graphs, which can address the storage hierarchy.

Shi, Jibo, Lin, Yun, Zhang, Zherui, Yu, Shui.  2021.  A Hybrid Intrusion Detection System Based on Machine Learning under Differential Privacy Protection. 2021 IEEE 94th Vehicular Technology Conference (VTC2021-Fall). :1–6.

With the development of networks, network security has become a topic of increasing concern. In recent years, machine learning technology has become an effective means of network intrusion detection. However, machine learning technology requires a large amount of data for training, and training data often contains private information, which brings a great risk of privacy leakage. At present, there is little research on data privacy protection in the field of intrusion detection. Regarding the issue of privacy and security, we combine differential privacy and machine learning algorithms, including One-class Support Vector Machine (OCSVM) and Local Outlier Factor (LOF), to propose a hybrid intrusion detection system (IDS) with privacy protection. We add Laplacian noise to the original network intrusion detection data set to get differential privacy data sets with different privacy budgets, and propose a hybrid IDS model based on machine learning to verify their utility. Experiments show that while protecting data privacy, the hybrid IDS can achieve detection accuracy comparable to traditional machine learning algorithms.
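
The noise-adding step mentioned above can be sketched as follows. This is a minimal illustration of the Laplace mechanism, not the authors' implementation: the function name `perturb`, the toy records, and the choice to treat `epsilon` as a per-value budget with a fixed `sensitivity` are all assumptions, since the abstract does not say how the privacy budget is allocated across features.

```python
import random


def perturb(records, sensitivity, epsilon, seed=None):
    """Add Laplace(0, sensitivity / epsilon) noise to every numeric value.

    A smaller epsilon (a tighter privacy budget) gives a larger noise scale.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = random.Random(seed)
    rate = epsilon / sensitivity  # inverse of the Laplace scale
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    return [
        [x + rng.expovariate(rate) - rng.expovariate(rate) for x in row]
        for row in records
    ]


# Hypothetical feature vectors scaled to [0, 1], so sensitivity is taken as 1.0.
RECORDS = [[0.2, 0.5], [0.9, 0.1]]
noisy = perturb(RECORDS, sensitivity=1.0, epsilon=0.5, seed=42)
```

Generating one noisy copy of the data per privacy budget and training the same detector on each copy is one way to compare utility across budgets.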

Rodigari, Simone, O'Shea, Donna, McCarthy, Pat, McCarry, Martin, McSweeney, Sean.  2021.  Performance Analysis of Zero-Trust Multi-Cloud. 2021 IEEE 14th International Conference on Cloud Computing (CLOUD). :730–732.
The Zero Trust security model makes it possible to secure cloud native applications while encrypting all network communication, authenticating, and authorizing every request. The service mesh can enable Zero Trust using a side-car proxy without changes to the application code. To the best of our knowledge, no previous work has provided a performance analysis of Zero Trust in a multi-cloud environment. This paper proposes a multi-cloud framework and a testing workflow to analyse performance of the data plane under load and the impact on the control plane, when Zero Trust is enabled. The results of preliminary tests show that Istio has reduced latency variability in responding to sequential HTTP requests. Results also reveal that the overall CPU and memory usage can increase based on service mesh configuration and the cloud environment.
Loya, Jatan, Bana, Tejas.  2021.  Privacy-Preserving Keystroke Analysis using Fully Homomorphic Encryption & Differential Privacy. 2021 International Conference on Cyberworlds (CW). :291–294.

Keystroke dynamics is a behavioural biometric form of authentication based on the inherent typing behaviour of an individual. While this technique is gaining traction, protecting the privacy of the users is of utmost importance. Fully Homomorphic Encryption is a technique that allows performing computation on encrypted data, which enables processing of sensitive data in an untrusted environment. FHE is also known to be “future-proof” since it is a lattice-based cryptosystem that is regarded as quantum-safe. It has seen significant performance improvements over the years with substantially increased developer-friendly tools. We propose a neural network for keystroke analysis trained using differential privacy to speed up training while preserving privacy and predicting on encrypted data using FHE to keep the users' privacy intact while offering sufficient usability.

Feng, Tianyi, Zhang, Zhixiang, Wong, Wai-Choong, Sun, Sumei, Sikdar, Biplab.  2021.  A Privacy-Preserving Pedestrian Dead Reckoning Framework Based on Differential Privacy. 2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). :1487–1492.

Pedestrian dead reckoning (PDR) is a widely used approach to estimate locations and trajectories. Accessing location-based services with trajectory data can bring convenience to people, but may also raise privacy concerns that need to be addressed. In this paper, a privacy-preserving pedestrian dead reckoning framework is proposed to protect a user’s trajectory privacy based on differential privacy. We introduce two metrics to quantify trajectory privacy and data utility. Our proposed privacy-preserving trajectory extraction algorithm consists of three mechanisms for the initial locations, stride lengths and directions. In addition, we design an adversary model based on particle filtering to evaluate the performance and demonstrate the effectiveness of our proposed framework with our collected sensor reading dataset.

Pisharody, Sandeep, Bernays, Jonathan, Gadepally, Vijay, Jones, Michael, Kepner, Jeremy, Meiners, Chad, Michaleas, Peter, Tse, Adam, Stetson, Doug.  2021.  Realizing Forward Defense in the Cyber Domain. 2021 IEEE High Performance Extreme Computing Conference (HPEC). :1–7.

With the recognition of cyberspace as an operating domain, concerted effort is now being placed on addressing it in the whole-of-domain manner found in land, sea, undersea, air, and space domains. Among the first steps in this effort is applying the standard supporting concepts of security, defense, and deterrence to the cyber domain. This paper presents an architecture that helps realize forward defense in cyberspace, wherein adversarial actions are repulsed as close to the origin as possible. However, substantial work remains in making the architecture an operational reality, including furthering fundamental research in cyber science, conducting design trade-off analysis, and developing appropriate public policy frameworks.

Bertino, Elisa, Brancik, Kenneth.  2021.  Services for Zero Trust Architectures - A Research Roadmap. 2021 IEEE International Conference on Web Services (ICWS). :14–20.
The notion of Zero Trust Architecture (ZTA) has been introduced as a fine-grained defense approach. It assumes that no entities outside and inside the protected system can be trusted and therefore requires articulated and high-coverage deployment of security controls. However, ZTA is a complex notion which does not have a single design solution; rather it consists of numerous interconnected concepts and processes that need to be assessed prior to deciding on a solution. In this paper, we outline a ZTA design methodology based on cyber risks and the identification of known high security risks. We then discuss challenges related to the design and deployment of ZTA and related solutions. We also discuss the role that service technology can play in ZTA.
Mehner, Luise, Voigt, Saskia Nuñez von, Tschorsch, Florian.  2021.  Towards Explaining Epsilon: A Worst-Case Study of Differential Privacy Risks. 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). :328–331.

Differential privacy is a concept to quantify the disclosure of private information that is controlled by the privacy parameter ε. However, an intuitive interpretation of ε is needed to explain the privacy loss to data engineers and data subjects. In this paper, we conduct a worst-case study of differential privacy risks. We generalize an existing model and reduce complexity to provide more understandable statements on the privacy loss. To this end, we analyze the impact of parameters and introduce the notion of a global privacy risk and global privacy leak.
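
One common way to give ε an intuitive reading is the odds-ratio bound that follows directly from the definition of ε-differential privacy: an adversary's posterior odds that a given record is in the data can be at most e^ε times the prior odds. The sketch below computes that bound. It is offered only as an illustration of a worst-case interpretation; it is not the paper's model, and the function name `worst_case_posterior` is an assumption.

```python
import math


def worst_case_posterior(prior, epsilon):
    """Upper bound on an adversary's posterior belief under epsilon-DP.

    Posterior odds <= exp(epsilon) * prior odds, rewritten as a probability.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    # Written with exp(-epsilon) so that large epsilon cannot overflow.
    return 1.0 / (1.0 + (1.0 - prior) / prior * math.exp(-epsilon))


# With a 50% prior, epsilon = ln(3) lets the belief rise to at most about 75%.
bound = worst_case_posterior(prior=0.5, epsilon=math.log(3))
```

With ε = 0 the bound equals the prior, and as ε grows the bound approaches 1, which matches the usual reading of ε as a cap on how much an observer can learn.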

Tekgul, Buse G. A., Xia, Yuxi, Marchal, Samuel, Asokan, N..  2021.  WAFFLE: Watermarking in Federated Learning. 2021 40th International Symposium on Reliable Distributed Systems (SRDS). :310–320.

Federated learning is a distributed learning technique where machine learning models are trained on client devices in which the local training data resides. The training is coordinated via a central server which is, typically, controlled by the intended owner of the resulting model. By avoiding the need to transport the training data to the central server, federated learning improves privacy and efficiency. But it raises the risk of model theft by clients because the resulting model is available on every client device. Even if the application software used for local training may attempt to prevent direct access to the model, a malicious client may bypass any such restrictions by reverse engineering the application software. Watermarking is a well-known deterrence method against model theft by providing the means for model owners to demonstrate ownership of their models. Several recent deep neural network (DNN) watermarking techniques use backdooring: training the models with additional mislabeled data. Backdooring requires full access to the training data and control of the training process. This is feasible when a single party trains the model in a centralized manner, but not in a federated learning setting where the training process and training data are distributed among several client devices. In this paper, we present WAFFLE, the first approach to watermark DNN models trained using federated learning. It introduces a retraining step at the server after each aggregation of local models into the global model. We show that WAFFLE efficiently embeds a resilient watermark into models incurring only negligible degradation in test accuracy (-0.17%), and does not require access to training data. We also introduce a novel technique to generate the backdoor used as a watermark. 
It outperforms prior techniques, imposing no communication overhead and low computational overhead (+3.2%).

Zhang, Maojun, Zhu, Guangxu, Wang, Shuai, Jiang, Jiamo, Zhong, Caijun, Cui, Shuguang.  2021.  Accelerating Federated Edge Learning via Optimized Probabilistic Device Scheduling. 2021 IEEE 22nd International Workshop on Signal Processing Advances in Wireless Communications (SPAWC). :606–610.
The popular federated edge learning (FEEL) framework allows privacy-preserving collaborative model training via frequent learning-updates exchange between edge devices and server. Due to the constrained bandwidth, only a subset of devices can upload their updates at each communication round. This has led to an active research area in FEEL studying the optimal device scheduling policy for minimizing communication time. However, owing to the difficulty in quantifying the exact communication time, prior work in this area can only tackle the problem partially by considering either the communication rounds or per-round latency, while the total communication time is determined by both metrics. To close this gap, we make the first attempt in this paper to formulate and solve the communication time minimization problem. We first derive a tight bound to approximate the communication time through cross-disciplinary effort involving both learning theory for convergence analysis and communication theory for per-round latency analysis. Building on the analytical result, an optimized probabilistic scheduling policy is derived in closed-form by solving the approximate communication time minimization problem. It is found that the optimized policy gradually turns its priority from suppressing the remaining communication rounds to reducing per-round latency as the training process evolves. The effectiveness of the proposed scheme is demonstrated via a use case on collaborative 3D object detection in autonomous driving.
Li, Xiaojian, Chen, Jing, Jiang, Yiyi, Hu, Hangping, Yang, Haopeng.  2021.  An Accountability-Oriented Generation approach to Time-Varying Structure of Cloud Service. 2021 IEEE International Conference on Services Computing (SCC). :413–418.
In current cloud service development, as cloud services become widely used, a service can self-organize and respond on demand when it fails or commits a violation, but this may still cause violations. The first step in forecasting or assigning accountability for this situation is to generate a dynamic structure of cloud services in a timely manner. This research presents a method to generate the time-varying structure of a cloud service. First, dependencies between tasks, and even between instances within a job of the cloud service, are visualized to explore the time-varying characteristics contained in the cloud service structure. Then, those dependencies are discovered quantitatively using CNNs (Convolutional Neural Networks). Finally, they are structured into an event network of the cloud service for tracing violations and other uses. The approach was validated by an experiment based on Alibaba’s dataset. The function integrity of this approach may reach up to 0.80, which is higher than that of Bai Y and others, which is no more than 0.60.
Rathod, Viraj, Parekh, Chandresh, Dholariya, Dharati.  2021.  AI & ML Based Anamoly Detection and Response Using Ember Dataset. 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO). :1–5.
In the era of rapid technological growth, malicious traffic has drawn increased attention. Most well-known offensive security assessments today are heavily focused on pre-compromise. The amount of anomalous data in today's context is massive, and analyzing the data using primitive methods would be highly challenging. The solution is this: if we can detect adversary behaviors in the early stage of compromise, we can prevent and safeguard against various attacks, including ransomware and zero-day attacks. Integrating the new technologies of Artificial Intelligence & Machine Learning with manual Anomaly Detection can provide automated machine-based detection, which in turn can provide a fast, error-free, simple & scalable Threat Detection & Response System. Endpoint Detection & Response (EDR) tools provide a unified view of complex intrusions using known adversarial behaviors to identify intrusion events. We have used the EMBER dataset, which is a labelled benchmark dataset. It is used to train machine learning models to detect malicious portable executable files. This dataset consists of features derived from 1.1 million binary files: 900,000 training samples, among which 300,000 were malicious, 300,000 were benign, and 300,000 were unlabelled, and 200,000 evaluation samples, among which 100K were malicious and 100K were benign. We have also included open-source code for extracting features from additional binaries, enabling the addition of additional sample features to the dataset.