Provenance 2015


Provenance refers to information about the origin and activities of system data and processes.  With the growth of shared services and systems, including social media, cloud computing, and service-oriented architectures, finding tamperproof methods for tracking files is a major challenge. Research into the security of software of unknown provenance (SOUP) is also included.  Provenance is important to the Science of Security relative to human behavior, metrics, resilience, and composability.  The work cited here was presented in 2015.

Jiun Yi Yap; Tomlinson, A., "Provenance-Based Attestation for Trustworthy Computing," in Trustcom/BigDataSE/ISPA, 2015 IEEE, vol. 1, pp. 630-637, 20-22 Aug. 2015. doi: 10.1109/Trustcom.2015.428

Abstract: We present a new approach to the attestation of a computer's trustworthiness that is founded on provenance data of its key components. The prevailing method of attestation relies on comparing integrity measurements of the key components of a computer against a reference database of trustworthy integrity measurements. An integrity measurement is obtained by passing a software binary or any component through a hash function, but this value carries little information unless there is a reference database. On the other hand, the semantics of provenance contain more details. They include expressive information such as the component's history and its causal dependencies with other elements of a computer. Hence, we argue that provenance data can be used as evidence of trustworthiness during attestation. In this paper, we describe a complete design for provenance-based attestation. The design development is guided by goals and it covers all the phases of this approach. We discuss collecting provenance data and using the PROV data model to represent provenance data. To determine if provenance data of a component can provide evidence of its trustworthiness, we have developed a rule specification grammar and provided a discourse on using the rules. We then build the key mechanisms of this form of attestation by exploring approaches to capture provenance data and by looking at transforming the trust evaluation rules into the XQuery language before running the rules against an XML-based record of provenance data. Finally, the design is analyzed using threat modelling.

Keywords: XML; data models; trusted computing; PROV data model; XML based provenance data record; XQuery language; attestation prevailing method; computer trustworthiness attestation; hash function; key components; provenance data representation; provenance semantics; provenance-based attestation; rule specification grammar; software binary; threat modelling; trust evaluation rules; trustworthiness; trustworthy computing; trustworthy integrity measurements; Computational modeling; Computers; Data models; Databases; Semantics; Software; Software measurement; attestation; provenance; trustworthy computing (ID#: 15-8519)
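
Illustrative note (not from the paper): the prevailing attestation method that the abstract contrasts itself with can be sketched in a few lines. The choice of SHA-256, the function names, and the sample byte strings are assumptions made for this example; the abstract does not name a specific hash function or data format.

```python
import hashlib


def measure(component: bytes) -> str:
    """Integrity measurement: the SHA-256 digest of a component (hash choice is an assumption)."""
    return hashlib.sha256(component).hexdigest()


def attest(component: bytes, reference_db: set) -> bool:
    """Hash-based attestation: trusted only if the measurement is in the reference database."""
    return measure(component) in reference_db


# Hypothetical reference database holding one known-good measurement.
trusted_binary = b"\x7fELF trusted build 1.0"
reference_db = {measure(trusted_binary)}

print(attest(trusted_binary, reference_db))          # known-good binary
print(attest(b"\x7fELF patched build", reference_db))  # any other binary
```

The sketch shows the limitation the abstract points out: the digest alone says nothing about a component's history, so a verdict is only possible when the reference database already holds that exact measurement.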



Bany Taha, M.M.; Chaisiri, S.; Ko, R.K.L., "Trusted Tamper-Evident Data Provenance," in Trustcom/BigDataSE/ISPA, 2015 IEEE, vol. 1, pp. 646-653, 20-22 Aug. 2015. doi: 10.1109/Trustcom.2015.430

Abstract: Data provenance, the origin and derivation history of data, is commonly used for security auditing, forensics and data analysis. While provenance loggers provide evidence of data changes, the integrity of the provenance logs is also critical for the integrity of the forensics process. However, to the best of our knowledge, few solutions are able to fully satisfy this trust requirement. In this paper, we propose a framework to enable tamper-evidence and preserve the confidentiality and integrity of data provenance using the Trusted Platform Module (TPM). Our framework also stores provenance logs in trusted and backup servers to guarantee the availability of data provenance. Tampered provenance logs can be discovered and consequently recovered by retrieving the original logs from the servers. Leveraging the TPM's technical capability, our framework guarantees that the data provenance collected is admissible, complete, and confidential. More importantly, this framework can be applied to capture tampering evidence in large-scale cloud environments at system, network, and application granularities. We applied our framework to provide tamper-evidence for Progger, a cloud-based, kernel-space logger. Our results demonstrate the ability to conduct remote attestation of the integrity of Progger logs, and to uphold the completeness, confidentiality, and admissibility requirements.

Keywords: cloud computing; data analysis; digital forensics; file servers; trusted computing; Progger log integrity; TPM; backup server; cloud environments; cloud-based logger; data analysis; data provenance confidentiality; data provenance integrity; forensic process analysis; kernel-space logger; provenance logger integrity; security auditing; trusted platform module; trusted server; trusted tamper-evident data provenance; Cloud computing; Generators; Kernel; Reliability; Runtime; Servers; Virtual machining; Accountability in Cloud Computing; Cloud Computing; Data Provenance; Data Security; Remote Attestation; Tamper Evidence; Trusted Computing; Trusted Platform Module (ID#: 15-8520)
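
Illustrative note (not from the paper): the framework above relies on the TPM, which is not reproduced here. As a simpler stand-in for the general idea of tamper evidence, the sketch below uses a plain hash chain, in which each record's digest also covers the previous digest. The function names, the all-zero starting value, and the sample log entries are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the chain


def chain(entries):
    """Pair each log entry with a digest that covers the entry and the previous digest."""
    prev, log = GENESIS, []
    for entry in entries:
        prev = hashlib.sha256((prev + entry).encode("utf-8")).hexdigest()
        log.append((entry, prev))
    return log


def first_tampered(log):
    """Return the index of the first record whose digest does not verify, or None."""
    prev = GENESIS
    for index, (entry, digest) in enumerate(log):
        expected = hashlib.sha256((prev + entry).encode("utf-8")).hexdigest()
        if digest != expected:
            return index
        prev = digest
    return None


log = chain(["open /data/a", "write /data/a", "close /data/a"])
print(first_tampered(log))                 # intact log
log[1] = ("write /data/b", log[1][1])      # alter an entry, keep its old digest
print(first_tampered(log))                 # position of the altered record
```

A hash chain only makes tampering detectable; recovering the original records, as the abstract describes, would still require a trusted copy held elsewhere.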



Liang Chen; Edwards, P.; Nelson, J.D.; Norman, T.J., "An Access Control Model for Protecting Provenance Graphs," in Privacy, Security and Trust (PST), 2015 13th Annual Conference on, pp. 125-132, 21-23 July 2015. doi: 10.1109/PST.2015.7232963

Abstract: Securing provenance has recently become an important research topic, resulting in a number of models for protecting access to provenance. Existing work has focused on graph transformation mechanisms that supply a user with a provenance view that satisfies both access control policies and validity constraints of provenance. However, it is not always possible to satisfy both of them simultaneously, because these two conditions are often inconsistent, which requires sophisticated conflict resolution strategies to be put in place. In this paper we develop a new access control model tailored for provenance. In particular, we explicitly take into account validity constraints of provenance when specifying certain parts of provenance to which access is restricted. Hence, a provenance view that is granted to a user by our authorisation mechanism would automatically satisfy the validity constraints. Moreover, we propose algorithms that allow provenance owners to deploy fine-grained access control for their provenance data.

Keywords: authorisation; graph theory; access control model; access control policy; authorisation mechanism; fine-grained access control; graph transformation mechanism; provenance graph; provenance security; Authorization; Computers; Data models; Object recognition; Transforms (ID#: 15-8521)



Taotao Ma; Hua Wang; Jianming Yong; Yueai Zhao, "Causal Dependencies of Provenance Data in Healthcare Environment," in Computer Supported Cooperative Work in Design (CSCWD), 2015 IEEE 19th International Conference on, pp. 643-648, 6-8 May 2015. doi: 10.1109/CSCWD.2015.7231033

Abstract: The Open Provenance Model (OPM) is a provenance model that can capture provenance data in terms of causal dependencies among the provenance data model components. Causal dependencies are relationships between an event (the cause) and a second event (the effect), where the second event is understood as a physical consequence of the first. Causal dependencies can represent a set of entities that are necessary and sufficient to explain the presence of another entity. A provenance model is able to describe the provenance of any data at an abstract layer, but does not explicitly capture causal dependencies, which is a vital challenge given the lack of such relations in OPM, especially in healthcare environments. In this paper, we analyse the causal dependencies between entities in a medical workflow system with OPM graphs.

Keywords: authorisation; causality; graph theory; health care; medical information systems; open systems; OPM graph; access control; causal dependency; health care environment; medical workflow system; open provenance model; provenance data; Artificial intelligence; Blood pressure; Kidney; Lifting equipment; Medical services; Registers; access control; causal dependencies; provenance; security (ID#: 15-8522)
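
Illustrative note (not from the paper): an OPM graph can be held as a list of edges that point from an effect to its cause, and the transitive causes of an item found by a breadth-first walk. The edge labels used below (used, wasGeneratedBy, wasControlledBy) are OPM dependency types; the medical workflow nodes and the function name are hypothetical.

```python
from collections import deque

# Each edge is (effect, OPM relation, cause). The nodes are made up for illustration.
EDGES = [
    ("prescription", "wasGeneratedBy", "prescribe"),
    ("prescribe", "used", "diagnosis"),
    ("prescribe", "wasControlledBy", "doctor"),
    ("diagnosis", "wasGeneratedBy", "examine"),
    ("examine", "used", "blood_pressure_reading"),
]


def causes(node, edges):
    """Return every node that the given node causally depends on, directly or transitively."""
    graph = {}
    for effect, _relation, cause in edges:
        graph.setdefault(effect, []).append(cause)
    seen, queue = set(), deque([node])
    while queue:
        current = queue.popleft()
        for cause in graph.get(current, []):
            if cause not in seen:
                seen.add(cause)
                queue.append(cause)
    return seen


print(sorted(causes("prescription", EDGES)))
```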



Mohy, N.N.; Mokhtar, H.M.O.; El-Sharkawi, M.E., "Delegation Enabled Provenance-based Access Control Model," in Science and Information Conference (SAI), 2015, pp. 1374-1379, 28-30 July 2015. doi: 10.1109/SAI.2015.7237321

Abstract: Any organization aims to achieve its business objectives, secure its information, and conform to policies and regulations. Provenance can help organizations achieve these goals. As provenance stores the history of the organization's workflow, it can be used for auditing, compliance, checking errors and securing the business. Provenance Based Access Control (PBAC) is one of the new access control models that are used to secure data based on its provenance. This paper introduces the Delegation Provenance based Access Control (DPBAC) model, which accounts for the delegation of access rights, and also introduces an extension to the Open Provenance Model (OPM) in order to store the history of the delegation to be used for auditing purposes.

Keywords: authorisation; open systems; DPBAC model; OPM; access control model; auditing purpose; delegation provenance based access control; information security; open provenance model; Access control; Data models; History; Organizations; Permission; Process control; Standards organizations; OPM; Provenance; access control; delegation (ID#: 15-8523)



Cuzzocrea, A., "Provenance Research Issues and Challenges in the Big Data Era," in Computer Software and Applications Conference (COMPSAC), 2015 IEEE 39th Annual, vol. 3, pp. 684-686, 1-5 July 2015. doi: 10.1109/COMPSAC.2015.345

Abstract: Provenance of Big Data is a hot topic in the database and data mining research communities. Basically, provenance is the process of detecting the lineage and the derivation of data and data objects, and it plays a major role in database management systems as well as in workflow management systems and distributed systems. Despite this, provenance of big data research is still in its embryonic phase, and much work remains to be done in this area. Inspired by these considerations, in this paper we provide an overview of relevant issues and challenges in the context of big data provenance research, also highlighting possible future efforts within these research directions.

Keywords: Big Data; data mining; database management systems; distributed processing; big data era; big data provenance; data mining research communities; database management systems; distributed systems; embryonic phase; provenance research issues; workflow management systems; Big data; Computational modeling; Conferences; Context; Data privacy; Databases; Security; Data Provenance; Provenance of Big Data (ID#: 15-8524)



Katilu, V.M.; Franqueira, V.N.L.; Angelopoulou, O., "Challenges of Data Provenance for Cloud Forensic Investigations," in Availability, Reliability and Security (ARES), 2015 10th International Conference on, pp. 312-317, 24-27 Aug. 2015. doi: 10.1109/ARES.2015.54

Abstract: Cloud computing has gained popularity due to its efficiency, robustness and cost effectiveness. Carrying out digital forensic investigations in the cloud is currently a relevant and open issue. The root of this issue is the fact that servers cannot be physically accessed, coupled with the dynamic and distributed nature of cloud computing with regards to data processing and storage. This renders traditional methods of evidence collection impractical. The use of provenance data in cloud forensics is critical as it provides forensic investigators with data history in terms of people, entities and activities involved in producing related data objects. Therefore, cloud forensics requires effective provenance collection mechanisms. This paper provides an overview of current provenance challenges in cloud computing and identifies limitations of current provenance collection mechanisms. Recommendations for additional research in digital provenance for cloud forensics are also presented.

Keywords: cloud computing; digital forensics; cloud computing; cloud digital forensic investigation; data history; data objects; data processing; data provenance; data storage; evidence collection; provenance collection mechanism; Cloud computing; Forensics; Kernel; Monitoring; Reliability; Security; Servers; Cloud Computing; Cloud Forensics; Provenance (ID#: 15-8525)



Cong Liao; Squicciarini, A., "Towards Provenance-Based Anomaly Detection in MapReduce," in Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on, pp. 647-656, 4-7 May 2015. doi: 10.1109/CCGrid.2015.16

Abstract: MapReduce enables parallel and distributed processing of vast amounts of data on a cluster of machines. However, this computing paradigm is subject to threats posed by malicious and cheating nodes or compromised user-submitted code that could tamper with data and computation, since users maintain little control as the computation is carried out in a distributed fashion. In this paper, we focus on the analysis and detection of anomalies during the process of MapReduce computation. Accordingly, we develop a computational provenance system that captures provenance data related to MapReduce computation within the MapReduce framework in Hadoop. In particular, we identify a set of invariants against aggregated provenance information, which are later analyzed to uncover anomalies indicating possible tampering of data and computation. We conduct a series of experiments to show the efficiency and effectiveness of our proposed provenance system.

Keywords: data analysis; parallel processing; security of data; Hadoop; MapReduce computation; computational provenance system; data tampering; provenance-based anomaly detection; Access control; Cloud computing; Containers; Distributed databases; Monitoring; Yarn; MapReduce; computation integrity; logging; provenance (ID#: 15-8526)
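
Illustrative note (not from the paper): the abstract does not list its invariants, so the two checked below are assumptions chosen only to show what checking invariants against aggregated provenance can look like. The record layout, field names, and invariant names are hypothetical, and the record-count invariant assumes a job without a combiner.

```python
def violated_invariants(job):
    """Return the names of the (assumed) invariants that a job's aggregated provenance breaks."""
    violations = []

    # Assumed invariant 1: every input split is processed by exactly one map task.
    processed = sorted(task["split"] for task in job["map_tasks"])
    if processed != sorted(job["input_splits"]):
        violations.append("each_split_processed_once")

    # Assumed invariant 2: records emitted by all map tasks equal records read by all reduce tasks.
    emitted = sum(task["records_out"] for task in job["map_tasks"])
    consumed = sum(task["records_in"] for task in job["reduce_tasks"])
    if emitted != consumed:
        violations.append("map_output_equals_reduce_input")

    return violations


honest_job = {
    "input_splits": ["s0", "s1"],
    "map_tasks": [
        {"split": "s0", "records_out": 3},
        {"split": "s1", "records_out": 2},
    ],
    "reduce_tasks": [{"records_in": 5}],
}
# Same job, but one record went missing between the map and reduce phases.
tampered_job = dict(honest_job, reduce_tasks=[{"records_in": 4}])

print(violated_invariants(honest_job))
print(violated_invariants(tampered_job))
```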



Khan, R.; Hasan, R., "Fuzzy Authentication Using Interaction Provenance in Service Oriented Computing," in Services Computing (SCC), 2015 IEEE International Conference on, pp. 170-177, June 27 2015-July 2 2015. doi: 10.1109/SCC.2015.32

Abstract: In service oriented computing, authentication factors have their vulnerabilities when considered exclusively. Cross-platform and service composition architectures require a complex integration procedure and limit adoptability of newer authentication models. Authentication is generally based on a binary success or failure and relies on credentials proffered at the present moment without considering how or when the credentials were obtained by the subject. The resulting access control engines suffer from rigid service policies and complexity of management. In contrast, social authentication is based on the nature, quality, and length of previous encounters with each other. We posit that human-to-machine authentication is a similar causal effect of an earlier interaction with the verifying party. We use this notion to propose interaction provenance as the only unified representation model for all authentication factors in service oriented computing. Interaction provenance uses the causal relationship of past events to leverage service composition, cross-platform integration, timeline authentication, and easier adoption of newer methods. We extend our model with fuzzy authentication using past interactions and linguistic policies. The paper presents an interaction provenance recording and authentication protocol and a proof-of-concept implementation with extensive experimental evaluation.

Keywords: fuzzy set theory; security of data; service-oriented architecture; authentication factors; authentication protocol; complex integration procedure; cross-platform architectures; cross-platform integration; fuzzy authentication; human-to-machine authentication; interaction provenance; interaction provenance recording; leverage service composition; proof-of-concept implementation; service composition architectures; service oriented computing; social authentication; timeline authentication; Access control; Authentication; Computational modeling; IP networks; Pragmatics; Protocols; Servers; Access Control; Authentication; Events; Fuzzy; Interaction Provenance; Persona; Security (ID#: 15-8527)



Levchuk, G.; Blasch, E., "Probabilistic Graphical Models for Multi-Source Fusion from Text Sources," in Computational Intelligence for Security and Defense Applications (CISDA), 2015 IEEE Symposium on, pp. 1-10, 26-28 May 2015. doi: 10.1109/CISDA.2015.7208640

Abstract: In this paper we present probabilistic graph fusion algorithms to support information fusion and reasoning over multi-source text media. Our methods resolve misinformation by combining knowledge similarity analysis and conflict identification with source characterization. For experimental purposes, we used a dataset of articles about the current military conflict in Eastern Ukraine. We show that automated knowledge fusion and conflict detection are feasible and that high detection accuracy can be obtained. However, to correctly classify mismatched knowledge fragments as misinformation versus additionally reported facts, the knowledge reliability and credibility must be assessed. Since the true knowledge must be reported by many reliable sources, we compute knowledge frequency and source reliability by incorporating knowledge provenance and analyzing historical consistency between the knowledge reported by the sources in our dataset.

Keywords: information dissemination; pattern classification; probability; reliability; sensor fusion; Eastern Ukraine; information fusion; knowledge credibility; knowledge fusion; knowledge reliability; knowledge similarity analysis; mismatched knowledge fragment classification; multisource fusion; multisource text media; probabilistic graph fusion algorithm; probabilistic graphical model; source characterization; source reliability; Data mining; Government; Information retrieval; Joints; Media; Probabilistic logic; Semantics; graphical fusion; information wars; knowledge graph; misinformation detection; multi-source fusion; open source exploitation; situation assessment (ID#: 15-8528)



Christou, C.T.; Jacyna, G.M.; Goodman, F.J.; Deanto, D.G.; Masters, D., "Geolocation Analysis Using Maxent And Plant Sample Data," in Technologies for Homeland Security (HST), 2015 IEEE International Symposium on, pp. 1-6, 14-16 April 2015. doi: 10.1109/THS.2015.7225273

Abstract: A study was conducted to assess the feasibility of geolocation based on correctly identifying pollen samples found on goods or people for purposes of compliance with U.S. import laws and criminal forensics. The analysis was based on Neotropical plant data sets from the Global Biodiversity Information Facility. The data were processed through the software algorithm Maxent that calculates plant probability geographic distributions of maximum entropy, subject to constraints. Derivation of single and joint continuous probability densities of geographic points, for single and multiple taxa occurrences, were performed. Statistical metrics were calculated directly from the output of Maxent for single taxon probabilities and were mathematically derived for joint taxa probabilities. Predictions of likeliest geographic regions at a given probability percentage level were made, along with the total corresponding geographic ranges. We found that joint probability distributions greatly restrict the areas of possible provenance of pollen samples.

Keywords: entropy; geographic information systems; law; sampled data systems; statistical distributions; Maxent; Neotropical plant data sets; U.S. import laws; criminal forensics; geolocation analysis; global biodiversity information facility; joint probability distributions; maximum entropy; plant sample data; pollen samples; probability geographic distributions; software algorithm; statistical metrics; Geology; Joints; Logistics; Measurement; Probability distribution; Standards; Neotropics; environmental variables; forensics geolocation; marginal and joint probability distributions; maximum entropy; plant occurrences; pollen analytes; statistical metrics (ID#: 15-8529)
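
Illustrative note (not from the paper): the final claim of the abstract, that joint distributions narrow the area of possible provenance, can be seen on a toy grid. The sketch assumes the taxa occur independently, so the joint distribution is the normalized cell-by-cell product; the paper's own derivation and Maxent's output format are not reproduced. The four-cell grid, the probabilities, and the function names are made up.

```python
def joint(distributions):
    """Normalized cell-by-cell product of per-taxon distributions (independence is assumed)."""
    product = [1.0] * len(distributions[0])
    for dist in distributions:
        product = [p * q for p, q in zip(product, dist)]
    total = sum(product)
    return [p / total for p in product]


def cells_for_level(dist, level):
    """Smallest number of highest-probability cells whose total probability reaches the level."""
    covered, count = 0.0, 0
    for p in sorted(dist, reverse=True):
        covered += p
        count += 1
        if covered >= level:
            break
    return count


# Hypothetical occurrence probabilities for two taxa over the same four grid cells.
taxon_a = [0.4, 0.4, 0.1, 0.1]
taxon_b = [0.1, 0.4, 0.4, 0.1]

print(cells_for_level(taxon_a, 0.6))                    # cells needed using taxon A alone
print(cells_for_level(joint([taxon_a, taxon_b]), 0.6))  # cells needed using both taxa
```

On this grid each taxon alone needs two cells to reach the 60% level, while the joint distribution concentrates enough probability in the single cell where both taxa are likely.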



Dogan, G.; Avincan, K.; Brown, T., "Provenance and Trust as New Factors for Self-Organization in a Wireless Sensor Network," in Signal Processing and Communications Applications Conference (SIU), 2015 23th, pp. 544-547, 16-19 May 2015. doi: 10.1109/SIU.2015.7129881

Abstract: Trust can be an important component of wireless sensor networks for believability of the produced data and trust history is a crucial asset in deciding trust of the data. In our previous work, we developed an architecture called ProTru and we showed how provenance can be used for registering previous trust records and other information such as node type, data type, node location, average of historical data. We designed a distributed trust enhancing architecture using only local provenance during sensor fusion with a low communication overhead. Our network is cognitive in the sense that our system reacts automatically upon detecting low trust and restructures itself. In this work, we are extending our previous architecture by storing dataflow provenance graphs. This feature will enhance the cognitive abilities of our system by giving the network the capability of remembering past network snapshots.

Keywords: graph theory; sensor fusion; telecommunication security; wireless sensor networks; ProTru architecture; cognitive abilities; data type; dataflow provenance graphs; distributed trust enhancing architecture; historical data; local provenance; low communication overhead; node location; node type; past network snapshots; self-organization; sensor fusion; trust records; wireless sensor network; Cities and towns; Conferences; History; Military communication; Mobile communication; Security; Wireless sensor networks; Distributed Intelligence; Provenance; Self Organization; Trust; Wireless Sensor Networks (ID#: 15-8530)



Xin Li; Joshi, C.; Tan, A.Y.S.; Ko, R.K.L., "Inferring User Actions from Provenance Logs," in Trustcom/BigDataSE/ISPA, 2015 IEEE, vol. 1, pp. 742-749, 20-22 Aug. 2015. doi: 10.1109/Trustcom.2015.442

Abstract: Progger, a kernel-spaced cloud data provenance logger which provides fine-grained data activity records, was recently developed to empower cloud stakeholders to trace data life cycles within and across clouds. Progger logs have the potential to allow analysts to infer user actions and create a data-centric behaviour history in a cloud computing environment. However, the Progger logs are complex and noisy, and therefore this potential cannot currently be met. This paper proposes a statistical approach to efficiently infer the user actions from the Progger logs. Inferring user actions from logs which capture activities at kernel-level granularity is not a straightforward endeavour. This paper overcomes this challenge through an approach which shows a high level of accuracy. The key aspects of this approach are identifying the data preprocessing steps and attribute selection. We then use four standard classification models and identify the model which provides the most accurate inference on user actions. To the best of our knowledge, this is the first work of its kind. We also discuss a number of possible extensions to this work. Possible future applications include the ability to predict an anomalous security activity before it occurs.

Keywords: cloud computing; data loggers; data mining; inference mechanisms; pattern classification; security of data; Progger logs; anomalous security activity prediction; attribute selection; classification models; cloud computing environment; data activity records; data life cycle tracing; data preprocessing step identification; data-centric behaviour history; kernel-spaced cloud data provenance logger; log mining; provenance logs; user action inference; Cloud computing; Data models; Data preprocessing; Data security; Kernel; Testing; Training data; Cloud Computing; Data Provenance; Data Security; Data-centric Logger; Log Mining; Progger; Provenance Mining; User Actions (ID#: 15-8531)



Meera, G.; Geethakumari, G., "A Provenance Auditing Framework for Cloud Computing Systems," in Signal Processing, Informatics, Communication and Energy Systems (SPICES), 2015 IEEE International Conference on, pp. 1-5, 19-21 Feb. 2015. doi: 10.1109/SPICES.2015.7091427

Abstract: Cloud computing is a service oriented paradigm that aims at sharing resources among a massive number of tenants and users. The sharing facility that it provides, coupled with the sheer number of users, makes cloud environments susceptible to major security risks. Hence, security and auditing of cloud systems are of great relevance. Provenance is a meta-data history of objects which aids in verifiability, accountability and lineage tracking. Incorporating provenance into cloud systems can help in fault detection. This paper proposes a framework which aims at performing secure provenance audit of clouds across applications and multiple guest operating systems. For integrity preservation and verification, we use established cryptographic techniques. We look at it from the cloud service providers' perspective, as improving cloud security can result in better trust relations with customers.

Keywords: auditing; cloud computing; cryptography; data integrity; fault diagnosis; meta data; resource allocation; service-oriented architecture; trusted computing; accountability; cloud computing systems; cloud environments; cloud security; cloud service providers; cryptographic techniques; fault detection; integrity preservation; integrity verification; lineage tracking; metadata history; operating systems; provenance auditing framework; resource sharing; security risks; service oriented paradigm; sharing facility; trust relations; verifiability; Cloud computing; Cryptography; Digital forensics; Monitoring; Virtual machining; Auditing; Cloud computing; Provenance (ID#: 15-8532)



Donghoon Kim; Vouk, M.A., "Securing Scientific Workflows," in Software Quality, Reliability and Security - Companion (QRS-C), 2015 IEEE International Conference on, pp. 95-104, 3-5 Aug. 2015. doi: 10.1109/QRS-C.2015.25

Abstract: This paper investigates the security of the Kepler scientific workflow engine. We are especially interested in Kepler-based scientific workflows that may operate in cloud environments. We find that (1) three security properties (i.e., input validation, remote access validation, and data integrity) are essential for making Kepler-based workflows more secure, and (2) use of the Kepler provenance module may help secure Kepler-based workflows. We implemented a prototype security-enhanced Kepler engine to demonstrate the viability of using the Kepler provenance module in the provision and management of the desired security properties.

Keywords: authorisation; cloud computing; data integrity; scientific information systems; workflow management software; Kepler provenance module; Kepler scientific workflow engine security; cloud environment; data integrity; input validation; remote access validation; Cloud computing; Conferences; Databases; Engines; Security; Software quality; Uniform resource locators; Cloud; Kepler; Provenance; Scientific workflow; Vulnerability (ID#: 15-8533)



Kalaivani, K.; Suguna, C., "Efficient Botnet Detection Based on Reputation Model and Content Auditing in P2P Networks," in Intelligent Systems and Control (ISCO), 2015 IEEE 9th International Conference on, pp. 1-4, 9-10 Jan. 2015. doi: 10.1109/ISCO.2015.7282358

Abstract: A botnet is a number of computers connected through the Internet that can send malicious content such as spam and viruses to other computers without the knowledge of the owners. In a peer-to-peer (P2P) architecture, it is very difficult to identify botnets because there is no centralized control. In this paper, we use a security principle called data provenance integrity, which can verify the origin of the data. For this, the certificates of the peers can be exchanged. A reputation-based trust model is used for identifying the authenticated peer during file transmission. Here the reputation value of each peer can be calculated, and a hash table is used for efficient file searching. The proposed system can also verify the trustworthiness of transmitted data by using content auditing, in which the data is checked against a trained data set to identify malicious content.

Keywords: authorisation; computer network security; data integrity; information retrieval; invasive software; peer-to-peer computing; trusted computing; P2P networks; authenticated peer; botnet detection; content auditing; data provenance integrity; file searching; file transmission; hash table; malicious content; peer-to-peer architecture; reputation based trust model; reputation model; reputation value; security principle; spam; transmitted data trustworthiness; virus; Computational modeling; Cryptography; Measurement; Peer-to-peer computing; Privacy; Superluminescent diodes; Data provenance integrity; content auditing; reputation value; trained data set (ID#: 15-8534)



Ashwin Kumar, T.K.; Hong Liu; Thomas, J.P.; Mylavarapu, G., "Identifying Sensitive Data Items within Hadoop," in High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on, pp. 1308-1313, 24-26 Aug. 2015. doi: 10.1109/HPCC-CSS-ICESS.2015.293

Abstract: Recent growth in big data is raising security and privacy concerns. Organizations that collect data from various sources are at risk of legal or business liabilities due to security breaches and exposure of sensitive information. Only file-level access control is feasible in the current Hadoop implementation, and sensitive information can only be identified manually or from the information provided by the data owner. The problem of identifying sensitive information manually is complicated by the different types of data. When sensitive information is accessed by an unauthorized user or misused by an authorized person, privacy can be compromised. This paper is the first part of our intended access control framework for Hadoop, and it automates the otherwise manual process of identifying sensitive data items. To identify such data items, the proposed framework harnesses data context, usage patterns and data provenance. In addition, the proposed framework can also keep track of the data lineage.

Keywords: Big Data; authorisation; data handling; data privacy; parallel processing; Big-Data; Hadoop; access control framework; authorized person; business liabilities; data collection; data context; data lineage; data privacy; data provenance; data security; file-level access control; information misuse; legal liabilities; security breach; sensitive data item identification; sensitive information access; sensitive information exposure; sensitive information identification; unauthorized user; usage patterns; Access control; Context; Electromyography; Generators; Metadata; Neural networks; Sensitivity; Hadoop; data context; data lineage; data provenance; file-level access control; privacy; sensitive information; usage patterns (ID#: 15-8535)



Mayhew, Michael; Atighetchi, Michael; Adler, Aaron; Greenstadt, Rachel, "Use of Machine Learning in Big Data Analytics for Insider Threat Detection," in Military Communications Conference, MILCOM 2015 - 2015 IEEE, pp. 915-922, 26-28 Oct. 2015. doi: 10.1109/MILCOM.2015.7357562

Abstract: In current enterprise environments, information is becoming more readily accessible across a wide range of interconnected systems. However, trustworthiness of documents and actors is not explicitly measured, leaving actors unaware of how the latest security events may have impacted the trustworthiness of the information being used and the actors involved. This leads to situations where information producers give documents to consumers they should not trust and consumers use information from non-reputable documents or producers. The concepts and technologies developed as part of the Behavior-Based Access Control (BBAC) effort strive to overcome these limitations by performing accurate calculations of the trustworthiness of actors, e.g., behavior and usage patterns, as well as documents, e.g., provenance and workflow data dependencies. BBAC analyses a wide range of observables for mal-behavior, including network connections, HTTP requests, English text exchanges through emails or chat messages, and edit sequences to documents. The current prototype service strategically combines big data batch processing to train classifiers and real-time stream processing to classify observed behaviors at multiple layers. To scale up to enterprise regimes, BBAC combines clustering analysis with statistical classification in a way that maintains an adjustable number of classifiers.

Keywords: Access control; Big data; Computer security; Electronic mail; Feature extraction; Monitoring; HTTP; TCP; big data; chat; documents; email; insider threat; machine learning; support vector machine; trust; usage patterns (ID#: 15-8536)



Thuraisingham, B.; Cadenhead, T.; Kantarcioglu, M.; Khadilkar, V., "Design and Implementation of a Semantic Web-Based Inference Controller: A Summary," in Information Reuse and Integration (IRI), 2015 IEEE International Conference on, pp. 451-456, 13-15 Aug. 2015. doi: 10.1109/IRI.2015.75

Abstract: This paper provides a summary of the design and implementation of a prototype inference controller that operates over a provenance graph and protects important provenance information from unauthorized users. We use as our data model the Resource Description Framework (RDF), which supports the interoperability of multiple databases having disparate data schemas. In addition, we express policies and rules in terms of Semantic Web rules.

Keywords: inference mechanisms; semantic Web; RDF; data model; disparate data schemas; provenance graph; resource description framework; semantic Web-based inference controller; Cognition; Knowledge based systems; Process control; Query processing; Resource description framework; Security (ID#: 15-8537)
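
Editor's illustration: the idea of releasing only policy-approved parts of a provenance graph can be sketched in a few lines. This is a minimal sketch, not the authors' controller: the triples, role names, and the `POLICY` table are all hypothetical, and it only filters by predicate, whereas an inference controller must also reason about what a user could deduce from the released triples.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

# Hypothetical provenance graph written as RDF-style triples.
GRAPH = [
    Triple("report.pdf", "wasGeneratedBy", "analysis-run-7"),
    Triple("analysis-run-7", "wasAssociatedWith", "alice"),
    Triple("analysis-run-7", "used", "raw-data.csv"),
]

# Hypothetical policy: the predicates each role is allowed to see.
POLICY = {
    "auditor": {"wasGeneratedBy", "wasAssociatedWith", "used"},
    "guest": {"wasGeneratedBy", "used"},
}

def visible_triples(graph, role):
    """Return only the triples whose predicate the role may see.

    Unknown roles get an empty allow-list, so nothing is released.
    """
    allowed = POLICY.get(role, set())
    return [t for t in graph if t.predicate in allowed]
```

Here a `guest` would see which activity produced the report and which data it used, but not who ran the activity.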



Kun Yang; Forte, D.; Tehranipoor, M., "An RFID-based technology for electronic component and system Counterfeit detection and Traceability," in Technologies for Homeland Security (HST), 2015 IEEE International Symposium on, pp. 1-6, 14-16 April 2015. doi: 10.1109/THS.2015.7225279

Abstract: The vulnerabilities in today's supply chain have raised serious concerns about the security and trustworthiness of electronic components and systems. Testing for device provenance, detection of counterfeit integrated circuits/systems, and traceability are challenging issues to address. In this paper, we develop a novel RFID-based system suitable for electronic component and system Counterfeit detection and System Traceability called CST. CST is composed of different types of on-chip sensors and in-system structures that provide the information needed to detect multiple counterfeit IC types (recycled, cloned, etc.), verify the authenticity of the system with some degree of confidence, and track/identify boards. Central to CST is an RFID tag employed as storage and a channel to read the information from different types of chips on the printed circuit board (PCB) in both power-off and power-on scenarios. Simulations and experimental results using Spartan 3E FPGAs demonstrate the effectiveness of this system. The efficiency of the radio frequency (RF) communication has also been verified via a PCB prototype with a printed slot antenna.

Keywords: counterfeit goods; field programmable gate arrays; microstrip antennas; printed circuits; production engineering computing; radiofrequency identification; supply chains; CST; PCB prototype; RF; RFID tag; RFID-based system; RFID-based technology; Spartan 3E FPGA; counterfeit integrated circuits; device provenance; electronic component; In-system structures; multiple counterfeit IC types; on-chip sensors; printed circuit board; printed slot antenna; radio frequency communication; supply chain; system counterfeit detection; Electronic components; Field programmable gate arrays; Radiation detectors; Radio frequency; Radiofrequency identification; Sensor systems (ID#: 15-8538)



Jilcott, S., "Scalable Malware Forensics Using Phylogenetic Analysis," in Technologies for Homeland Security (HST), 2015 IEEE International Symposium on, pp. 1-6, 14-16 April 2015. doi: 10.1109/THS.2015.7225311

Abstract: Malware forensics analysts confront one of our biggest homeland security challenges - a continuing flood of new malware variants released by adaptable adversaries seeking new targets in cyberspace, exploiting new technologies, and bypassing existing security mechanisms. Reverse engineering new samples, understanding their capabilities, and ascertaining provenance is time-intensive and requires considerable human expertise. We present DECODE, a prototype malware forensics analysis system developed under DARPA's Cyber Genome program. DECODE increases the actionable forensics derivable from large repositories of collected malware by quickly identifying a new malware sample as a variant of other malware samples, without relying on pre-existing anti-virus signatures. DECODE also accelerates reverse engineering efforts by quickly identifying parts of the malware that have already been seen in other samples and characterizing the new and different capabilities. DECODE can also reconstruct the evolution of malware variants over time. DECODE applies phylogenetic analysis to provide these advantages. Phylogenetic analysis is the study of similarities and differences in program structure to find relationships within groups of software programs, providing insights about new malware variants not available from signature-based malware detection.

Keywords: digital forensics; invasive software; reverse engineering; statistical analysis; DECODE; malware forensics; phylogenetic analysis; program structure; reverse engineering; Acceleration; Irrigation; Phylogeny; Pipelines (ID#: 15-8539)
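
Editor's illustration: the abstract describes relating a new sample to known malware by structural similarity rather than signatures. The sketch below shows one simple way such a comparison could work, assuming each program is reduced to a sequence of opcode names and compared by Jaccard similarity over n-grams. The abstract does not say which features or metrics DECODE uses, so the function names, the n-gram choice, and the sample data are all hypothetical.

```python
def ngrams(seq, n=2):
    """Return the set of length-n windows over a sequence of opcodes."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def closest_relative(sample, known, n=2):
    """Find the known sample whose n-gram set is most similar to `sample`.

    `known` maps a (hypothetical) family name to its opcode sequence.
    Returns a (name, similarity) pair.
    """
    grams = ngrams(sample, n)
    scores = {name: jaccard(grams, ngrams(ops, n)) for name, ops in known.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Computing such pairwise similarities across a whole repository gives a distance matrix, which is the usual input to tree-building methods when reconstructing how variants evolved.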



Articles listed on these pages have been found on publicly available internet pages and are cited with links to those pages. Some of the information included herein has been reprinted with permission from the authors or data repositories. Direct any requests via Email to for removal of the links or modifications