# Biblio

The use of Knuth's Rule and Bayesian Blocks piecewise-constant models for the characterization of RFID traffic has already been proposed. This study evaluates the application of those two modeling techniques to various RFID traffic patterns. The data sets used in this study consist of time series of binned RFID command counts. More specifically, we compare the shapes of several empirical plots of raw data sets, obtained from experimental RFID readings, against the piecewise-constant graphs produced by the two modeling algorithms. One issue limiting the applicability of modeling techniques to RFID traffic is the large number of different RFID applications available. We consider this the main motivation for this study. The general expectation is that RFID traffic traces from different applications would be sequences with different histogram shapes. Therefore, no modeling technique can be considered universal for modeling the traffic from multiple RFID applications without first evaluating its performance on various traffic patterns. We postulate that differences in traffic patterns are present if the histograms of two different sets of RFID traces form visually different plot shapes.
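
As a rough illustration of one of the two techniques named above, the following sketch selects a number of equal-width histogram bins by maximizing the log-posterior commonly given for Knuth's rule. It is not the study's code: the function names, the `max_bins` limit, and the synthetic two-mode sample (a stand-in for real RFID command counts) are all assumptions made for the example.

```python
import math
import random

def knuth_log_posterior(data, m):
    """Relative log-posterior of an m-bin equal-width histogram (Knuth's rule).

    Assumes the data contain at least two distinct values.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / m
    counts = [0] * m
    for x in data:
        # Clamp the index so the maximum value lands in the last bin.
        k = min(int((x - lo) / width), m - 1)
        counts[k] += 1
    return (n * math.log(m)
            + math.lgamma(m / 2)
            - m * math.lgamma(0.5)
            - math.lgamma(n + m / 2)
            + sum(math.lgamma(c + 0.5) for c in counts))

def knuth_bins(data, max_bins=50):
    """Return the bin count in 1..max_bins with the highest log-posterior."""
    return max(range(1, max_bins + 1),
               key=lambda m: knuth_log_posterior(data, m))

# Synthetic, hypothetical sample with two modes (not real RFID data).
random.seed(0)
samples = ([random.gauss(20, 3) for _ in range(300)]
           + [random.gauss(60, 5) for _ in range(300)])
best_m = knuth_bins(samples)
```

An exhaustive scan over candidate bin counts is used here for clarity; a numerical optimizer could replace it for large `max_bins`.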

In this paper we propose a new algorithm to detect Advanced Persistent Threats (APTs) that relies on a graph model of HTTP traffic. We also implement a complete detection system with a web interface that allows the data to be analyzed interactively. We perform a complete parameter study and experimental evaluation using data collected on a real network. The results show that the performance of our system is comparable to that of currently available antivirus products, although antivirus products use signatures to detect known malware while our algorithm relies solely on behavior analysis to detect new, undocumented attacks.

A better understanding of mobile applications' behaviors would lead to better malware detection and classification and to better app recommendations for users. In this work, we design a framework, AppDNA, that automatically generates a compact representation of each app to comprehensively profile its behaviors. The behavioral difference between two apps can be measured by the distance between their representations. As a result, the versatile representation can be generated once for each app and then used for a wide variety of objectives, including malware detection, app categorization, and plagiarism detection. Based on a systematic and deep understanding of an app's behavior, we propose to perform app profiling based on function-call graphs. We carefully design a graph-encoding method that converts a typically extremely large call graph into a 64-dimensional fixed-size vector to achieve robust app profiling. Our extensive evaluations based on 86,332 benign and malicious apps demonstrate that our system performs app profiling (and thus malware detection, classification, and app recommendation) with high accuracy and extremely low computation cost: it classifies 4,024 apps as benign or malware in around 5.06 seconds with an accuracy of about 93.07%; it classifies the families of 570 malware samples (21 families in total) in around 0.83 seconds with an accuracy of 82.3%; and it classifies the functionality of 9,730 apps with an accuracy of 33.3% for 7 categories and 88.1% for 2 categories.
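
The idea of comparing apps by the distance between fixed-size vectors can be sketched as follows. The abstract does not say which distance AppDNA uses, so Euclidean distance is an assumption here, as are the function names and the toy 64-dimensional vectors (real vectors would come from the graph encoder).

```python
import math

def behavior_distance(u, v):
    """Euclidean distance between two equal-length behavior vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_app(query, labeled):
    """Return the (name, vector) pair in `labeled` closest to `query`."""
    return min(labeled.items(),
               key=lambda item: behavior_distance(query, item[1]))

# Hypothetical toy vectors, chosen only to make the example easy to follow.
known = {
    "benign_sample": [0.0] * 64,
    "malware_sample": [1.0] * 64,
}
query = [0.9] * 64
closest_name, _ = nearest_app(query, known)
```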

In this paper we present a new approach, named DLGraph, for malware detection using deep learning and graph embedding. DLGraph employs two stacked denoising autoencoders (SDAs) for representation learning, taking into consideration computer programs' function-call graphs and Windows application programming interface (API) calls. Given a program, we first use a graph embedding technique that maps the program's function-call graph to a vector in a low-dimensional feature space. One SDA in our deep learning model is used to learn a latent representation of the embedded vector of the function-call graph. The other SDA in our model is used to learn a latent representation of the given program's Windows API calls. The two learned latent representations are then merged to form a combined feature vector. Finally, we use softmax regression to classify the combined feature vector for predicting whether the given program is malware or not. Experimental results based on different datasets demonstrate the effectiveness of the proposed approach and its superiority over a related method.

Smartphones have evolved over the years from simple communication devices into fully functional portable computers that, despite comparatively less computational power, host multiple applications. With the smartphone revolution, the value of personal data has increased. As technological complexity increases, so do the vulnerabilities in the system. Smartphones are the latest target for attacks. Android, being an open-source platform and the most widely used smartphone OS, draws the attention of many malware writers seeking to exploit its vulnerabilities. Attackers try to take advantage of these vulnerabilities to fool users and misuse their data. Malware has come a long way from simple worms to sophisticated DDoS attacks using botnets; the latest trends in computer malware lean toward distribution, in order to evade the many antivirus apps developed to counter generic viruses and Trojans. The recent trend on Android, however, is for a combination of applications to act as malware. The applications are benign individually, but when grouped they may result in malicious activity. This paper proposes a new category of distributed malware for Android, shows how it can be used to evade current security mechanisms, and shows how it can be detected with the help of a graph matching algorithm.

The latent behavior of an information system that can exhibit extreme events, such as system faults or cyber-attacks, is complex. Recently, the invariant network has been shown to be a powerful way of characterizing complex system behaviors. The structure and evolution of the invariant network, in particular its vanishing correlations, can shed light on identifying causal anomalies and performing system diagnosis. However, due to the dynamic and complex nature of real-world information systems, learning a reliable invariant network in a new environment often requires continuously collecting and analyzing system surveillance data for several weeks or even months. Although the invariant networks learned from old environments share some entities and entity relationships, these networks cannot be directly borrowed for the new environment due to the domain variety problem. To avoid the prohibitively time- and resource-consuming network building process, we propose TINET, a knowledge-transfer-based model for accelerating invariant network construction. In particular, we first propose an entity estimation model that estimates the probability that each source-domain entity can be included in the final invariant network of the target domain. Then, we propose a dependency construction model that constructs unbiased dependency relationships by solving a two-constraint optimization problem. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness and efficiency of TINET. We also apply TINET to a real enterprise security system for intrusion detection. TINET achieves superior detection performance, with a lead time of at least 20 days and more than 75% accuracy.

Community detection in complex networks is a fundamental problem that attracts much attention across various disciplines. Previous studies have mostly focused on external connections between nodes (i.e., topology structure) in the network while largely ignoring the internal intricacies (i.e., local behavior) of each node. A pair of nodes without any interaction can still share similar internal behaviors. For example, in an enterprise information network, compromised computers controlled by the same intruder often demonstrate similar abnormal behaviors even if they do not connect with each other. In this paper, we study the problem of community detection in enterprise information networks, where large-scale internal events and external events coexist on each host. The discovered host communities, which capture behavioral affinity, can benefit many comparative analysis tasks such as host anomaly assessment. In particular, we propose a novel community detection framework to identify behavior-based host communities in enterprise information networks, based purely on large-scale heterogeneous event data. We further propose an efficient method for assessing a host's anomaly level by leveraging the detected host communities. Experimental results on enterprise networks demonstrate the effectiveness of our model.

Android applications are usually obfuscated before release, making it difficult to analyze them for malware presence or intellectual property violations. Obfuscators might hide the true intent of code by renaming variables and/or modifying program structures. It is challenging to search for executables relevant to an obfuscated application so that developers can analyze it efficiently. Prior approaches to obfuscation-resilient search have relied on certain structural parts of apps remaining as landmarks, untouched by obfuscation. For instance, some prior approaches have assumed that the structural relationships between identifiers are not broken by obfuscators; others have assumed that control flow graphs maintain their structures. Both approaches can be easily defeated by a motivated obfuscator. We present a new approach, MACNETO, to search for programs relevant to obfuscated executables by leveraging deep learning and principal components on instructions. MACNETO makes few assumptions about the kinds of modifications that an obfuscator might perform. We show that it has high search precision for executables obfuscated by a state-of-the-art obfuscator that changes control flow. Further, we demonstrate the potential of MACNETO to help developers understand executables: MACNETO infers keywords for obfuscated executables from relevant unobfuscated programs.

Given graphs with millions or billions of vertices and edges, how can we efficiently make inferences based on partial knowledge? Loopy Belief Propagation (LBP) is a graph inference algorithm widely used in various applications including social network analysis, malware detection, recommendation, and image restoration. The algorithm calculates approximate marginal probabilities of vertices in a graph with a running time linear in the number of edges. However, for real-world graphs with millions or billions of vertices and edges, this cost overwhelms the computing power of a single machine. Moreover, such large-scale graphs do not fit into the memory of a single machine. Although several distributed LBP methods have been proposed, previous works do not consider the properties of real-world graphs, especially the effect of a power-law degree distribution on LBP. Therefore, our work focuses on developing a fast and scalable LBP for such large real-world graphs in distributed environments. In this paper, we propose DLBP, a Distributed Loopy Belief Propagation algorithm that efficiently computes LBP in a distributed manner across multiple machines. By setting the correct convergence criterion and carefully scheduling the computations, DLBP provides up to a 10.7x speedup compared to standard distributed LBP. We show that DLBP demonstrates near-linear scalability with respect to the number of machines as well as the number of edges.
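
For readers unfamiliar with LBP, the following is a minimal single-machine sketch of sum-product message passing on a pairwise model. It is not DLBP and includes none of its scheduling or distribution logic. The function name, the synchronous update schedule, the fixed iteration count, and the toy three-node chain are all assumptions made for illustration.

```python
def loopy_bp(nodes, edges, prior, psi, iters=20):
    """Approximate per-node marginals by synchronous sum-product updates.

    prior[v] is a list of K node potentials; psi[a][b] is a symmetric
    K x K edge potential shared by every edge (a simplifying assumption).
    """
    k = len(psi)
    nbrs = {v: [] for v in nodes}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    # msg[(u, v)][x]: message from u to v about v being in state x.
    msg = {(u, v): [1.0 / k] * k for u in nodes for v in nbrs[u]}
    for _ in range(iters):
        new = {}
        for (u, v) in msg:
            out = []
            for xv in range(k):
                total = 0.0
                for xu in range(k):
                    term = prior[u][xu] * psi[xu][xv]
                    # Multiply in messages from all neighbors of u except v.
                    for w in nbrs[u]:
                        if w != v:
                            term *= msg[(w, u)][xu]
                    total += term
                out.append(total)
            z = sum(out)
            new[(u, v)] = [o / z for o in out]
        msg = new
    beliefs = {}
    for v in nodes:
        b = list(prior[v])
        for w in nbrs[v]:
            for x in range(k):
                b[x] *= msg[(w, v)][x]
        z = sum(b)
        beliefs[v] = [p / z for p in b]
    return beliefs

# Toy chain a - b - c with partial knowledge about node a only.
beliefs = loopy_bp(
    nodes=["a", "b", "c"],
    edges=[("a", "b"), ("b", "c")],
    prior={"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.5, 0.5]},
    psi=[[0.8, 0.2], [0.2, 0.8]],
)
```

Because the toy graph is a tree, the beliefs here coincide with the exact marginals; on graphs with cycles the same updates yield only approximations.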

In this paper, we propose a graph-based algorithmic technique for malware detection that uses System-call Dependency Graphs (ScDG) obtained through taint analysis traces. We group system calls by functionality into system-call groups and use this grouping to merge disjoint vertices of ScDG graphs, transforming them into Group Relation Graphs (GrG); because GrG graphs represent the malware's behavior, they are more resilient to probable mutations of its structure. More precisely, we extend the use of GrG graphs by mapping their vertices onto the plane, using the degrees and vertex weights of a specific underlying graph of the GrG graph, so as to compute domination relations. Furthermore, we investigate how the activity of each system-call group can be used to distinguish graph representations of malware from those of benign software. The domination relations among the vertices of GrG graphs result in a new graph representation that we call the Coverage Graph of the GrG graph. Finally, we evaluate the potential of our detection model using graph similarity between Coverage Graphs of known malicious and benign software samples of various types.

The popularity of Android, not only in handsets but also in IoT devices, makes it a very attractive target for malware threats, which are expanding at a significant rate. State-of-the-art malware mitigation solutions mainly focus on detecting malicious Android apps, using dynamic and static analysis features to separate malicious apps from benign ones. Nevertheless, there is little coverage of the Internet/network dimension of malicious Android apps. In this paper, we present ToGather, an automatic investigation framework that takes Android malware samples as input and produces insights about the underlying malicious cyber infrastructures. ToGather leverages state-of-the-art graph theory techniques to generate actionable, relevant, and granular intelligence to mitigate the threat effects induced by the malicious Internet activity of Android malware apps. We evaluate ToGather on a large dataset of real malware samples from various Android families, and the obtained results are both interesting and promising.

A mobile ad hoc network (MANET) is a set of nodes that communicate cooperatively over the wireless medium without any central administration. Due to its inherently open nature and lack of infrastructure, security is a more complicated issue than in other networks. That is, these networks are vulnerable to a wide range of attacks at different network layers. At the network level, malicious nodes can perform several attacks, ranging from passive eavesdropping to active interference. The Wormhole is an example of a severe attack that has attracted much attention recently. It involves redirecting traffic between two end nodes through a Wormhole tunnel, and it manipulates the routing algorithm to give the illusion that nodes located far from each other are neighbors. To handle this issue, we propose a novel detection model that allows a node to check whether a presumed shortest path contains a Wormhole tunnel. Our approach is based on the fact that a Wormhole tunnel significantly reduces the length of the paths passing through it.
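
The observation that a tunnel shortens the paths passing through it can be illustrated with a small hop-count comparison. This is not the paper's detection model, which the abstract does not describe in detail; the helper names and the eight-node line topology are assumptions chosen only to make the effect visible.

```python
from collections import deque

def hop_count(adj, src, dst):
    """Breadth-first search; returns the hop count of a shortest path or None."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

def with_link(adj, a, b):
    """Return a copy of the topology with an extra bidirectional link a-b."""
    new = {u: set(vs) for u, vs in adj.items()}
    new.setdefault(a, set()).add(b)
    new.setdefault(b, set()).add(a)
    return new

# Hypothetical line topology: 0 - 1 - 2 - 3 - 4 - 5 - 6 - 7
line = {i: set() for i in range(8)}
for i in range(7):
    line[i].add(i + 1)
    line[i + 1].add(i)

honest_hops = hop_count(line, 0, 7)
# A tunnel makes the distant nodes 1 and 6 appear to be neighbors.
tunneled_hops = hop_count(with_link(line, 1, 6), 0, 7)
```

In this toy topology the advertised route drops from seven hops to three once the tunnel is present, which is the kind of anomaly a path-length check could flag.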

It is technically challenging to conduct a security analysis of a dynamic network, due to the lack of methods and techniques to capture different security postures as the network changes. Graphical Security Models (e.g., Attack Graphs) are used to assess the security of network systems, but they typically capture only a snapshot of a network state when carrying out the security analysis. To address this issue, we propose a new Graphical Security Model named Time-independent Hierarchical Attack Representation Model (Ti-HARM) that captures the security of multiple network states by taking into account the time duration of each network state and the visibility of network components (e.g., hosts, edges) in each state. By incorporating the changes, we can analyse the security of dynamic networks while taking into account all the threats appearing in different network states. Our experimental results show that Ti-HARM can effectively capture and assess the security of dynamic networks, which was not possible using existing graphical security models.

In this paper, we consider ways of organizing group authentication, as well as the features of constructing isogenies of elliptic curves. The work includes a study of isogeny graphs and their application in post-quantum systems. A hierarchical group authentication scheme has been developed using transformations based on the search for isogenies of elliptic curves.

Detecting fake accounts (sybils) in online social networks (OSNs) is vital to protect OSN operators and their users from various malicious activities. Typical graph-based sybil detection (a mainstream methodology) assumes that sybils can make friends with only a limited (or small) number of honest users. However, recent evidence shows that this assumption does not hold in real-world OSNs, leading to low detection accuracy. To address this challenge, we explore users' activities to assist sybil detection. The intuition is that honest users are much more selective in choosing whom to interact with than whom to befriend. We first develop the social and activity network (SAN), a two-layer hyper-graph that unifies users' friendships and their activities, to fully utilize users' activities. We also propose a more practical sybil attack model, where sybils can launch both friendship attacks and activity attacks. We then design Sybil SAN to detect sybils by coupling three random-walk-based algorithms on the SAN, and prove the convergence of Sybil SAN. We develop an efficient iterative algorithm to compute the detection metric for Sybil SAN, and derive the number of rounds needed to guarantee convergence. We use matrix perturbation theory to bound the detection error when sybils launch many friendship attacks and activity attacks. Extensive experiments on both synthetic and real-world datasets show that Sybil SAN is highly robust against sybil attacks and can detect sybils accurately under practical scenarios where current state-of-the-art sybil defenses have low accuracy.

Optimal attack path analysis is one of the active research topics in network security. Many methods are available to calculate an optimal attack path, such as the Q-learning algorithm and heuristic algorithms, but most of them have shortcomings. Some methods can lead to the problem of path loss, and some produce results that are not comprehensive. This article proposes an improved Monte Carlo Graph Search algorithm (IMCGS) to calculate optimal attack paths in a target network. IMCGS avoids the problem of path loss and obtains comprehensive results quickly. IMCGS consists of two steps, selection and backpropagation, which together are used to calculate optimal attack paths. A weight vector containing the priority, host connection number, and CVSS value is proposed for every host in an attack path. This vector is used to calculate the evaluation value, the total CVSS value, and the average CVSS value of a path in the target network. Results for a sample test network are presented to demonstrate the capability of the proposed algorithm to generate optimal attack paths in a single run. The results obtained by IMCGS show good performance and are compared with those of the Ant Colony Optimization algorithm (ACO) and the k-zero attack graph.
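
The per-host weight vector and the two CVSS aggregates mentioned above can be sketched as below. The abstract does not give the formula for the evaluation value, so it is deliberately left out; the class name, field names, and sample scores are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HostWeight:
    """Hypothetical per-host weight vector: priority, connections, CVSS."""
    priority: int
    connections: int
    cvss: float

def path_cvss(path):
    """Return (total, average) CVSS over the hosts of one attack path."""
    if not path:
        raise ValueError("an attack path needs at least one host")
    total = sum(host.cvss for host in path)
    return total, total / len(path)

# Made-up three-host path; the scores are not taken from the paper.
example_path = [
    HostWeight(priority=1, connections=3, cvss=7.5),
    HostWeight(priority=2, connections=2, cvss=9.8),
    HostWeight(priority=3, connections=1, cvss=5.0),
]
total_cvss, average_cvss = path_cvss(example_path)
```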

To evaluate network security risks and implement effective defenses in industrial control systems, a risk assessment method for industrial control systems based on attack graphs is proposed. The concept of network security elements is used to translate network attacks into network state migration problems and to build an attack graph model of the industrial control network. To address the subjectivity of current evaluation based on expert experience, an atomic attack probability assignment method and the CVSS evaluation system are introduced to evaluate the security status of the industrial control system. Finally, a case analysis is performed with the centralized control system of a thermal power plant as the experimental setting. The experimental results show that the method can comprehensively analyze the potential safety hazards in the industrial control system and provide a basis for safety management personnel to take effective defense measures.