Visible to the public Anomaly Detection in Network Traffic Using Dynamic Graph Mining with a Sparse Autoencoder

TitleAnomaly Detection in Network Traffic Using Dynamic Graph Mining with a Sparse Autoencoder
Publication TypeConference Paper
Year of Publication2019
AuthorsJia, Guanbo, Miller, Paul, Hong, Xin, Kalutarage, Harsha, Ban, Tao
Conference Name2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE)
Date Published31 October 2019
ISBN Number978-1-7281-2777-4
Keywordsanomaly detection algorithm, autoencoder hyper-parameters, Bipartite graph, composability, compositionality, Computer crime, computer network security, contiguous time intervals, cyber physical systems, data mining, DDoS Attacks, decomposition, dynamic bipartite graph increments, dynamic graph, dynamic graph mining, dynamic network traffic, Dynamic Networks and Security, ecommerce websites, error distribution, GAAD, gaussian distribution, graph theory, Heuristic algorithms, highly structured adjacency matrix structures, increasing sparseness, learning (artificial intelligence), Matrix decomposition, Metrics, network based attacks, network emulator, Network security, network traffic, online learning algorithm, original adjacency matrix, outlier error values, pubcrawl, representative ecommerce traffic, Resiliency, resultant adjacency matrix, serious economic consequences, singular value decomposition, sparse autoencoder, Sparse matrices, telecommunication traffic, time 225.0 min

Network based attacks on ecommerce websites can have serious economic consequences. Hence, anomaly detection in dynamic network traffic has become an increasingly important research topic in recent years. This paper proposes a novel dynamic Graph and sparse Autoencoder based Anomaly Detection algorithm named GAAD. In GAAD, the network traffic over contiguous time intervals is first modelled as a series of dynamic bipartite graph increments. One mode projection is performed on each bipartite graph increment and the adjacency matrix derived. Columns of the resultant adjacency matrix are then used to train a sparse autoencoder to reconstruct it. The sum of squared errors between the reconstructed approximation and original adjacency matrix is then calculated. An online learning algorithm is then used to estimate a Gaussian distribution that models the error distribution. Outlier error values are deemed to represent anomalous traffic flows corresponding to possible attacks. In the experiment, a network emulator was used to generate representative ecommerce traffic flows over a time period of 225 minutes with five attacks injected, including SYN scans, host emulation and DDoS attacks. ROC curves were generated to investigate the influence of the autoencoder hyper-parameters. It was found that increasing the number of hidden nodes and their activation level, and increasing sparseness resulted in improved performance. Analysis showed that the sparse autoencoder was unable to encode the highly structured adjacency matrix structures associated with attacks, hence they were detected as anomalies. In contrast, SVD and variants, such as the compact matrix decomposition, were found to accurately encode the attack matrices, hence they went undetected.

Citation Keyjia_anomaly_2019