Data Analytics-Enabled Intrusion Detection: Evaluations of ToN_IoT Linux Datasets

Title: Data Analytics-Enabled Intrusion Detection: Evaluations of ToN_IoT Linux Datasets
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Moustafa, Nour; Ahmed, Mohiuddin; Ahmed, Sherif
Conference Name: 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)
Date Published: December
Keywords: artificial intelligence, composability, Computer architecture, Cyber Attacks, data privacy, dataset, Human Behavior, Internet of Things, Intrusion detection, Linux, Linux Operating System Security, Linux Systems, Metrics, Operating systems, privacy, pubcrawl, Resiliency, security, virtualization privacy
Abstract: With the widespread adoption of Artificial Intelligence (AI)-enabled security applications, there is a need to collect heterogeneous and scalable data sources for effectively evaluating the performance of security applications. This paper describes new datasets, named the ToN_IoT datasets, that include distributed data sources collected from Telemetry datasets of Internet of Things (IoT) services, Operating systems datasets of Windows and Linux, and datasets of Network traffic. The paper describes the new testbed architecture used to collect Linux datasets from audit traces of hard disk, memory, and process. The architecture was designed in three distributed layers: edge, fog, and cloud. The edge layer comprises IoT and network systems, the fog layer includes virtual machines and gateways, and the cloud layer includes data analytics and visualization tools connected with the other two layers. The layers were programmatically controlled through Software-Defined Networking (SDN) and Network Function Virtualization (NFV) using the VMware NSX and vCloud NFV platforms. The Linux ToN_IoT datasets can be used to train and validate various new federated and distributed AI-enabled security solutions such as intrusion detection, threat intelligence, privacy preservation, and digital forensics. Various data analytics and machine learning methods are employed to determine the fidelity of the datasets by examining feature engineering, statistics of legitimate and security events, and the reliability of security events. The datasets can be publicly accessed from [1].
Citation Key: moustafa_data_2020