Malware Classification with Deep Convolutional Neural Networks

Title: Malware Classification with Deep Convolutional Neural Networks
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Kalash, M., Rochan, M., Mohammed, N., Bruce, N. D. B., Wang, Y., Iqbal, F.
Conference Name: 2018 9th IFIP International Conference on New Technologies, Mobility and Security (NTMS)
ISBN Number: 979-11-88428-01-4
Keywords: challenging malware classification datasets, CNN, Computer architecture, convolution, convolutional neural networks, deep convolutional neural networks, Deep Learning, deep learning approach, deep learning framework, feedforward neural nets, Gray-scale, grayscale images, Human Behavior, image classification, invasive software, learning (artificial intelligence), Learning systems, machine learning, machine learning approaches, Malimg malware, Malware, malware binaries, malware classification, Metrics, Microsoft malware, privacy, pubcrawl, resilience, Resiliency, Support vector machines

In this paper, we propose a deep learning framework for malware classification. The volume of malware has increased sharply in recent years, posing a serious security threat to financial institutions, businesses, and individuals. To combat this proliferation, new strategies are needed to quickly identify and classify malware samples so that their behavior can be analyzed. Machine learning approaches are becoming popular for classifying malware; however, most existing machine learning methods for malware classification use shallow learning algorithms (e.g., SVM). Recently, Convolutional Neural Networks (CNNs), a deep learning approach, have shown superior performance compared to traditional learning algorithms, especially in tasks such as image classification. Motivated by this success, we propose a CNN-based architecture to classify malware samples. We convert malware binaries to grayscale images and then train a CNN for classification. Experiments on two challenging malware classification datasets, Malimg and Microsoft malware, demonstrate that our method outperforms the state of the art. The proposed method achieves 98.52% and 99.97% accuracy on the Malimg and Microsoft datasets, respectively.
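
The binary-to-grayscale conversion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the image width, padding rule, or any resizing step, so the function name `bytes_to_grayscale_rows`, the default width of 256, and the zero padding of the last row are assumptions made for illustration only, not the authors' procedure. Each byte (0-255) is treated as one 8-bit pixel intensity.

```python
from typing import List


def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> List[List[int]]:
    """Map each byte to one grayscale pixel and group the pixels into rows.

    Assumption: the image width is fixed and the final row is zero-padded.
    The abstract does not specify either choice.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Number of zero bytes needed to complete the last row.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    # Iterating over a bytes slice yields integers in the range 0-255.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Synthetic stand-in for a binary: 770 bytes, not a real malware sample.
sample = bytes(range(256)) * 3 + b"MZ"
image = bytes_to_grayscale_rows(sample, width=256)
print(len(image), len(image[0]))  # prints: 4 256
```

The resulting rows can be handed to an array or image library to produce the 2-D input a CNN expects; how the images are sized and normalized before training is not stated in the abstract.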

Citation Key: kalash_malware_2018