Visible to the public Biblio

Found 387 results

Filters: Keyword is Malware  [Clear All Filters]
2021-10-12
Muller, Tim, Wang, Dongxia, Sun, Jun.  2020.  Provably Robust Decisions based on Potentially Malicious Sources of Information. 2020 IEEE 33rd Computer Security Foundations Symposium (CSF). :411–424.
Sometimes a security-critical decision must be made using information provided by peers. Think of routing messages, user reports, sensor data, navigational information, blockchain updates. Attackers manifest as peers that strategically report fake information. Trust models use the provided information, and attempt to suggest the correct decision. A model that appears accurate by empirical evaluation of attacks may still be susceptible to manipulation. For a security-critical decision, it is important to take the entire attack space into account. Therefore, we define the property of robustness: the probability of deciding correctly, regardless of what information attackers provide. We introduce the notion of realisations of honesty, which allow us to bypass reasoning about specific feedback. We present two schemes that are optimally robust under the right assumptions. The “majority-rule” principle is a special case of the other scheme which is more general, named “most plausible realisations”.
2021-10-04
Alsoghyer, Samah, Almomani, Iman.  2020.  On the Effectiveness of Application Permissions for Android Ransomware Detection. 2020 6th Conference on Data Science and Machine Learning Applications (CDMA). :94–99.
Ransomware attack is posting a serious threat against Android devices and stored data that could be locked or/and encrypted by such attack. Existing solutions attempt to detect and prevent such attack by studying different features and applying various analysis mechanisms including static, dynamic or both. In this paper, recent ransomware detection solutions were investigated and compared. Moreover, a deep analysis of android permissions was conducted to identify significant android permissions that can discriminate ransomware with high accuracy before harming users' devices. Consequently, based on the outcome of this analysis, a permissions-based ransomware detection system is proposed. Different classifiers were tested to build the prediction model of this detection system. After the evaluation of the ransomware detection service, the results revealed high detection rate that reached 96.9%. Additionally, the newly permission-based android dataset constructed in this research will be made available to researchers and developers for future work.
2021-09-30
Wang, Guoqing, Zhuang, Lei, Liu, Taotao, Li, Shuxia, Yang, Sijin, Lan, Julong.  2020.  Formal Analysis and Verification of Industrial Control System Security via Timed Automata. 2020 International Conference on Internet of Things and Intelligent Applications (ITIA). :1–5.
The industrial Internet of Things (IIoT) can facilitate industrial upgrading, intelligent manufacturing, and lean production. Industrial control system (ICS) is a vital support mechanism for many key infrastructures in the IIoT. However, natural defects in the ICS network security mechanism and the susceptibility of the programmable logic controller (PLC) program to malicious attack pose a threat to the safety of national infrastructure equipment. To improve the security of the underlying equipment in ICS, a model checking method based on timed automata is proposed in this work, which can effectively model the control process and accurately simulate the system state when incorporating time factors. Formal analysis of the ICS and PLC is then conducted to formulate malware detection rules which can constrain the normal behavior of the system. The model checking tool UPPAAL is then used to verify the properties by detecting whether there is an exception in the system and determine the behavior of malware through counter-examples. The chemical reaction control system in Tennessee-Eastman process is taken as an example to carry out modeling, characterization, and verification, and can effectively detect multiple patterns of malware and propose relevant security policy recommendations.
2021-09-21
Lee, Yen-Ting, Ban, Tao, Wan, Tzu-Ling, Cheng, Shin-Ming, Isawa, Ryoichi, Takahashi, Takeshi, Inoue, Daisuke.  2020.  Cross Platform IoT-Malware Family Classification Based on Printable Strings. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :775–784.
In this era of rapid network development, Internet of Things (IoT) security considerations receive a lot of attention from both the research and commercial sectors. With limited computation resource, unfriendly interface, and poor software implementation, legacy IoT devices are vulnerable to many infamous mal ware attacks. Moreover, the heterogeneity of IoT platforms and the diversity of IoT malware make the detection and classification of IoT malware even more challenging. In this paper, we propose to use printable strings as an easy-to-get but effective cross-platform feature to identify IoT malware on different IoT platforms. The discriminating capability of these strings are verified using a set of machine learning algorithms on malware family classification across different platforms. The proposed scheme shows a 99% accuracy on a large scale IoT malware dataset consisted of 120K executable fils in executable and linkable format when the training and test are done on the same platform. Meanwhile, it also achieves a 96% accuracy when training is carried out on a few popular IoT platforms but test is done on different platforms. Efficient malware prevention and mitigation solutions can be enabled based on the proposed method to prevent and mitigate IoT malware damages across different platforms.
Khan, Mamoona, Baig, Duaa, Khan, Usman Shahid, Karim, Ahmad.  2020.  Malware Classification Framework Using Convolutional Neural Network. 2020 International Conference on Cyber Warfare and Security (ICCWS). :1–7.
Cyber-security is facing a huge threat from malware and malware mass production due to its mutation factors. Classification of malware by their features is necessary for the security of information technology (IT) society. To provide security from malware, deep neural networks (DNN) can offer a superior solution for the detection and categorization of malware samples by using image classification techniques. To strengthen our ideology of malware classification through image recognition, we have experimented by comparing two perspectives of malware classification. The first perspective implements dense neural networks on binary files and the other applies deep layered convolutional neural network on malware images. The proposed model is trained to a set of malware samples, which are further distributed into 9 different families. The dataset of malware samples which is used in this paper is provided by Microsoft for Microsoft Malware Classification Challenge in 2015. The proposed model shows an accuracy of 97.80% on the provided dataset. By using the proposed model optimum classifications results can be attained.
Sartoli, Sara, Wei, Yong, Hampton, Shane.  2020.  Malware Classification Using Recurrence Plots and Deep Neural Network. 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA). :901–906.
In this paper, we introduce a method for visualizing and classifying malware binaries. A malware binary consists of a series of data points of compiled machine codes that represent programming components. The occurrence and recurrence behavior of these components is determined by the common tasks malware samples in a particular family carry out. Thus, we view a malware binary as a series of emissions generated by an underlying stochastic process and use recurrence plots to transform malware binaries into two-dimensional texture images. We observe that recurrence plot-based malware images have significant visual similarities within the same family and are different from samples in other families. We apply deep CNN classifiers to classify malware samples. The proposed approach does not require creating malware signature or manual feature engineering. Our preliminary experimental results show that the proposed malware representation leads to a higher and more stable accuracy in comparison to directly transforming malware binaries to gray-scale images.
Chen, Chin-Wei, Su, Ching-Hung, Lee, Kun-Wei, Bair, Ping-Hao.  2020.  Malware Family Classification Using Active Learning by Learning. 2020 22nd International Conference on Advanced Communication Technology (ICACT). :590–595.
In the past few years, the malware industry has been thriving. Malware variants among the same malware family shared similar behavioural patterns or signatures reflecting their purpose. We propose an approach that combines support vector machine (SVM) classifiers and active learning by learning (ALBL) techniques to deal with insufficient labeled data in terms of the malware classification tasks. The proposed approach is evaluated with the malware family dataset from Microsoft Malware Classification Challenge (BIG 2015) on Kaggle. The results show that ALBL techniques can effectively boost the performance of our machine learning models and improve the quality of labeled samples.
Lin, Kuang-Yao, Huang, Wei-Ren.  2020.  Using Federated Learning on Malware Classification. 2020 22nd International Conference on Advanced Communication Technology (ICACT). :585–589.
In recent years, everything has been more and more systematic, and it would generate many cyber security issues. One of the most important of these is the malware. Modern malware has switched to a high-growth phase. According to the AV-TEST Institute showed that there are over 350,000 new malicious programs (malware) and potentially unwanted applications (PUA) be registered every day. This threat was presented and discussed in the present paper. In addition, we also considered data privacy by using federated learning. Feature extraction can be performed based on malware. The proposed method achieves very high accuracy ($\approx$0.9167) on the dataset provided by VirusTotal.
Snow, Elijah, Alam, Mahbubul, Glandon, Alexander, Iftekharuddin, Khan.  2020.  End-to-End Multimodel Deep Learning for Malware Classification. 2020 International Joint Conference on Neural Networks (IJCNN). :1–7.
Malicious software (malware) is designed to cause unwanted or destructive effects on computers. Since modern society is dependent on computers to function, malware has the potential to do untold damage. Therefore, developing techniques to effectively combat malware is critical. With the rise in popularity of polymorphic malware, conventional anti-malware techniques fail to keep up with the rate of emergence of new malware. This poses a major challenge towards developing an efficient and robust malware detection technique. One approach to overcoming this challenge is to classify new malware among families of known malware. Several machine learning methods have been proposed for solving the malware classification problem. However, these techniques rely on hand-engineered features extracted from malware data which may not be effective for classifying new malware. Deep learning models have shown paramount success for solving various classification tasks such as image and text classification. Recent deep learning techniques are capable of extracting features directly from the input data. Consequently, this paper proposes an end-to-end deep learning framework for multimodels (henceforth, multimodel learning) to solve the challenging malware classification problem. The proposed model utilizes three different deep neural network architectures to jointly learn meaningful features from different attributes of the malware data. End-to-end learning optimizes all processing steps simultaneously, which improves model accuracy and generalizability. The performance of the model is tested with the widely used and publicly available Microsoft Malware Challenge Dataset and is compared with the state-of-the-art deep learning-based malware classification pipeline. Our results suggest that the proposed model achieves comparable performance to the state-of-the-art methods while offering faster training using end-to-end multimodel learning.
Brezinski, Kenneth, Ferens, Ken.  2020.  Complexity-Based Convolutional Neural Network for Malware Classification. 2020 International Conference on Computational Science and Computational Intelligence (CSCI). :1–9.
Malware classification remains at the forefront of ongoing research as the prevalence of metamorphic malware introduces new challenges to anti-virus vendors and firms alike. One approach to malware classification is Static Analysis - a form of analysis which does not require malware to be executed before classification can be performed. For this reason, a lightweight classifier based on the features of a malware binary is preferred, with relatively low computational overhead. In this work a modified convolutional neural network (CNN) architecture was deployed which integrated a complexity-based evaluation based on box-counting. This was implemented by setting up max-pooling layers in parallel, and then extracting the fractal dimension using a polyscalar relationship based on the resolution of the measurement scale and the number of elements of a malware image covered in the measurement under consideration. To test the robustness and efficacy of our approach we trained and tested on over 9300 malware binaries from 25 unique malware families. This work was compared to other award-winning image recognition models, and results showed categorical accuracy in excess of 96.54%.
Ghanem, Sahar M., Aldeen, Donia Naief Saad.  2020.  AltCC: Alternating Clustering and Classification for Batch Analysis of Malware Behavior. 2020 International Symposium on Networks, Computers and Communications (ISNCC). :1–6.
The most common goal of malware analysis is to determine if a given binary is malware or benign. Another objective is similarity analysis of malware binaries to understand how new samples differ from known ones. Similarity analysis helps to analyze the malware with respect to those already analyzed and guides the discovery of novel aspects that should be analyzed more in depth. In this work, we are concerned with similarities and differences detection of malware binaries. Thousands of malware are created every day and machine learning is an indispensable tool for its analysis. Previous work has studied clustering and classification as competing paradigms. However, in this work, a malware similarity analysis technique (AltCC) is proposed that alternates the use of clustering and classification. In addition it assumes the malware are not available all at once but processed in batches. Initially, clustering is applied to the first batch to group similar binaries into novel malware classes. Then, the discovered classes are used to train a classifier. For the following batches, the classifier is used to decide if a new binary classifies to a known class or otherwise unclassified. The unclassified binaries are clustered and the process repeats. Malware clustering (i.e. labeling) may entail further human expert analysis but dramatically reduces the effort. The effectiveness of AltCC is studied using a dataset of 29,661 malware binaries that represent malware received in six consecutive days/batches. When KMeans is used to label the dataset all at once and its labeling is compared to AltCC's, the adjusted-rand-index scores 0.71.
Barr, Joseph R., Shaw, Peter, Abu-Khzam, Faisal N., Yu, Sheng, Yin, Heng, Thatcher, Tyler.  2020.  Combinatorial Code Classification Amp; Vulnerability Rating. 2020 Second International Conference on Transdisciplinary AI (TransAI). :80–83.
Empirical analysis of source code of Android Fluoride Bluetooth stack demonstrates a novel approach of classification of source code and rating for vulnerability. A workflow that combines deep learning and combinatorial techniques with a straightforward random forest regression is presented. Two kinds of embedding are used: code2vec and LSTM, resulting in a distance matrix that is interpreted as a (combinatorial) graph whose vertices represent code components, functions and methods. Cluster Editing is then applied to partition the vertex set of the graph into subsets representing nearly complete subgraphs. Finally, the vectors representing the components are used as features to model the components for vulnerability risk.
Li, Mingxuan, Lv, Shichao, Shi, Zhiqiang.  2020.  Malware Detection for Industrial Internet Based on GAN. 2020 IEEE International Conference on Information Technology,Big Data and Artificial Intelligence (ICIBA). 1:475–481.
This thesis focuses on the detection of malware in industrial Internet. The basic flow of the detection of malware contains feature extraction and sample identification. API graph can effectively represent the behavior information of malware. However, due to the high algorithm complexity of solving the problem of subgraph isomorphism, the efficiency of analysis based on graph structure feature is low. Due to the different scales of API graph of different malicious codes, the API graph needs to be normalized. Considering the difficulties of sample collection and manual marking, it is necessary to expand the number of malware samples in industrial Internet. This paper proposes a method that combines PageRank with TF-IDF to process the API graph. Besides, this paper proposes a method to construct the adversarial samples of malwares based on GAN.
Petrenko, Sergei A., Petrenko, Alexey S., Makoveichuk, Krystina A., Olifirov, Alexander V..  2020.  "Digital Bombs" Neutralization Method. 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :446–451.
The article discusses new models and methods for timely identification and blocking of malicious code of critically important information infrastructure based on static and dynamic analysis of executable program codes. A two-stage method for detecting malicious code in the executable program codes (the so-called "digital bombs") is described. The first step of the method is to build the initial program model in the form of a control graph, the construction is carried out at the stage of static analysis of the program. The article discusses the purpose, features and construction criteria of an ordered control graph. The second step of the method is to embed control points in the program's executable code for organizing control of the possible behavior of the program using a specially designed recognition automaton - an automaton of dynamic control. Structural criteria for the completeness of the functional control of the subprogram are given. The practical implementation of the proposed models and methods was completed and presented in a special instrumental complex IRIDA.
Yan, Fan, Liu, Jia, Gu, Liang, Chen, Zelong.  2020.  A Semi-Supervised Learning Scheme to Detect Unknown DGA Domain Names Based on Graph Analysis. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1578–1583.
A large amount of malware families use the domain generation algorithms (DGA) to randomly generate a large amount of domain names. It is a good way to bypass conventional blacklists of domain names, because we cannot predict which of the randomly generated domain names are selected for command and control (C&C) communications. An effective approach for detecting known DGA families is to investigate the malware with reverse engineering to find the adopted generation algorithms. As reverse engineering cannot handle the variants of DGA families, some researches leverage supervised learning to find new variants. However, the explainability of supervised learning is low and cannot find previously unseen DGA families. In this paper, we propose a graph-based semi-supervised learning scheme to track the evolution of known DGA families and find previously unseen DGA families. With a domain relation graph, we can clearly figure out how new variants relate to known DGA domain names, which induces better explainability. We deployed the proposed scheme on real network scenarios and show that the proposed scheme can not only comprehensively and precisely find known DGA families, but also can find new DGA families which have not seen before.
Yang, Ping, Shu, Hui, Kang, Fei, Bu, Wenjuan.  2020.  Automatically Generating Malware Summary Using Semantic Behavior Graphs (SBGs). 2020 Information Communication Technologies Conference (ICTC). :282–291.
In malware behavior analysis, there are limitations in the analysis method of control flow and data flow. Researchers analyzed data flow by dynamic taint analysis tools, however, it cost a lot. In this paper, we proposed a method of generating malware summary based on semantic behavior graphs (SBGs, Semantic Behavior Graphs) to address this issue. In this paper, we considered various situation where behaviors be capable of being associated, thus an algorithm of generating semantic behavior graphs was given firstly. Semantic behavior graphs are composed of behavior nodes and associated data edges. Then, we extracted behaviors and logical relationships between behaviors from semantic behavior graphs, and finally generated a summary of malware behaviors with true intension. Experimental results showed that our approach can effectively identify and describe malicious behaviors and generate accurate behavior summary.
Chai, Yuhan, Qiu, Jing, Su, Shen, Zhu, Chunsheng, Yin, Lihua, Tian, Zhihong.  2020.  LGMal: A Joint Framework Based on Local and Global Features for Malware Detection. 2020 International Wireless Communications and Mobile Computing (IWCMC). :463–468.
With the gradual advancement of smart city construction, various information systems have been widely used in smart cities. In order to obtain huge economic benefits, criminals frequently invade the information system, which leads to the increase of malware. Malware attacks not only seriously infringe on the legitimate rights and interests of users, but also cause huge economic losses. Signature-based malware detection algorithms can only detect known malware, and are susceptible to evasion techniques such as binary obfuscation. Behavior-based malware detection methods can solve this problem well. Although there are some malware behavior analysis works, they may ignore semantic information in the malware API call sequence. In this paper, we design a joint framework based on local and global features for malware detection to solve the problem of network security of smart cities, called LGMal, which combines the stacked convolutional neural network and graph convolutional networks. Specially, the stacked convolutional neural network is used to learn API call sequence information to capture local semantic features and the graph convolutional networks is used to learn API call semantic graph structure information to capture global semantic features. Experiments on Alibaba Cloud Security Malware Detection datasets show that the joint framework gets better results. The experimental results show that the precision is 87.76%, the recall is 88.08%, and the F1-measure is 87.79%. We hope this paper can provide a useful way for malware detection and protect the network security of smart city.
Wang, Duanyi, Shu, Hui, Kang, Fei, Bu, Wenjuan.  2020.  A Malware Similarity Analysis Method Based on Network Control Structure Graph. 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS). :295–300.
Recently, graph-based malware similarity analysis has been widely used in the field of malware detection. However, the wide application of code obfuscation, polymorphism, and deformation changes the structure of malicious code, which brings great challenges to the malware similarity analysis. To solve these problems, in this paper, we present a new approach to malware similarity analysis based on the network control structure graph (NCSG). This method analyzed the behavior of malware by application program interface (API) association and constructed NCSG. The graph could reflect the command-and-control(C&C) logic of malware. Therefore, it can resist the interference of code obfuscation technology. The structural features extracted from NCSG will be used as the basis of similarity analysis for training the detection model. Finally, we tested the dataset constructed from five known malware family samples, and the experimental results showed that the accuracy of this method for malware variation analysis reached 92.75%. In conclusion, the malware similarity analysis based on NCSG has a strong application value for identifying the same family of malware.
bin Asad, Ashub, Mansur, Raiyan, Zawad, Safir, Evan, Nahian, Hossain, Muhammad Iqbal.  2020.  Analysis of Malware Prediction Based on Infection Rate Using Machine Learning Techniques. 2020 IEEE Region 10 Symposium (TENSYMP). :706–709.
In this modern, technological age, the internet has been adopted by the masses. And with it, the danger of malicious attacks by cybercriminals have increased. These attacks are done via Malware, and have resulted in billions of dollars of financial damage. This makes the prevention of malicious attacks an essential part of the battle against cybercrime. In this paper, we are applying machine learning algorithms to predict the malware infection rates of computers based on its features. We are using supervised machine learning algorithms and gradient boosting algorithms. We have collected a publicly available dataset, which was divided into two parts, one being the training set, and the other will be the testing set. After conducting four different experiments using the aforementioned algorithms, it has been discovered that LightGBM is the best model with an AUC Score of 0.73926.
Vurdelja, Igor, Blažić, Ivan, Bojić, Dragan, Drašković, Dražen.  2020.  A framework for automated dynamic malware analysis for Linux. 2020 28th Telecommunications Forum (℡FOR). :1–4.
Development of malware protection tools requires a more advanced test environment comparing to safe software. This kind of development includes a safe execution of many malware samples in order to evaluate the protective power of the tool. The host machine needs to be protected from the harmful effects of malware samples and provide a realistic simulation of the execution environment. In this paper, a framework for automated malware analysis on Linux is presented. Different types of malware analysis methods are discussed, as well as the properties of a good framework for dynamic malware analysis.
Jin, Xiang, Xing, Xiaofei, Elahi, Haroon, Wang, Guojun, Jiang, Hai.  2020.  A Malware Detection Approach Using Malware Images and Autoencoders. 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). :1–6.
Most machine learning-based malware detection systems use various supervised learning methods to classify different instances of software as benign or malicious. This approach provides no information regarding the behavioral characteristics of malware. It also requires a large amount of training data and is prone to labeling difficulties and can reduce accuracy due to redundant training data. Therefore, we propose a malware detection method based on deep learning, which uses malware images and a set of autoencoders to detect malware. The method is to design an autoencoder to learn the functional characteristics of malware, and then to observe the reconstruction error of autoencoder to realize the classification and detection of malware and benign software. The proposed approach achieves 93% accuracy and comparatively better F1-score values while detecting malware and needs little training data when compared with traditional malware detection systems.
Ramadhan, Beno, Purwanto, Yudha, Ruriawan, Muhammad Faris.  2020.  Forensic Malware Identification Using Naive Bayes Method. 2020 International Conference on Information Technology Systems and Innovation (ICITSI). :1–7.
Malware is a kind of software that, if installed on a malware victim's device, might carry malicious actions. The malicious actions might be data theft, system failure, or denial of service. Malware analysis is a process to identify whether a piece of software is a malware or not. However, with the advancement of malware technologies, there are several evasion techniques that could be implemented by malware developers to prevent analysis, such as polymorphic and oligomorphic. Therefore, this research proposes an automatic malware detection system. In the system, the malware characteristics data were obtained through both static and dynamic analysis processes. Data from the analysis process were classified using Naive Bayes algorithm to identify whether the software is a malware or not. The process of identifying malware and benign files using the Naive Bayes machine learning method has an accuracy value of 93 percent for the detection process using static characteristics and 85 percent for detection through dynamic characteristics.
Patil, Rajvardhan, Deng, Wei.  2020.  Malware Analysis using Machine Learning and Deep Learning techniques. 2020 SoutheastCon. 2:1–7.
In this era, where the volume and diversity of malware is rising exponentially, new techniques need to be employed for faster and accurate identification of the malwares. Manual heuristic inspection of malware analysis are neither effective in detecting new malware, nor efficient as they fail to keep up with the high spreading rate of malware. Machine learning approaches have therefore gained momentum. They have been used to automate static and dynamic analysis investigation where malware having similar behavior are clustered together, and based on the proximity unknown malwares get classified to their respective families. Although many such research efforts have been conducted where data-mining and machine-learning techniques have been applied, in this paper we show how the accuracy can further be improved using deep learning networks. As deep learning offers superior classification by constructing neural networks with a higher number of potentially diverse layers it leads to improvement in automatic detection and classification of the malware variants.In this research, we present a framework which extracts various feature-sets such as system calls, operational codes, sections, and byte codes from the malware files. In the experimental and result section, we compare the accuracy obtained from each of these features and demonstrate that feature vector for system calls yields the highest accuracy. The paper concludes by showing how deep learning approach performs better than the traditional shallow machine learning approaches.
Kartel, Anastasia, Novikova, Evgenia, Volosiuk, Aleksandr.  2020.  Analysis of Visualization Techniques for Malware Detection. 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :337–340.
Due to the steady growth of various sophisticated types of malware, different malware analysis systems are becoming more and more demanded. While there are various automatic approaches available to identify and detect malware, the malware analysis is still time-consuming process. The visualization-driven techniques may significantly increase the efficiency of the malware analysis process by involving human visual system which is a powerful pattern seeker. In this paper the authors reviewed different visualization methods, examined their features and tasks solved with their help. The paper presents the most commonly used approaches and discusses open challenges in malware visual analytics.
Walker, Aaron, Sengupta, Shamik.  2020.  Malware Family Fingerprinting Through Behavioral Analysis. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–5.
Signature-based malware detection is not always effective at detecting polymorphic variants of known malware. Malware signatures are devised to counter known threats, which also limits efficacy against new forms of malware. However, existing signatures do present the ability to classify malware based upon known malicious behavior which occurs on a victim computer. In this paper we present a method of classifying malware by family type through behavioral analysis, where the frequency of system function calls is used to fingerprint the actions of specific malware families. This in turn allows us to demonstrate a machine learning classifier which is capable of distinguishing malware by family affiliation with high accuracy.