Biblio

Found 278 results

Filters: Keyword is data mining
2021-06-30
Zhang, Wenrui.  2020.  Application of Attention Model Hybrid Guiding based on Artificial Intelligence in the Course of Intelligent Architecture History. 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS). :59–62.
This article studies the application of attention-model hybrid guiding based on artificial intelligence in the course of intelligent architecture history. A Hadoop distributed architecture with big data processing technology combines basic building information with building energy consumption data for data mining research, and a preliminary design of a Hadoop-based public building energy consumption data mining system is presented. The principles of the proposed model are summarized. First, the intelligent firewall processes decision data faster when harmful information invades, and it can monitor and intercept the harmful information in a timelier manner. Second, a problem-data processing plan is developed to identify and delete different types of problem data and to supplement the deleted data according to the rules obtained by data mining. The experimental results reflect the efficiency of the proposed model.
2021-06-24
Nilă, Constantin, Patriciu, Victor.  2020.  Taking advantage of unsupervised learning in incident response. 2020 12th International Conference on Electronics, Computers and Artificial Intelligence (ECAI). :1–6.
This paper looks at new ways to improve the necessary time for incident response triage operations. By employing unsupervised K-means, enhanced by both manual and automated feature extraction techniques, the incident response team can quickly and decisively extrapolate malicious web requests that concluded to the investigated exploitation. More precisely, we evaluated the benefits of different visualization enhancing methods that can improve feature selection and other dimensionality reduction techniques. Furthermore, early tests of the gross framework have shown that the necessary time for triage is diminished, more so if a hybrid multi-model is employed. Our case study revolved around the need for unsupervised classification of unknown web access logs. However, the demonstrated principles may be considered for other applications of machine learning in the cybersecurity domain.
Stöckle, Patrick, Grobauer, Bernd, Pretschner, Alexander.  2020.  Automated Implementation of Windows-related Security-Configuration Guides. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). :598–610.
Hardening is the process of configuring IT systems to ensure the security of the systems' components and data they process or store. The complexity of contemporary IT infrastructures, however, renders manual security hardening and maintenance a daunting task. In many organizations, security-configuration guides expressed in the SCAP (Security Content Automation Protocol) are used as a basis for hardening, but these guides by themselves provide no means for automatically implementing the required configurations. In this paper, we propose an approach to automatically extract the relevant information from publicly available security-configuration guides for Windows operating systems using natural language processing. In a second step, the extracted information is verified using the information of available settings stored in the Windows Administrative Template files, in which the majority of Windows configuration settings is defined. We show that our implementation of this approach can extract and implement 83% of the rules without any manual effort and 96% with minimal manual effort. Furthermore, we conduct a study with 12 state-of-the-art guides consisting of 2014 rules with automatic checks and show that our tooling can implement at least 97% of them correctly. We have thus significantly reduced the effort of securing systems based on existing security-configuration guides.
2021-06-01
Wang, Qi, Zhao, Weiliang, Yang, Jian, Wu, Jia, Zhou, Chuan, Xing, Qianli.  2020.  AtNE-Trust: Attributed Trust Network Embedding for Trust Prediction in Online Social Networks. 2020 IEEE International Conference on Data Mining (ICDM). :601–610.
Trust relationship prediction among people provides valuable supports for decision making, information dissemination, and product promotion in online social networks. Network embedding has achieved promising performance for link prediction by learning node representations that encode intrinsic network structures. However, most of the existing network embedding solutions cannot effectively capture the properties of a trust network that has directed edges and nodes with in/out links. Furthermore, there usually exist rich user attributes in trust networks, such as ratings, reviews, and the rated/reviewed items, which may exert significant impacts on the formation of trust relationships. It is still lacking a network embedding-based method that can adequately integrate these properties for trust prediction. In this work, we develop an AtNE-Trust model to address these issues. We firstly capture user embedding from both the trust network structures and user attributes. Then we design a deep multi-view representation learning module to further mine and fuse the obtained user embedding. Finally, a trust evaluation module is developed to predict the trust relationships between users. Representation learning and trust evaluation are optimized together to capture high-quality user embedding and make accurate predictions simultaneously. A set of experiments against the real-world datasets demonstrates the effectiveness of the proposed approach.
2021-05-18
Iorga, Denis, Corlătescu, Dragos, Grigorescu, Octavian, Săndescu, Cristian, Dascălu, Mihai, Rughiniş, Razvan.  2020.  Early Detection of Vulnerabilities from News Websites using Machine Learning Models. 2020 19th RoEduNet Conference: Networking in Education and Research (RoEduNet). :1–6.
The drawbacks of traditional methods of cybernetic vulnerability detection relate to the required time to identify new threats, to register them in the Common Vulnerabilities and Exposures (CVE) records, and to score them with the Common Vulnerabilities Scoring System (CVSS). These problems can be mitigated by early vulnerability detection systems relying on social media and open-source data. This paper presents a model that aims to identify emerging cybernetic vulnerabilities in cybersecurity news articles, as part of a system for automatic detection of early cybernetic threats using Open Source Intelligence (OSINT). Three machine learning models were trained on a novel dataset of 1000 labeled news articles to create a strong baseline for classifying cybersecurity articles as relevant (i.e., introducing new security threats), or irrelevant: Support Vector Machines, a Multinomial Naïve Bayes classifier, and a finetuned BERT model. The BERT model obtained the best performance with a mean accuracy of 88.45% on the test dataset. Our experiments support the conclusion that Natural Language Processing (NLP) models are an appropriate choice for early vulnerability detection systems in order to extract relevant information from cybersecurity news articles.
2021-05-13
Shu, Fei, Chen, Shuting, Li, Feng, Zhang, JianYe, Chen, Jia.  2020.  Research and implementation of network attack and defense countermeasure technology based on artificial intelligence technology. 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC). :475–478.
Using artificial intelligence technology to help network security has become a major trend. At present, major countries in the world have successively invested R & D force in the attack and defense of automatic network based on artificial intelligence. The U.S. Navy, the U.S. air force, and the DOD strategic capabilities office have invested heavily in the development of artificial intelligence network defense systems. DARPA launched the network security challenge (CGC) to promote the development of automatic attack system based on artificial intelligence. In the 2016 Defcon final, mayhem (the champion of CGC in 2014), an automatic attack team, participated in the competition with 14 human teams and once defeated two human teams, indicating that the automatic attack method generated by artificial intelligence system can scan system defects and find loopholes faster and more effectively than human beings. Japan's defense ministry also announced recently that in order to strengthen the ability to respond to network attacks, it will introduce artificial intelligence technology into the information communication network defense system of Japan's self defense force. It can be predicted that the deepening application of artificial intelligence in the field of network attack and defense may bring about revolutionary changes and increase the imbalance of the strategic strength of cyberspace in various countries. 
Therefore, it is of great significance to systematically investigate the current situation of network attack and defense based on artificial intelligence at home and abroad, comprehensively analyze the development trend of relevant technologies at home and abroad, deeply analyze the development outline and specification of artificial intelligence attack and defense around the world, and refine the application status and future prospects of artificial intelligence attack and defense, so as to promote the development of artificial intelligence attack and defense technology in China and protect the core interests of cyberspace.
2021-05-05
Hossain, Md. Turab, Hossain, Md. Shohrab, Narman, Husnu S..  2020.  Detection of Undesired Events on Real-World SCADA Power System through Process Monitoring. 2020 11th IEEE Annual Ubiquitous Computing, Electronics Mobile Communication Conference (UEMCON). :0779–0785.
A Supervisory Control and Data Acquisition (SCADA) system, used for control or monitoring in industrial process automation, collects data from instruments and sensors located at remote sites and transmits it to a central site. Most existing work on SCADA systems has focused on simulation-based studies, which cannot always mimic real-world situations. We propose a novel methodology that analyzes SCADA logs offline and helps to detect process-related threats. Such a threat takes place when an attacker performs malicious actions after gaining user access. We conduct our experiments on a real-life SCADA system of a power transmission utility. Our proposed methodology will automate the analysis of SCADA logs and systematically identify undesired events. Moreover, it will help to analyse process-related threats caused by user activity. Several test studies suggest that our approach is powerful in detecting undesired events that might be caused by malicious occurrences.
Pawar, Shrikant, Stanam, Aditya.  2020.  Scalable, Reliable and Robust Data Mining Infrastructures. 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4). :123–125.

Data mining is used to analyze data in order to discover previously unknown patterns, classifying and grouping the records. Several important scalable data mining platforms have been developed in recent years. RapidMiner is a popular open source software that can be used for advanced analytics; Weka and Orange are important machine learning tools for classifying patterns with clustering and regression techniques; and Knime is often used for data preprocessing such as extraction, transformation and loading. This article summarizes the most important and robust platforms.

2021-04-27
Kotturu, P. K., Kumar, A..  2020.  Data Mining Visualization with the Impact of Nature Inspired Algorithms in Big Data. 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184). :664–668.

Data mining visualization is an important aspect of big data visualization and analysis. The impact of the nature-inspired algorithm along with the impact of computing traditions for the complete visualization of the storage and data communication needs have been studied. This paper also explores the possibilities of the hybridization of data mining in terms of association of cloud computing. It also explores the data analytical view in the exploration of these approaches in terms of data storage in big data. Based on these aspects the methodological advancement along with the problem statements has been analyzed. This will help in the exploration of computational capability along with the new insights in this domain.

Fu, Y., Tong, S., Guo, X., Cheng, L., Zhang, Y., Feng, D..  2020.  Improving the Effectiveness of Grey-box Fuzzing By Extracting Program Information. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :434–441.
Fuzzing has been widely adopted as an effective technique to detect vulnerabilities in software. However, existing fuzzers suffer from the problems of generating excessive test inputs that either cannot pass input validation or are ineffective in exploring unvisited regions in the program under test (PUT). To tackle these problems, we propose a greybox fuzzer called MuFuzzer based on AFL, which incorporates two heuristics that optimize seed selection and automatically extract input formatting information from the PUT to increase the chance of generating valid test inputs, respectively. In particular, the first heuristic collects the branch coverage and execution information during a fuzz session, and utilizes such information to guide fuzzing tools in selecting seeds that are fast to execute, small in size, and more importantly, more likely to explore new behaviors of the PUT for subsequent fuzzing activities. The second heuristic automatically identifies string comparison operations that the PUT uses for input validation, and establishes a dictionary with string constants from these operations to help fuzzers generate test inputs that have higher chances to pass input validation. We have evaluated the performance of MuFuzzer, in terms of code coverage and bug detection, using a set of realistic programs and the LAVA-M test bench. Experiment results demonstrate that MuFuzzer is able to achieve higher code coverage and better or comparative bug detection performance than state-of-the-art fuzzers.
Wang, Y., Guo, S., Wu, J., Wang, H. H..  2020.  Construction of Audit Internal Control System Based on Online Big Data Mining and Decentralized Model. 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC). :623–626.
This paper constructs an audit internal control system based on online big data mining and a decentralized model. How to integrate novel technologies into internal control is an attractive task. IT audit is built on the information system and is independent of the information system itself. Application of IT audit in enterprises can provide a guarantee for the security of the information system and can give an objective evaluation of the investment. This paper integrates online big data mining and the decentralized model to construct an efficient system. Association discovery is also called a data link. It uses similarity functions, such as the Euclidean distance, edit distance, cosine distance, Jaccard function, etc., to establish association relationships between data entities. These parameters are considered for comprehensive analysis.
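The similarity functions named in this abstract are standard measures, and a minimal sketch of them may help make the idea of association discovery concrete. The function names and sample inputs below are illustrative assumptions and are not taken from the paper, which does not say how the measures are combined.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def edit_distance(s, t):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete from s
                            curr[j - 1] + 1,      # insert into s
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty sets count as identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical entity attributes, used only to exercise the functions.
    print(edit_distance("kitten", "sitting"))            # 3
    print(jaccard_similarity({1, 2, 3}, {2, 3, 4}))      # 0.5
    print(euclidean_distance((0, 0), (3, 4)))            # 5.0
```

In a record-linkage setting, two entities would typically be linked when one of these scores crosses a chosen threshold; the threshold and the choice of measure per attribute are design decisions the abstract leaves open.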
Zhou, X..  2020.  Improvement of information System Audit to Deal With Network Information Security. 2020 International Conference on Communications, Information System and Computer Engineering (CISCE). :93–96.
With the rapid development of information technology and the increasing popularity of information and communication technology, the information age has come. Enterprises must adapt to changes in the times, introduce network and computer technologies in a timely manner, and establish more efficient and reasonable information systems and platforms. Large-scale information system construction is inseparable from related audit work, and network security risks have become an important part of information system audit concerns. This paper analyzes the objectives and contents of information system audits under the background of network information security through theoretical analysis, and on this basis, proposes how the IS audit work will be carried out.
2021-04-08
Yang, Z., Sun, Q., Zhang, Y., Zhu, L., Ji, W..  2020.  Inference of Suspicious Co-Visitation and Co-Rating Behaviors and Abnormality Forensics for Recommender Systems. IEEE Transactions on Information Forensics and Security. 15:2766–2781.
The pervasiveness of personalized collaborative recommender systems has shown the powerful capability in a wide range of E-commerce services such as Amazon, TripAdvisor, Yelp, etc. However, fundamental vulnerabilities of collaborative recommender systems leave space for malicious users to affect the recommendation results as the attackers desire. A vast majority of existing detection methods assume certain properties of malicious attacks are given in advance. In reality, improving the detection performance is usually constrained due to the challenging issues: (a) various types of malicious attacks coexist, (b) limited representations of malicious attack behaviors, and (c) practical evidences for exploring and spotting anomalies on real-world data are scarce. In this paper, we investigate a unified detection framework in an eye for an eye manner without being bothered by the details of the attacks. Firstly, co-visitation and co-rating graphs are constructed using association rules. Then, attribute representations of nodes are empirically developed from the perspectives of linkage pattern, structure-based property and inherent association of nodes. Finally, both attribute information and connective coherence of graph are combined in order to infer suspicious nodes. Extensive experiments on both synthetic and real-world data demonstrate the effectiveness of the proposed detection approach compared with competing benchmarks. Additionally, abnormality forensics metrics including distribution of rating intention, time aggregation of suspicious ratings, degree distributions before as well as after removing suspicious nodes and time series analysis of historical ratings, are provided so as to discover interesting findings such as suspicious nodes (items or ratings) on real-world data.
Westland, T., Niu, N., Jha, R., Kapp, D., Kebede, T..  2020.  Relating the Empirical Foundations of Attack Generation and Vulnerability Discovery. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :37–44.
Automatically generating exploits for attacks receives much attention in security testing and auditing. However, little is known about the continuous effect of automatic attack generation and detection. In this paper, we develop an analytic model to understand the cost-benefit tradeoffs in light of the process of vulnerability discovery. We develop a three-phased model, suggesting that the cumulative malware detection has a productive period before the rate of gain flattens. As the detection mechanisms co-evolve, the gain will likely increase. We evaluate our analytic model by using an anti-virus tool to detect the thousands of Trojans automatically created. The anti-virus scanning results over five months show the validity of the model and point out future research directions.
2021-03-30
Ganfure, G. O., Wu, C.-F., Chang, Y.-H., Shih, W.-K..  2020.  DeepGuard: Deep Generative User-behavior Analytics for Ransomware Detection. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–6.

In the last couple of years, the move to cyberspace has provided a fertile environment for ransomware criminals like never before. Notably, since the introduction of WannaCry, numerous ransomware detection solutions have been proposed. However, the ransomware incidence report shows that most organizations impacted by ransomware are running state of the art ransomware detection tools. Hence, an alternative solution is an urgent requirement as the existing detection models are not sufficient to spot emerging ransomware threats. With this motivation, our work proposes "DeepGuard," a novel concept of modeling user behavior for ransomware detection. The main idea is to log the file-interaction pattern of typical user activity and pass it through deep generative autoencoder architecture to recreate the input. With sufficient training data, the model can learn how to reconstruct typical user activity (or input) with minimal reconstruction error. Hence, by applying the three-sigma limit rule on the model's output, DeepGuard can distinguish the ransomware activity from the user activity. The experiment result shows that DeepGuard effectively detects a variant class of ransomware with minimal false-positive rates. Overall, modeling the attack detection with user-behavior permits the proposed strategy to have deep visibility of various ransomware families.
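The three-sigma limit rule mentioned in this abstract can be sketched in a few lines. The sketch assumes the rule is applied to reconstruction errors measured on typical user activity, with anything above the mean plus three standard deviations flagged; the abstract does not spell out this detail, the autoencoder itself is omitted, and the function names and sample error values are hypothetical.

```python
import statistics


def three_sigma_threshold(normal_errors):
    """Upper limit: mean + 3 * standard deviation of errors on normal activity."""
    mean = statistics.fmean(normal_errors)
    std = statistics.pstdev(normal_errors)  # population standard deviation
    return mean + 3 * std


def is_anomalous(error, threshold):
    """Flag an observation whose reconstruction error exceeds the limit."""
    return error > threshold


if __name__ == "__main__":
    # Hypothetical reconstruction errors from typical user sessions.
    baseline = [0.10, 0.12, 0.11, 0.09, 0.10, 0.13, 0.08, 0.11]
    limit = three_sigma_threshold(baseline)
    for err in (0.12, 0.90):
        print(err, "anomalous" if is_anomalous(err, limit) else "normal")
```

The appeal of this rule is that the threshold is derived from normal behavior alone, so no ransomware samples are needed to set it.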

Zhang, R., Cao, Z., Wu, K..  2020.  Tracing and detection of ICS Anomalies Based on Causality Mutations. 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC). :511–517.

The algorithm of causal anomaly detection in industrial control physics is proposed to determine the normal cloud line of industrial control system so as to accurately detect the anomaly. In this paper, a causal modeling algorithm combining Maximum Information Coefficient and Transfer Entropy is used to construct the causal network among nodes in the system. Then, the abnormal nodes and the propagation path of the anomaly are deduced from the structural changes of the causal network before and after the attack. Finally, an anomaly detection algorithm based on hybrid differential cumulative is used to identify the specific anomaly data in the anomaly node. The stability of the causality mining algorithm and the validity of locating causality anomalies are verified by using the data of a classical chemical process. Experimental results show that the anomaly detection algorithm is better than the comparison algorithm in accuracy, false negative rate and recall rate, and the anomaly location strategy makes the anomaly source traceable.

2021-03-29
Liu, F., Wen, Y., Wu, Y., Liang, S., Jiang, X., Meng, D..  2020.  MLTracer: Malicious Logins Detection System via Graph Neural Network. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :715–726.

Malicious login, especially lateral movement, has been a primary and costly threat for enterprises. However, there exist two critical challenges in the existing methods. Specifically, they heavily rely on a limited number of predefined rules and features. When the attack patterns change, security experts must manually design new ones. Besides, they cannot explore the attributes' mutual effect specific to login operations. We propose MLTracer, a graph neural network (GNN) based system for detecting such attacks. It has two core components to tackle the previous challenges. First, MLTracer adopts a novel method to differentiate crucial attributes of login operations from the rest without experts' designated features. Second, MLTracer leverages a GNN model to detect malicious logins. The model involves a convolutional neural network (CNN) to explore attributes of login operations, and a co-attention mechanism to mutually improve the representations (vectors) of login attributes through learning their login-specific relation. We implement an evaluation of such an approach. The results demonstrate that MLTracer significantly outperforms state-of-the-art methods. Moreover, MLTracer effectively detects various attack scenarios with a remarkably low false positive rate (FPR).

Ye, F..  2020.  Research and Application of Improved APRIORI Algorithm Based on Hash Technology. 2020 Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC). :64–67.
The Apriori algorithm is the most classic association rule mining algorithm; it has unique advantages, but it also has some disadvantages such as high overhead. This paper first describes the Apriori algorithm, points out its shortcomings, introduces related concepts, and then proposes a method based on hash technology and compressed combination item set technology to improve the Apriori algorithm. This paper introduces the basic idea and the concrete process of the improvement in detail, analyzes the efficiency of the improved algorithm by experiment, and advances the application of the improved algorithm in library personalized service.
Mar, Z., Oo, K. K..  2020.  An Improvement of Apriori Mining Algorithm using Linked List Based Hash Table. 2020 International Conference on Advanced Information Technologies (ICAIT). :165–169.
Today, huge amounts of data are used in organizations around the world. This data needs to be processed so that useful information can be acquired. Consequently, a number of enterprises have discovered valuable information from shopper purchase records. In data mining, Apriori is one of the most important algorithms for finding frequent item sets in a large database and discovering knowledge using association rules. The Apriori algorithm wastes time scanning the whole database and searching for frequent item sets, and its memory use is inefficient when a large number of transactions is considered. An improved Apriori algorithm that adds and calculates a third threshold may increase the overhead. The proposed research therefore uses an improved Apriori algorithm with a linked list and a hash table to mine frequent item sets from a large transaction database. In this method, the database is scanned with the improved Apriori algorithm and the frequent 1-item sets are counted using the hash table. The next frequent item sets are then saved in the linked list while the database is scanned. The hash table is used to produce the frequent 2-item sets. Therefore, the database is scanned only two times, and less processing time and memory space are needed.
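The two-scan idea shared by the hash-based Apriori variants above can be illustrated with a simplified sketch. It is not either paper's algorithm: the linked-list storage and the compressed item set technique are omitted, Python's `Counter` stands in for the hash table, and the function name, sample baskets and `min_support` value are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_one_and_two_itemsets(transactions, min_support):
    """Find frequent 1- and 2-item sets with exactly two database scans."""
    # Scan 1: count every item in a hash table.
    item_counts = Counter()
    for transaction in transactions:
        item_counts.update(set(transaction))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Scan 2: count only pairs made of frequent items (Apriori pruning),
    # again in a hash table keyed by the sorted pair.
    pair_counts = Counter()
    for transaction in transactions:
        kept = sorted(set(transaction) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    frequent_singles = {i: item_counts[i] for i in frequent_items}
    frequent_pairs = {p: c for p, c in pair_counts.items() if c >= min_support}
    return frequent_singles, frequent_pairs


if __name__ == "__main__":
    # Hypothetical shopping baskets.
    baskets = [
        ["bread", "milk"],
        ["bread", "diaper", "beer", "eggs"],
        ["milk", "diaper", "beer", "cola"],
        ["bread", "milk", "diaper", "beer"],
        ["bread", "milk", "diaper", "cola"],
    ]
    singles, pairs = frequent_one_and_two_itemsets(baskets, min_support=3)
    print(singles)
    print(pairs)
```

Pruning infrequent items before the second scan is what keeps the pair table small; classic Apriori would instead generate candidate pairs first and rescan the database once per item set size.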
2021-03-22
Li, Y., Zhou, W., Wang, H..  2020.  F-DPC: Fuzzy Neighborhood-Based Density Peak Algorithm. IEEE Access. 8:165963–165972.
Clustering is a concept in data mining, which divides a data set into different classes or clusters according to a specific standard, making the similarity of data objects in the same cluster as large as possible. Clustering by fast search and find of density peaks (DPC) is a novel clustering algorithm based on density. It is simple, requires few parameters to achieve a good clustering effect, and needs no iterative solution. It is also extensible and can detect clusters of any shape. However, the DPC algorithm still has some defects: it employs clear neighborhood relations to calculate local density, so it cannot identify the neighborhood membership of different values of points from the distance of points, and it cannot accurately cluster data with multiple density peaks. The fuzzy neighborhood density peak clustering algorithm (F-DPC) is proposed to address this shortcoming: a novel local density is defined by the fuzzy neighborhood relationship. Fuzzy set theory can be used to make the fuzzy neighborhood function of local density more sensitive, so that clustering of data sets with various shapes and densities is more robust. Experiments show that the algorithm has high accuracy and robustness.
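For context, the two quantities that baseline DPC computes for every point can be sketched as follows. This shows only the original cutoff-based local density that the abstract criticizes; the fuzzy neighborhood density of F-DPC is not given in the abstract and is not reproduced here. The function name, the `cutoff` parameter and the sample points are illustrative assumptions.

```python
import math


def dpc_quantities(points, cutoff):
    """Return (rho, delta) for baseline density peak clustering.

    rho[i]   = number of other points closer than `cutoff` to point i
    delta[i] = distance from point i to the nearest point of higher density,
               or the largest distance from i if no denser point exists
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Hard cutoff neighborhood: a point is either a neighbor or it is not.
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
        for i in range(n)
    ]

    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) if denser else max(dist[i]))
    return rho, delta


if __name__ == "__main__":
    # Hypothetical 2-D data: a small group plus one distant point.
    sample = [(0, 0), (0, 1), (1, 0), (10, 10)]
    rho, delta = dpc_quantities(sample, cutoff=1.2)
    print(rho)
    print(delta)
```

Points that combine a high `rho` with a high `delta` are the candidate cluster centers; every other point is then assigned to the cluster of its nearest denser neighbor. Because `rho` counts neighbors with a hard cutoff, two points just inside and just outside the radius are treated very differently, which is the sensitivity a fuzzy membership is meant to soften.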
2021-03-09
Xiao, Y., Zhang, N., Lou, W., Hou, Y. T..  2020.  Modeling the Impact of Network Connectivity on Consensus Security of Proof-of-Work Blockchain. IEEE INFOCOM 2020 - IEEE Conference on Computer Communications. :1648–1657.

Blockchain, the technology behind the popular Bitcoin, is considered a "security by design" system as it is meant to create security among a group of distrustful parties yet without a central trusted authority. The security of blockchain relies on the premise of honest-majority, namely, the blockchain system is assumed to be secure as long as the majority of consensus voting power is honest. And in the case of proof-of-work (PoW) blockchain, adversaries cannot control more than 50% of the network's gross computing power. However, this 50% threshold is based on the analysis of computing power only, with implicit and idealistic assumptions on the network and node behavior. Recent researches have alluded that factors such as network connectivity, presence of blockchain forks, and mining strategy could undermine the consensus security assured by the honest-majority, but neither concrete analysis nor quantitative evaluation is provided. In this paper we fill the gap by proposing an analytical model to assess the impact of network connectivity on the consensus security of PoW blockchain under different adversary models. We apply our analytical model to two adversarial scenarios: 1) honest-but-potentially-colluding, 2) selfish mining. For each scenario, we quantify the communication capability of nodes involved in a fork race and estimate the adversary's mining revenue and its impact on security properties of the consensus protocol. Simulation results validated our analysis. Our modeling and analysis provide a paradigm for assessing the security impact of various factors in a distributed consensus system.

Badawi, E., Jourdan, G.-V., Bochmann, G., Onut, I.-V..  2020.  An Automatic Detection and Analysis of the Bitcoin Generator Scam. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). :407–416.

We investigate what we call the "Bitcoin Generator Scam" (BGS), a simple system in which the scammers promise to "generate" new bitcoins using the ones that were sent to them. A typical offer will suggest that, for a small fee, one could receive within minutes twice the amount of bitcoins submitted. BGS is clearly not a very sophisticated attack. The modus operandi is simply to put up a web page that shows the address to which the money should be sent, and to wait for the payments. The pages are then indexed by search engines, ready to be found by victims looking for free bitcoins. We describe here a generic system to find and analyze scams such as BGS. We have trained a classifier to detect these pages, and we have a crawler searching for instances using a series of search engines. We then monitor the instances that we find to trace payments and the bitcoin addresses that are being used over time. Unlike most bitcoin-based scam monitoring systems, we do not rely on analyzing transactions on the blockchain to find scam instances. Instead, we proactively find these instances through the web pages advertising the scam. Thus our system is able to find addresses with very few transactions, or even none at all. Indeed, over half of the addresses that eventually received funds were detected before receiving any transactions. The data for this paper was collected over four months, from November 2019 to February 2020. We have found more than 1,300 addresses directly associated with the scam, hosted on over 500 domains. Overall, these addresses have received over 5 million USD, with an average of 47.3 USD per transaction.
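One step a page-based monitor of this kind needs is pulling candidate payment addresses out of crawled page text. The sketch below is an assumption about how that could be done with a regular expression and is not the authors' implementation. The pattern only checks the general shape of legacy (Base58) and Bech32 mainnet addresses and does not verify checksums. The function name and sample text are hypothetical.

```python
import re

# Shape-only patterns (no checksum validation):
#  - legacy Base58 addresses start with 1 or 3 and avoid 0, O, I and l
#  - Bech32 mainnet addresses start with "bc1" and avoid 1, b, i and o afterwards
BITCOIN_ADDRESS = re.compile(
    r"\b(?:[13][a-km-zA-HJ-NP-Z1-9]{25,34}|bc1[ac-hj-np-z02-9]{11,71})\b"
)


def extract_candidate_addresses(page_text):
    """Return address-shaped strings in order of first appearance, without duplicates."""
    seen = []
    for match in BITCOIN_ADDRESS.findall(page_text):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical page text using two publicly documented example addresses.
sample = (
    "Double your coins! Send BTC to 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa "
    "or bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4 and wait 10 minutes. "
    "Again: 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
)
candidates = extract_candidate_addresses(sample)
```

Because extraction works on page content rather than on-chain activity, an address can be recorded before it has received any transaction, which is the property the abstract highlights.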

2021-03-01
Kerim, A., Genc, B..  2020.  Mobile Games Success and Failure: Mining the Hidden Factors. 2020 7th International Conference on Soft Computing & Machine Intelligence (ISCMI). :167–171.
Predicting the success of a mobile game is a prime issue in the game industry. Thousands of games are released each day. However, only a few of them succeed while the majority fail. This work investigates the potential correlation between the success of a mobile game and its specific attributes. More than 17 thousand games were considered. We show that specific game attributes, such as the number of IAPs (In-App Purchases), belonging to the puzzle genre, supporting different languages, and being produced by a mature developer, highly and positively affect the future success of the game. Moreover, we show that releasing the game in July and not including any IAPs seem to be highly associated with the game's failure. Our second main contribution is the proposal of a novel success score metric that reflects multiple objectives, in contrast to evaluating only revenue, average rating, or rating count. We also employ different machine learning models, namely SVM (Support Vector Machine), RF (Random Forest), and Deep Learning (DL), to predict this success score metric of a mobile game given its attributes. The trained models were able to predict this score, as well as the rating average and rating count of a mobile game, with more than 70% accuracy. This prediction can help developers avoid potential disappointments before releasing their game to the market.
Raj, C., Khular, L., Raj, G..  2020.  Clustering Based Incident Handling For Anomaly Detection in Cloud Infrastructures. 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence). :611–616.
Incident Handling for Cloud Infrastructures focuses on how clustering-based and non-clustering-based algorithms can be implemented. Our research focuses on identifying anomalies and suspicious activities that might happen inside a Cloud Infrastructure, using available datasets. A brief study has been conducted in which a network statistics dataset, NSL-KDD, was chosen as the model to work on, so that it mirrors the Cloud Infrastructure and its components. An important aspect of cloud security is to implement anomaly detection mechanisms in order to monitor the incidents that inhibit the development and the efficiency of the cloud. Several methods help achieve this goal, among them: applying the Local Outlier Factor algorithm to cancel the noise created by irrelevant data points; applying the DBSCAN algorithm, which can detect less dense areas, in order to identify their cause of clustering; applying the K-Means algorithm to generate positive and negative clusters and identify the anomalous clusters; and applying the Isolation Forest algorithm to implement a decision-based approach to detecting anomalies. The best algorithm would help in finding and fixing the anomalies efficiently and would help us in developing an Incident Handling model for the Cloud.
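The K-Means step described above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the authors' pipeline: the two-feature records, the choice of initial centroids, and the rule that the smaller cluster is the anomalous one are all assumptions made for the example.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    then move every centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda c: math.dist(p, centroids[c])
            )
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members
            else centroids[i]  # keep an empty cluster's centroid in place
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical scaled connection records: (duration, bytes transferred).
records = [
    (1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0), (1.2, 1.1),
    (8.0, 9.0), (8.5, 9.5),
]

# Seed with the first and last record so the run is deterministic.
centroids, clusters = kmeans(records, [records[0], records[-1]])

# Assumption: with two clusters, the smaller one holds the anomalous records.
anomalous = min(clusters, key=len)
```

On real traffic data such as NSL-KDD the features would first need encoding and scaling, and the number of clusters and the rule for labeling a cluster as anomalous would have to be validated against known labels.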
2021-02-23
Park, S. H., Park, H. J., Choi, Y..  2020.  RNN-based Prediction for Network Intrusion Detection. 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC). :572–574.
We investigate a prediction model using RNN for network intrusion detection in industrial IoT environments. For intrusion detection, we use anomaly detection methods that estimate the next packet and then measure and score its distance from the real packet to determine whether the packet is normal or abnormal. When packets were learned by the LSTM model, two-gram and sliding-window N-gram inputs showed the best performance in terms of error, and the LSTM model performed best compared with other data mining regression techniques. Finally, cosine similarity was used as a scoring function, and anomaly detection was performed by setting a cosine similarity boundary within which a packet is considered normal.
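The scoring step can be sketched as follows: compare the predicted packet vector with the observed one using cosine similarity, and flag the packet when the similarity falls below a boundary. The vectors, the function names, and the boundary value of 0.9 are hypothetical. The abstract does not state the feature encoding or the boundary the authors used, and the prediction model itself is out of scope here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_anomalous(predicted, observed, boundary=0.9):
    """Flag a packet whose similarity to the prediction falls below the boundary.

    The default boundary is an illustrative assumption, not a value from the paper.
    """
    return cosine_similarity(predicted, observed) < boundary


# Hypothetical feature vectors for one predicted packet and two observed packets.
predicted = [1.0, 2.0, 3.0]
normal_packet = [1.0, 2.0, 3.1]
odd_packet = [3.0, -1.0, 0.0]

flags = [is_anomalous(predicted, p) for p in (normal_packet, odd_packet)]
```

In practice the boundary would be tuned on held-out normal traffic so that the false alarm rate stays acceptable.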