Alkawaz, Mohammed Hazim, Joanne Steven, Stephanie, Mohammad, Omar Farook, Gapar Md Johar, Md.
2022.
Identification and Analysis of Phishing Website based on Machine Learning Methods. 2022 IEEE 12th Symposium on Computer Applications & Industrial Electronics (ISCAIE). :246–251.
People are increasingly sharing their details online as internet usage grows. Therefore, fraudsters have access to a massive amount of information and financial activities. The attackers create web pages that seem like reputable sites and transmit the malevolent content to victims to get them to provide subtle information. Prevailing phishing security measures are inadequate for detecting new phishing assaults. To accomplish this aim, objective to meet for this research is to analyses and compare phishing website and legitimate by analyzing the data collected from open-source platforms through a survey. Another objective for this research is to propose a method to detect fake sites using Decision Tree and Random Forest approaches. Microsoft Form has been utilized to carry out the survey with 30 participants. Majority of the participants have poor awareness and phishing attack and does not obverse the features of interface before accessing the search browser. With the data collection, this survey supports the purpose of identifying the best phishing website detection where Decision Tree and Random Forest were trained and tested. In achieving high number of feature importance detection and accuracy rate, the result demonstrates that Random Forest has the best performance in phishing website detection compared to Decision Tree.
Syafiq Rohmat Rose, M. Amir, Basir, Nurlida, Nabila Rafie Heng, Nur Fatin, Juana Mohd Zaizi, Nurzi, Saudi, Madihah Mohd.
2022.
Phishing Detection and Prevention using Chrome Extension. 2022 10th International Symposium on Digital Forensics and Security (ISDFS). :1–6.
During pandemic COVID-19 outbreaks, number of cyber-attacks including phishing activities have increased tremendously. Nowadays many technical solutions on phishing detection were developed, however these approaches were either unsuccessful or unable to identify phishing pages and detect malicious codes efficiently. One of the downside is due to poor detection accuracy and low adaptability to new phishing connections. Another reason behind the unsuccessful anti-phishing solutions is an arbitrary selected URL-based classification features which may produce false results to the detection. Therefore, in this work, an intelligent phishing detection and prevention model is designed. The proposed model employs a self-destruct detection algorithm in which, machine learning, especially supervised learning algorithm was used. All employed rules in algorithm will focus on URL-based web characteristic, which attackers rely upon to redirect the victims to the simulated sites. A dataset from various sources such as Phish Tank and UCI Machine Learning repository were used and the testing was conducted in a controlled lab environment. As a result, a chrome extension phishing detection were developed based on the proposed model to help in preventing phishing attacks with an appropriate countermeasure and keep users aware of phishing while visiting illegitimate websites. It is believed that this smart phishing detection and prevention model able to prevent fraud and spam websites and lessen the cyber-crime and cyber-crisis that arise from year to year.
Philomina, Josna, Fahim Fathima, K A, Gayathri, S, Elias, Glory Elizabeth, Menon, Abhinaya A.
2022.
A comparitative study of machine learning models for the detection of Phishing Websites. 2022 International Conference on Computing, Communication, Security and Intelligent Systems (IC3SIS). :1–7.
Global cybersecurity threats have grown as a result of the evolving digital transformation. Cybercriminals have more opportunities as a result of digitization. Initially, cyberthreats take the form of phishing in order to gain confidential user credentials.As cyber-attacks get more sophisticated and sophisticated, the cybersecurity industry is faced with the problem of utilising cutting-edge technology and techniques to combat the ever-present hostile threats. Hackers use phishing to persuade customers to grant them access to a company’s digital assets and networks. As technology progressed, phishing attempts became more sophisticated, necessitating the development of tools to detect phishing.Machine learning is unsupervised one of the most powerful weapons in the fight against terrorist threats. The features used for phishing detection, as well as the approaches employed with machine learning, are discussed in this study.In this light, the study’s major goal is to propose a unique, robust ensemble machine learning model architecture that gives the highest prediction accuracy with the lowest error rate, while also recommending a few alternative robust machine learning models.Finally, the Random forest algorithm attained a maximum accuracy of 96.454 percent. But by implementing a hybrid model including the 3 classifiers- Decision Trees,Random forest, Gradient boosting classifiers, the accuracy increases to 98.4 percent.
Rettlinger, Sebastian, Knaus, Bastian, Wieczorek, Florian, Ivakko, Nikolas, Hanisch, Simon, Nguyen, Giang T., Strufe, Thorsten, Fitzek, Frank H. P..
2022.
MPER - a Motion Profiling Experiment and Research system for human body movement. 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops). :88–90.
State-of-the-art approaches in gait analysis usually rely on one isolated tracking system, generating insufficient data for complex use cases such as sports, rehabilitation, and MedTech. We address the opportunity to comprehensively understand human motion by a novel data model combining several motion-tracking methods. The model aggregates pose estimation by captured videos and EMG and EIT sensor data synchronously to gain insights into muscle activities. Our demonstration with biceps curl and sitting/standing pose generates time-synchronous data and delivers insights into our experiment’s usability, advantages, and challenges.
Halisdemir, Maj. Emre, Karacan, Hacer, Pihelgas, Mauno, Lepik, Toomas, Cho, Sungbaek.
2022.
Data Quality Problem in AI-Based Network Intrusion Detection Systems Studies and a Solution Proposal. 2022 14th International Conference on Cyber Conflict: Keep Moving! (CyCon). 700:367–383.
Network Intrusion Detection Systems (IDSs) have been used to increase the level of network security for many years. The main purpose of such systems is to detect and block malicious activity in the network traffic. Researchers have been improving the performance of IDS technology for decades by applying various machine-learning techniques. From the perspective of academia, obtaining a quality dataset (i.e. a sufficient amount of captured network packets that contain both malicious and normal traffic) to support machine learning approaches has always been a challenge. There are many datasets publicly available for research purposes, including NSL-KDD, KDDCUP 99, CICIDS 2017 and UNSWNB15. However, these datasets are becoming obsolete over time and may no longer be adequate or valid to model and validate IDSs against state-of-the-art attack techniques. As attack techniques are continuously evolving, datasets used to develop and test IDSs also need to be kept up to date. Proven performance of an IDS tested on old attack patterns does not necessarily mean it will perform well against new patterns. Moreover, existing datasets may lack certain data fields or attributes necessary to analyse some of the new attack techniques. In this paper, we argue that academia needs up-to-date high-quality datasets. We compare publicly available datasets and suggest a way to provide up-to-date high-quality datasets for researchers and the security industry. The proposed solution is to utilize the network traffic captured from the Locked Shields exercise, one of the world’s largest live-fire international cyber defence exercises held annually by the NATO CCDCOE. During this three-day exercise, red team members consisting of dozens of white hackers selected by the governments of over 20 participating countries attempt to infiltrate the networks of over 20 blue teams, who are tasked to defend a fictional country called Berylia. After the exercise, network packets captured from each blue team’s network are handed over to each team. However, the countries are not willing to disclose the packet capture (PCAP) files to the public since these files contain specific information that could reveal how a particular nation might react to certain types of cyberattacks. To overcome this problem, we propose to create a dedicated virtual team, capture all the traffic from this team’s network, and disclose it to the public so that academia can use it for unclassified research and studies. In this way, the organizers of Locked Shields can effectively contribute to the advancement of future artificial intelligence (AI) enabled security solutions by providing annual datasets of up-to-date attack patterns.
ISSN: 2325-5374
Nelson, Jared Ray, Shekaramiz, Mohammad.
2022.
Authorship Verification via Linear Correlation Methods of n-gram and Syntax Metrics. 2022 Intermountain Engineering, Technology and Computing (IETC). :1–6.
This research evaluates the accuracy of two methods of authorship prediction: syntactical analysis and n-gram, and explores its potential usage. The proposed algorithm measures n-gram, and counts adjectives, adverbs, verbs, nouns, punctuation, and sentence length from the training data, and normalizes each metric. The proposed algorithm compares the metrics of training samples to testing samples and predicts authorship based on the correlation they share for each metric. The severity of correlation between the testing and training data produces significant weight in the decision-making process. For example, if analysis of one metric approximates 100% positive correlation, the weight in the decision is assigned a maximum value for that metric. Conversely, a 100% negative correlation receives the minimum value. This new method of authorship validation holds promise for future innovation in fraud protection, the study of historical documents, and maintaining integrity within academia.
Samuel, Henry D, Kumar, M Santhanam, Aishwarya, R., Mathivanan, G..
2022.
Automation Detection of Malware and Stenographical Content using Machine Learning. 2022 6th International Conference on Computing Methodologies and Communication (ICCMC). :889–894.
In recent times, the occurrence of malware attacks are increasing at an unprecedented rate. Particularly, the image-based malware attacks are spreading worldwide and many people get harmful malware-based images through the technique called steganography. In the existing system, only open malware and files from the internet can be identified. However, the image-based malware cannot be identified and detected. As a result, so many phishers make use of this technique and exploit the target. Social media platforms would be totally harmful to the users. To avoid these difficulties, Machine learning can be implemented to find the steganographic malware images (contents). The proposed methodology performs an automatic detection of malware and steganographic content by using Machine Learning. Steganography is used to hide messages from apparently innocuous media (e.g., images), and steganalysis is the approach used for detecting this malware. This research work proposes a machine learning (ML) approach to perform steganalysis. In the existing system, only open malware and files from the internet are identified but in the recent times many people get harmful malware-based images through the technique called steganography. Social media platforms would be totally harmful to the users. To avoid these difficulties, the proposed Machine learning has been developed to appropriately detect the steganographic malware images (contents). Father, the steganalysis method using machine learning has been developed for performing logistic classification. By using this, the users can avoid sharing the malware images in social media platforms like WhatsApp, Facebook without downloading it. It can be also used in all the photo-sharing sites such as google photos.
Chakraborty, Joymallya, Majumder, Suvodeep, Tu, Huy.
2022.
Fair-SSL: Building fair ML Software with less data. 2022 IEEE/ACM International Workshop on Equitable Data & Technology (FairWare). :1–8.
Ethical bias in machine learning models has become a matter of concern in the software engineering community. Most of the prior software engineering works concentrated on finding ethical bias in models rather than fixing it. After finding bias, the next step is mitigation. Prior researchers mainly tried to use supervised approaches to achieve fairness. However, in the real world, getting data with trustworthy ground truth is challenging and also ground truth can contain human bias. Semi-supervised learning is a technique where, incrementally, labeled data is used to generate pseudo-labels for the rest of data (and then all that data is used for model training). In this work, we apply four popular semi-supervised techniques as pseudo-labelers to create fair classification models. Our framework, Fair-SSL, takes a very small amount (10%) of labeled data as input and generates pseudo-labels for the unlabeled data. We then synthetically generate new data points to balance the training data based on class and protected attribute as proposed by Chakraborty et al. in FSE 2021. Finally, classification model is trained on the balanced pseudo-labeled data and validated on test data. After experimenting on ten datasets and three learners, we find that Fair-SSL achieves similar performance as three state-of-the-art bias mitigation algorithms. That said, the clear advantage of Fair-SSL is that it requires only 10% of the labeled training data. To the best of our knowledge, this is the first SE work where semi-supervised techniques are used to fight against ethical bias in SE ML models. To facilitate open science and replication, all our source code and datasets are publicly available at https://github.com/joymallyac/FairSSL. CCS CONCEPTS • Software and its engineering → Software creation and management; • Computing methodologies → Machine learning. ACM Reference Format: Joymallya Chakraborty, Suvodeep Majumder, and Huy Tu. 2022. Fair-SSL: Building fair ML Software with less data. In International Workshop on Equitable Data and Technology (FairWare ‘22), May 9, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3524491.3527305
Praveen, Sivakami, Dcouth, Alysha, Mahesh, A S.
2022.
NoSQL Injection Detection Using Supervised Text Classification. 2022 2nd International Conference on Intelligent Technologies (CONIT). :1–5.
For a long time, SQL injection has been considered one of the most serious security threats. NoSQL databases are becoming increasingly popular as big data and cloud computing technologies progress. NoSQL injection attacks are designed to take advantage of applications that employ NoSQL databases. NoSQL injections can be particularly harmful because they allow unrestricted code execution. In this paper we use supervised learning and natural language processing to construct a model to detect NoSQL injections. Our model is designed to work with MongoDB, CouchDB, CassandraDB, and Couchbase queries. Our model has achieved an F1 score of 0.95 as established by 10-fold cross validation.