Phishing

SoS Newsletter


Phishing remains a primary method of social engineering used to gain access to computers and information. Much research has been done in this area in recent months. The 14 works cited here present research on detection, filtering, and profiling. The first paper was presented at HotSoS 2014, the Symposium and Bootcamp on the Science of Security, a research event held April 8-9, 2014 in Raleigh, North Carolina.

  • Rucha Tembe, Olga Zielinska, Yuqi Liu, Kyung Wha Hong, Emerson Murphy-Hill, Chris Mayhorn and Xi Ge. "Phishing in International Waters: Exploring Cross-National Differences in Phishing Conceptualizations between Chinese, Indian and American Samples" HOT SoS 2014. (ID#:14-1340) Available at: This paper discusses the results of surveying 164 subjects from the United States, India, and China about their experiences with phishing and whether they practiced online safety measures themselves. The study refuted the popular notion that these groups hold largely similar views of phishing attack characteristics, of the types of media where phishing is most prevalent, and of the consequences of phishing. Further, the study determined that age and education had no influence on agreement between subjects on these topics. Among the findings discussed is that both Indian and Chinese participants were less likely than Americans to notice the padlock security icon. The results could inform the design of culturally inclusive defenses against phishing. Keywords: Phishing, cultural differences, nationality, online privacy, India, China, susceptibility
  • Weibo Chu; Zhu, B.B.; Feng Xue; Xiaohong Guan; Zhongmin Cai, "Protect sensitive sites from phishing attacks using features extractable from inaccessible phishing URLs," Communications (ICC), 2013 IEEE International Conference on , vol., no., pp.1990,1994, 9-13 June 2013. (ID#:14-1342) Available at: Phishing is the third cyber-security threat globally and the first cyber-security threat in China. There were 61.69 million phishing victims in China alone from June 2011 to June 2012, with the total annual monetary loss more than 4.64 billion US dollars. These phishing attacks were highly concentrated in targeting at a few major Websites. Many phishing Webpages had a very short life span. In this paper, we assume the Websites to protect against phishing attacks are known, and study the effectiveness of machine learning based phishing detection using only lexical and domain features, which are available even when the phishing Webpages are inaccessible. We propose several novel highly effective features, and use the real phishing attack data against Taobao and Tencent, two main phishing targets in China, in studying the effectiveness of each feature, and each group of features. We then select an optimal set of features in our phishing detector, which has achieved a detection rate better than 98%, with a false positive rate of 0.64% or less. The detector is still effective when the distribution of phishing URLs changes. Keywords: Web sites; computer crime; feature extraction; learning (artificial intelligence); China; Taobao; Tencent; Web sites; cyber-security threat; domain features; inaccessible phishing URL; lexical features; machine learning based phishing detection; phishing Web pages; phishing attack data; sensitive site protection; Detectors; Electronic mail; Feature extraction; Google; Security; Superluminescent diodes; Web sites
  • DeBarr, D.; Ramanathan, V.; Wechsler, H., "Phishing detection using traffic behavior, spectral clustering, and random forests," Intelligence and Security Informatics (ISI), 2013 IEEE International Conference on , vol., no., pp.67,72, 4-7 June 2013. (ID#:14-1343) Available at: Phishing is an attempt to steal a user's identity. This is typically accomplished by sending an email message to a user, with a link directing the user to a web site used to collect personal information. Phishing detection systems typically rely on content filtering techniques, such as Latent Dirichlet Allocation (LDA), to identify phishing messages. In the case of spear phishing, however, this may be ineffective because messages from a trusted source may contain little content. In order to handle such emerging spear phishing behavior, we propose as a first step the use of Spectral Clustering to analyze messages based on traffic behavior. In particular, Spectral Clustering analyzes the links between URL substrings for web sites found in the message contents. Cluster membership is then used to construct a Random Forest classifier for phishing. Data from the Phishing Email Corpus and the Spam Assassin Email Corpus are used to evaluate this approach. Performance evaluation metrics include the Area Under the receiver operating characteristic Curve (AUC), as well as accuracy, precision, recall, and the (harmonic mean) F measure. Performance of the integrated Spectral Clustering and Random Forest approach is found to provide significant improvements in all the metrics listed, compared to a content filtering technique such as LDA coupled with text message deletion done randomly or in an adaptive fashion using adversarial learning. The Spectral Clustering approach is robust against the absence of content. 
In particular, we show that Spectral Clustering yields (99.8%, 97.8%) for (AUC, F measure) compared to LDA that yields (94.6%, 89.4%) and (79.6%, 57.9%) when the content of the messages is reduced to 10% of their original size using random and adversarial deletion, respectively. The difference is most striking at low False Positive (FP) rates. Keywords: Web sites; computer crime; learning (artificial intelligence); pattern classification; pattern clustering; performance evaluation; random processes; unsolicited e-mail; AUC; URL substrings; Web site; adversarial deletion; adversarial learning; area under the receiver operating characteristic curve; cluster membership; email message; false positive rates; integrated spectral clustering; message contents; performance evaluation metrics; personal information collection; phishing detection systems; phishing email corpus; random deletion; random forest classifier; spam assassin email corpus; spear phishing behavior; text message deletion; traffic behavior; trusted source; Electronic mail; Laplace equations; Training; Vegetation; Web servers; Web sites; Latent Dirichlet Allocation; Link Analysis; Phishing; Spear Phishing; Spectral Clustering
  • Hamid, I.R.A.; Abawajy, J.H., "Profiling Phishing Email Based on Clustering Approach," Trust, Security and Privacy in Computing and Communications (TrustCom), 2013 12th IEEE International Conference on , vol., no., pp.628,635, 16-18 July 2013. (ID#:14-1344) Available at: In this paper, an approach for profiling email-born phishing activities is proposed. Profiling phishing activities are useful in determining the activity of an individual or a particular group of phishers. By generating profiles, phishing activities can be well understood and observed. Typically, work in the area of phishing is intended at detection of phishing emails, whereas we concentrate on profiling the phishing email. We formulate the profiling problem as a clustering problem using the various features in the phishing emails as feature vectors. Further, we generate profiles based on clustering predictions. These predictions are further utilized to generate complete profiles of these emails. The performance of the clustering algorithms at the earlier stage is crucial for the effectiveness of this model. We carried out an experimental evaluation to determine the performance of many classification algorithms by incorporating clustering approach in our model. Our proposed profiling email-born phishing algorithm (ProEP) demonstrates promising results with the RatioSize rules for selecting the optimal number of clusters. 
Keywords: electronic mail; pattern classification; pattern clustering; program diagnostics; unsolicited e-mail; ProEP algorithm; RatioSize rules; classification algorithms; clustering approach; clustering predictions; e-mail-borne phishing activity profiling; feature vectors; optimal cluster number selection; performance evaluation; phishing email detection; profiling e-mail-born phishing algorithm; Classification algorithms; Clustering algorithms; Computational modeling; Data models; Electronic mail; Feature extraction; Prediction algorithms; Clustering Algorithm; Phishing; Profiling
  • Shian-Shyong Tseng, Ching-Heng Ku, Ai-Chin Lu, Yuh-Jye Wang, Guang-Gang Geng. "Building a Self-Organizing Phishing Model Based upon Dynamic EMCUD" IIH-MSP '13 Proceedings of the 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing October 2013. (Pages 509-512) (ID#:14-1348) Available at: or In recent years, with the rapid growth of the Internet applications and services, phishing attacks seriously threaten the web security. Due to the versatile and dynamic nature of phishing patterns, the development and maintenance of the anti-phishing prevention system is difficult and costly. Hence, how to acquire and update the phishing knowledge and the phishing model in the anti-phishing detection system become an important issue. In this study, we use the EMCUD (Extended Embedded Meaning Capturing and Uncertainty Deciding) method to build up the phishing attack knowledge according to the identification of phishing attributes. Since users have been aware of some anti-phishing methods, phishers often evolve phishing attack to gain in the environment. The phishing attack knowledge also needs to be dynamically evolved over time. How to systematically evolve the phishing knowledge becomes a major concern of this study. Hence, we use the VODKA (Variant Objects Discovering Knowledge Acquisition) method, a dynamic EMCUD, to evolve existing phishing knowledge. These methods can facilitate the acquisition of new inference rules for the phishing attack knowledge and the observation of the variation and the trend of the phishing attack. In the experiment, 1,762 phishing URLs of the APNOW (Anti-Phishing Notification Window) phishing database of Taiwan have been partitioned into 7 representative phishing cases, and 10 phishing attributes have been obtained by the VODKA method. Finally, we successfully evolve detection rules of phishing models and observe the trend of the phishing attack model to show the feasibility of this study. 
Keywords: (not available)
  • M. Pandey, V. Ravi. "Phishing Detection Using PSOAANN Based One-Class Classifier" ICETET '13 Proceedings of the 2013 6th International Conference on Emerging Trends in Engineering and Technology September 2013 (Pages 148-153) (ID#:14-1349) Available at: or We propose to detect phishing emails and websites using particle swarm optimization (PSO) trained auto associative neural network (PSOAANN), which is employed as one class classifier. PSOAANN achieved better results when compared to previous efforts. In the study, we also developed a new feature selection method based on the weights from input to hidden layers of the PSOAANN. We compared its performance with other methods. Keywords: (not available)
  • Bastian Braun, Martin Johns, Johannes Koestler, Joachim Posegga. "PhishSafe: leveraging modern JavaScript API's for transparent and robust protection" Proceedings of the 4th ACM conference on Data and application security and privacy March 2014. (Pages 61-72) (ID#:14-1350) Available at: or The term "phishing" describes a class of social engineering attacks on authentication systems, that aim to steal the victim's authentication credential, e.g., the username and password. The severity of phishing is recognized since the mid-1990's and a considerable amount of attention has been devoted to the topic. However, currently deployed or proposed countermeasures are either incomplete, cumbersome for the user, or incompatible with standard browser technology. In this paper, we show how modern JavaScript API's can be utilized to build PhishSafe, a robust authentication scheme, that is immune against phishing attacks, easily deployable using the current browser generation, and requires little change in the end-user's interaction with the application. We evaluate the implementation and find that it is applicable to web applications with low efforts and causes no tangible overhead. Keywords: (not available)
  • Philippe De Ryck, Nick Nikiforakis, Lieven Desmet, Wouter Joosen. "TabShots: client-side detection of tabnabbing attacks" Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security May 2013 (Pages 447-456) (ID#:14-1351) Available at: or As the web grows larger and larger and as the browser becomes the vehicle-of-choice for delivering many applications of daily use, the security and privacy of web users is under constant attack. Phishing is as prevalent as ever, with anti-phishing communities reporting thousands of new phishing campaigns each month. In 2010, tabnabbing, a variation of phishing, was introduced. In a tabnabbing attack, an innocuous-looking page, opened in a browser tab, disguises itself as the login page of a popular web application, when the user's focus is on a different tab. The attack exploits the trust of users for already opened pages and the user habit of long-lived browser tabs. To combat this recent attack, we propose TabShots. TabShots is a browser extension that helps browsers and users to remember what each tab looked like, before the user changed tabs. Our system compares the appearance of each tab and highlights the parts that were changed, allowing the user to distinguish between legitimate changes and malicious masquerading. Using an experimental evaluation on the most popular sites of the Internet, we show that TabShots has no impact on 78% of these sites, and very little on another 19%. Thereby, TabShots effectively protects users against tabnabbing attacks without affecting their browsing habits and without breaking legitimate popular sites. Keywords: management of computing and information systems; Security and Protection; Invasive software (e.g., viruses, worms, Trojan horses); information storage and retrieval ; On-line Information Services; Computing Milieux; Authentication
  • Le Xu, Li Li, Vijayakrishnan Nagarajan, Dijiang Huang, Wei-Tek Tsai. "Secure Web Referral Services for Mobile Cloud Computing" SOSE '13 Proceedings of the 2013 IEEE Seventh International Symposium on Service-Oriented System Engineering March 2013 (Pages 584-593) (ID#:14-1352) Available at: or Security has become a major concern for mobile devices when mobile users browse malicious websites. Existing security solutions may rely on human factors to achieve a good result against phishing websites and SSL Strip-based Man-In-The-Middle (MITM) attacks. This paper presents a secure web referral service, called Secure Search Engine (SSE), for mobile devices. The system uses mobile cloud-based virtual computing and provides each user a Virtual Machine (VM) as a personal security proxy through which all Web traffic is redirected. Within the VM, the SSE uses web crawling technology with a set of checking services to validate IP addresses and certificate chains. A Phishing Filter is also used to check given URLs with an optimized execution time. The system also uses private and anonymously shared caches to protect user privacy and improve performance. The evaluation results show that SSE is non-intrusive and consumes no power or computation on the client device, while producing fewer false positives and false negatives than existing web browser-based anti-phishing solutions. Keywords: (not available)
  • Mark Scanlon, M-Tahar Kechadi. "Universal Peer-to-Peer Network Investigation Framework" ARES '13 Proceedings of the 2013 International Conference on Availability, Reliability and Security September 2013 (Pages 694-700). (ID#:14-1353) Available at: or Peer-to-Peer (P2P) networking has fast become a useful technological advancement for a vast range of cyber criminal activities. Cyber crimes from copyright infringement and spamming, to serious, high financial impact crimes, such as fraud, distributed denial of service attacks (DDoS) and phishing can all be aided by applications and systems based on the technology. The requirement for investigating P2P based systems is not limited to the more well-known cyber crimes listed above, as many more legitimate P2P based applications may also be pertinent to a digital forensic investigation, e.g., VoIP and instant messaging communications, etc. Investigating these networks has become increasingly difficult due to the broad range of network topologies and the ever increasing and evolving range of P2P based applications. This paper introduces the Universal Peer-to-Peer Network Investigation Framework (UP2PNIF), a framework which enables significantly faster and less labor intensive investigation of newly discovered P2P networks through the exploitation of the commonalities in network functionality. In combination with a reference database of known network protocols and characteristics, it is envisioned that any known P2P network can be instantly investigated using the framework. The framework can intelligently determine the best methodology dependent on the focus of the investigation resulting in a significantly expedited evidence gathering process. Keywords: (not available)
  • Chen, Zhen; Han, Fuye; Cao, Junwei; Jiang, Xin; Chen, Shuo, "Cloud computing-based forensic analysis for collaborative network security management system," Tsinghua Science and Technology , vol.18, no.1, pp.40,50, Feb. 2013. (ID#:14-1354) Available at: Internet security problems remain a major challenge with many security concerns such as Internet worms, spam, and phishing attacks. Botnets, well-organized distributed network attacks, consist of a large number of bots that generate huge volumes of spam or launch Distributed Denial of Service (DDoS) attacks on victim hosts. New emerging botnet attacks degrade the status of Internet security further. To address these problems, a practical collaborative network security management system is proposed with an effective collaborative Unified Threat Management (UTM) and traffic probers. A distributed security overlay network with a centralized security center leverages a peer-to-peer communication protocol used in the UTMs collaborative module and connects them virtually to exchange network events and security rules. Security functions for the UTM are retrofitted to share security rules. In this paper, we propose a design and implementation of a cloud-based security center for network security forensic analysis. We propose using cloud storage to keep collected traffic data and then processing it with cloud computing platforms to find the malicious attacks. As a practical example, phishing attack forensic analysis is presented and the required computing and storage resources are evaluated based on real trace data. The cloud-based security center can instruct each collaborative UTM and prober to collect events and raw traffic, send them back for deep analysis, and generate new security rules. These new security rules are enforced by collaborative UTM and the feedback events of such rules are returned to the security center. 
By this type of close-loop control, the collaborative network security management system can identify and address new distributed attacks more quickly and effectively. Keywords: Cloud computing; Collaboration; Collaborative work; Computer crime; Computer security; Digital forensics; Forensics; Network security; Web and internet services; amazon web service; anti-botnet; anti-phishing; cloud computing; collaborative network security system; computer forensics; eucalyptus; hadoop file system; overlay network
  • Min-Sheng Lin; Chien-Yi Chiu; Yuh-Jye Lee; Hsing-Kuo Pao, "Malicious URL filtering -- A big data application," Big Data, 2013 IEEE International Conference on , vol., no., pp.589,596, 6-9 Oct. 2013. (ID#:14-1355) Available at: Malicious URLs have become a channel for Internet criminal activities such as drive-by-download, spamming and phishing. Applications for the detection of malicious URLs are accurate but slow (because they need to download the content or query some Internet host information). In this paper we present a novel lightweight filter based only on the URL string itself to use before existing processing methods. We run experiments on a large dataset and demonstrate a 75% reduction in workload size while retaining at least 90% of malicious URLs. Existing methods do not scale well with the hundreds of millions of URLs encountered every day as the problem is a heavily-imbalanced, large-scale binary classification problem. Our proposed method is able to handle nearly two million URLs in less than five minutes. We generate two filtering models by using lexical features and descriptive features, and then combine the filtering results. The on-line learning algorithms are applied here not only for dealing with large-scale data sets but also for fitting the very short lifetime characteristics of malicious URLs. Our filter can significantly reduce the volume of URL queries on which further analysis needs to be performed, saving both computing time and bandwidth used for content retrieval. 
Keywords: Internet; computer crime; learning (artificial intelligence); pattern classification; query processing; Internet criminal activities; URL queries; URL string; big data application; content retrieval; drive-by-download; heavily-imbalanced large-scale binary classification problem; lifetime characteristics; lightweight filter; malicious URL filtering; on-line learning algorithms; phishing; spamming; Dictionaries; Feature extraction; IP networks; Prediction algorithms; Predictive models; Training; Web sites; Data Mining; Information Filtering; Information Security; Machine learning
  • Smitha, A.; Manohara Pai, M.M.; Ajam, N.; Mouzna, J., "An optimized adaptive algorithm for authentication of safety critical messages in VANET," Communications and Networking in China (CHINACOM), 2013 8th International ICST Conference on , vol., no., pp.149,154, 14-16 Aug. 2013. (ID#:14-1356) Available at: Authentication is one of the essential frameworks to ensure safe and secure message dissemination in Vehicular Adhoc Networks (VANETs). But an optimized authentication algorithm with reduced computational overhead is still a challenge. In this paper, we propose a novel classification of safety critical messages and provide an adaptive algorithm for authentication in VANETs using the concept of Merkle tree and Elliptic Curve Digital Signature Algorithm (ECDSA). Here, the Merkle tree is constructed to store the hashed values of public keys at the leaf nodes. This algorithm addresses Denial of Service (DoS) attack, man in the middle attack and phishing attack. Experimental results show that, the algorithm reduces the computational delay by 20 percent compared to existing schemes. Keywords: digital signatures; information dissemination; pattern classification; public key cryptography; telecommunication security; tree data structures; vehicular ad hoc networks; ECDSA; Merkle tree; VANET; computational delay reduction; computational overhead reduction; denial-of-service attack; elliptic curve digital signature algorithm; leaf nodes; man-in-the-middle attack; optimized adaptive algorithm; phishing attack; public keys; safe message dissemination; safety critical message authentication; safety critical message classification; secure message dissemination; vehicular adhoc networks; Authentication; Computer crime; Public key; Receivers; Safety; Vehicles; Vehicular ad hoc networks; DoS attack; ECDSA; Entity Authentication; Merkle tree; Non repudiation
  • Alarifi, A.; Alsaleh, M.; Al-Salman, A.-M., "Security analysis of top visited Arabic Web sites," Advanced Communication Technology (ICACT), 2013 15th International Conference on , vol., no., pp.173,178, 27-30 Jan. 2013. (ID#:14-1358) Available at: The richness and effectiveness of client-side vulnerabilities contributed to an accelerated shift toward client-side Web attacks. In order to understand the volume and nature of such malicious Web pages, we perform a detailed analysis of a subset of top visited Web sites using Google Trends. Our study is limited to the Arabic content in the Web and thus only the top Arabic searching terms are considered. To carry out this study, we analyze more than 7,000 distinct domain names by traversing all the visible pages within each domain. To identify different types of suspected phishing and malware pages, we use the API of Sucuri SiteCheck, McAfee SiteAdvisor, Google Safe Browsing, Norton, and AVG website scanners. The study shows the existence of malicious contents across a variety of types of Web pages. The results indicate that a significant number of these sites carry some known malware, are in a blacklisting status, or have some out-of-date software. Throughout our analysis, we characterize the impact of the detected malware families and speculate as to how the reported positive Web servers got infected. Keywords: web sites; client-server systems; information retrieval; invasive software; API; AVG Website scanner; Arabic content; Arabic searching terms; Google Safe Browsing; Google Trends; McAfee SiteAdvisor; Norton; Sucuri SiteCheck; blacklisting status; client-side Web attack; client-side vulnerability; distinct domain name; infection; malicious Web page; malicious content; malware family detection; malware page; out-of-date software; positive Web server; security analysis; suspected phishing page; top visited Arabic Web sites; visible page; Malicious links; Malware; Search engine spam; Web spam; Web vulnerabilities
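
Two of the entries above (Chu et al. and Lin et al.) describe detectors that rely only on features computable from the URL string itself, so that they still work when the phishing page is no longer accessible. The following is a minimal sketch of what such lexical features might look like. The function names and the particular features chosen are illustrative assumptions and are not the feature sets used in the cited papers.

```python
from urllib.parse import urlsplit


def _is_ipv4(host: str) -> bool:
    """Return True if host looks like a dotted-quad IPv4 address."""
    parts = host.split(".")
    return len(parts) == 4 and all(p.isdigit() and 0 <= int(p) <= 255 for p in parts)


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    Hypothetical feature set for demonstration only; a real detector
    would feed features like these into a trained classifier.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "host_is_ipv4": _is_ipv4(host),
        # Number of non-empty path segments, e.g. /a/b.php -> 2
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
    }


if __name__ == "__main__":
    for u in ("http://192.168.0.1/login/verify.php?id=7",
              "http://secure-login.example.com/a"):
        print(u, lexical_features(u))
```

None of these features requires fetching the page, which is the property both abstracts emphasize.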

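The VANET entry above (Smitha et al.) mentions a Merkle tree whose leaf nodes store the hashed values of public keys. As a rough illustration of that data structure, the sketch below computes a Merkle root over a list of public keys. The use of SHA-256 and the rule of duplicating the last node on odd-sized levels are assumptions made for this example; the abstract does not specify either.

```python
import hashlib
from typing import Sequence


def sha256(data: bytes) -> bytes:
    """SHA-256 digest of data (hash choice is an assumption)."""
    return hashlib.sha256(data).digest()


def merkle_root(public_keys: Sequence[bytes]) -> bytes:
    """Return the Merkle root of a non-empty sequence of public keys.

    Leaves are the hashes of the keys. Each parent is the hash of its
    two children concatenated. When a level has an odd number of nodes,
    the last node is duplicated (one common convention, assumed here).
    """
    if not public_keys:
        raise ValueError("at least one public key is required")
    level = [sha256(pk) for pk in public_keys]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    keys = [b"pubkey-A", b"pubkey-B", b"pubkey-C"]  # placeholder byte strings
    print(merkle_root(keys).hex())
```

A receiver holding only the root can check a single key against it using the sibling hashes along the path to the root, which is what makes the structure attractive when computational overhead matters.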

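Several abstracts above report precision, recall, and the F measure, which DeBarr et al. describe as a harmonic mean. A short sketch of how these metrics are computed from raw counts follows. The function name and the convention of returning 0.0 when a denominator is zero are assumptions made for this example.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    """Compute (precision, recall, F1) from true positives, false
    positives, and false negatives.

    F1 is the harmonic mean of precision and recall. Undefined ratios
    (zero denominators) are reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Example counts are made up for illustration.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```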
Articles listed on these pages have been found on publicly available internet pages and are cited with links to those pages. Some of the information included herein has been reprinted with permission from the authors or data repositories. Direct any requests via Email to SoS.Project (at) for removal of the links or modifications to specific citations. Please include the ID# of the specific citation in your correspondence.