Visible to the public Biblio

Filters: Keyword is phishing  [Clear All Filters]
2020-05-18
Thejaswini, S, Indupriya, C.  2019.  Big Data Security Issues and Natural Language Processing. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI). :1307–1312.
Whenever we talk about big data, the concern is always about the security of the data. In recent days the most heard about technology is the Natural Language Processing. This new and trending technology helps in solving the ever ending security problems which are not completely solved using big data. Starting with the big data security issues, this paper deals with addressing the topics related to cyber security and information security using the Natural Language Processing technology. Including the well-known cyber-attacks such as phishing identification and spam detection, this paper also addresses issues on information assurance and security such as detection of Advanced Persistent Threat (APT) in DNS and vulnerability analysis. The goal of this paper is to provide the overview of how natural language processing can be used to address cyber security issues.
Peng, Tianrui, Harris, Ian, Sawa, Yuki.  2018.  Detecting Phishing Attacks Using Natural Language Processing and Machine Learning. 2018 IEEE 12th International Conference on Semantic Computing (ICSC). :300–301.
Phishing attacks are one of the most common and least defended security threats today. We present an approach which uses natural language processing techniques to analyze text and detect inappropriate statements which are indicative of phishing attacks. Our approach is novel compared to previous work because it focuses on the natural language text contained in the attack, performing semantic analysis of the text to detect malicious intent. To demonstrate the effectiveness of our approach, we have evaluated it using a large benchmark set of phishing emails.
2020-04-17
Oest, Adam, Safaei, Yeganeh, Doupé, Adam, Ahn, Gail-Joon, Wardman, Brad, Tyers, Kevin.  2019.  PhishFarm: A Scalable Framework for Measuring the Effectiveness of Evasion Techniques against Browser Phishing Blacklists. 2019 IEEE Symposium on Security and Privacy (SP). :1344—1361.
Phishing attacks have reached record volumes in recent years. Simultaneously, modern phishing websites are growing in sophistication by employing diverse cloaking techniques to avoid detection by security infrastructure. In this paper, we present PhishFarm: a scalable framework for methodically testing the resilience of anti-phishing entities and browser blacklists to attackers' evasion efforts. We use PhishFarm to deploy 2,380 live phishing sites (on new, unique, and previously-unseen .com domains) each using one of six different HTTP request filters based on real phishing kits. We reported subsets of these sites to 10 distinct anti-phishing entities and measured both the occurrence and timeliness of native blacklisting in major web browsers to gauge the effectiveness of protection ultimately extended to victim users and organizations. Our experiments revealed shortcomings in current infrastructure, which allows some phishing sites to go unnoticed by the security community while remaining accessible to victims. We found that simple cloaking techniques representative of real-world attacks- including those based on geolocation, device type, or JavaScript- were effective in reducing the likelihood of blacklisting by over 55% on average. We also discovered that blacklisting did not function as intended in popular mobile browsers (Chrome, Safari, and Firefox), which left users of these browsers particularly vulnerable to phishing attacks. Following disclosure of our findings, anti-phishing entities are now better able to detect and mitigate several cloaking techniques (including those that target mobile users), and blacklisting has also become more consistent between desktop and mobile platforms- but work remains to be done by anti-phishing entities to ensure users are adequately protected. Our PhishFarm framework is designed for continuous monitoring of the ecosystem and can be extended to test future state-of-the-art evasion techniques used by malicious websites.
2020-04-10
Kikuchi, Masato, Okubo, Takao.  2019.  Power of Communication Behind Extreme Cybersecurity Incidents. 2019 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :315—319.
There are increasing threats for cyberspace. This paper tries to identify how extreme cybersecurity incidents occur based on the scenario of a targeted attack through emails. Knowledge on how extreme cybersecurity incidents occur helps in identifying the key points on how they can be prevented from occurring. The model based on system thinking approach to the understanding how communication influences entities and how tiny initiating events scale up into extreme events provides a condensed figure of the cyberspace and surrounding threats. By taking cyberspace layers and characteristics of cyberspace identified by this model into consideration, it predicts most suitable risk mitigations.
Robic-Butez, Pierrick, Win, Thu Yein.  2019.  Detection of Phishing websites using Generative Adversarial Network. 2019 IEEE International Conference on Big Data (Big Data). :3216—3221.
Phishing is typically deployed as an attack vector in the initial stages of a hacking endeavour. Due to it low-risk rightreward nature it has seen a widespread adoption, and detecting it has become a challenge in recent times. This paper proposes a novel means of detecting phishing websites using a Generative Adversarial Network. Taking into account the internal structure and external metadata of a website, the proposed approach uses a generator network which generates both legitimate as well as synthetic phishing features to train a discriminator network. The latter then determines if the features are either normal or phishing websites, before improving its detection accuracy based on the classification error. The proposed approach is evaluated using two different phishing datasets and is found to achieve a detection accuracy of up to 94%.
Huang, Yongjie, Qin, Jinghui, Wen, Wushao.  2019.  Phishing URL Detection Via Capsule-Based Neural Network. 2019 IEEE 13th International Conference on Anti-counterfeiting, Security, and Identification (ASID). :22—26.
As a cyber attack which leverages social engineering and other sophisticated techniques to steal sensitive information from users, phishing attack has been a critical threat to cyber security for a long time. Although researchers have proposed lots of countermeasures, phishing criminals figure out circumventions eventually since such countermeasures require substantial manual feature engineering and can not detect newly emerging phishing attacks well enough, which makes developing an efficient and effective phishing detection method an urgent need. In this work, we propose a novel phishing website detection approach by detecting the Uniform Resource Locator (URL) of a website, which is proved to be an effective and efficient detection approach. To be specific, our novel capsule-based neural network mainly includes several parallel branches wherein one convolutional layer extracts shallow features from URLs and the subsequent two capsule layers generate accurate feature representations of URLs from the shallow features and discriminate the legitimacy of URLs. The final output of our approach is obtained by averaging the outputs of all branches. Extensive experiments on a validated dataset collected from the Internet demonstrate that our approach can achieve competitive performance against other state-of-the-art detection methods while maintaining a tolerable time overhead.
Yadollahi, Mohammad Mehdi, Shoeleh, Farzaneh, Serkani, Elham, Madani, Afsaneh, Gharaee, Hossein.  2019.  An Adaptive Machine Learning Based Approach for Phishing Detection Using Hybrid Features. 2019 5th International Conference on Web Research (ICWR). :281—286.
Nowadays, phishing is one of the most usual web threats with regards to the significant growth of the World Wide Web in volume over time. Phishing attackers always use new (zero-day) and sophisticated techniques to deceive online customers. Hence, it is necessary that the anti-phishing system be real-time and fast and also leverages from an intelligent phishing detection solution. Here, we develop a reliable detection system which can adaptively match the changing environment and phishing websites. Our method is an online and feature-rich machine learning technique to discriminate the phishing and legitimate websites. Since the proposed approach extracts different types of discriminative features from URLs and webpages source code, it is an entirely client-side solution and does not require any service from the third-party. The experimental results highlight the robustness and competitiveness of our anti-phishing system to distinguish the phishing and legitimate websites.
Bagui, Sikha, Nandi, Debarghya, Bagui, Subhash, White, Robert Jamie.  2019.  Classifying Phishing Email Using Machine Learning and Deep Learning. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1—2.
In this work, we applied deep semantic analysis, and machine learning and deep learning techniques, to capture inherent characteristics of email text, and classify emails as phishing or non -phishing.
Ikhsan, Mukhammad Gufron, Ramli, Kalamullah.  2019.  Measuring the Information Security Awareness Level of Government Employees Through Phishing Assessment. 2019 34th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC). :1—4.
As an important institutional element, government information security is not only related to technical issues but also to human resources. Various types of information security instruments in an institution cannot provide maximum protection as long as employees still have a low level of information security awareness. This study aims to measure the level of information security awareness of government employees through case studies at the Directorate General of ABC (DG ABC) in Indonesia. This study used two methods, behavior approach through phishing simulation and knowledge approach through a questionnaire on a Likert scale. The simulation results were analyzed on a percentage scale and compared to the results of the questionnaire to determine the level of employees' information security awareness and determine which method was the best. Results show a significant relationship between the simulation results and the questionnaire results. Among the employees who opened the email, 69% clicked on the link that led to the camouflage page and through the questionnaire, it was found that the information security awareness level of DG ABC employees was at the level of 79.32% which was the lower limit of the GOOD category.
Baral, Gitanjali, Arachchilage, Nalin Asanka Gamagedara.  2019.  Building Confidence not to be Phished Through a Gamified Approach: Conceptualising User's Self-Efficacy in Phishing Threat Avoidance Behaviour. 2019 Cybersecurity and Cyberforensics Conference (CCC). :102—110.
Phishing attacks are prevalent and humans are central to this online identity theft attack, which aims to steal victims' sensitive and personal information such as username, password, and online banking details. There are many antiphishing tools developed to thwart against phishing attacks. Since humans are the weakest link in phishing, it is important to educate them to detect and avoid phishing attacks. One can argue self-efficacy is one of the most important determinants of individual's motivation in phishing threat avoidance behaviour, which has co-relation with knowledge. The proposed research endeavours on the user's self-efficacy in order to enhance the individual's phishing threat avoidance behaviour through their motivation. Using social cognitive theory, we explored that various knowledge attributes such as observational (vicarious) knowledge, heuristic knowledge and structural knowledge contributes immensely towards the individual's self-efficacy to enhance phishing threat prevention behaviour. A theoretical framework is then developed depicting the mechanism that links knowledge attributes, self-efficacy, threat avoidance motivation that leads to users' threat avoidance behaviour. Finally, a gaming prototype is designed incorporating the knowledge elements identified in this research that aimed to enhance individual's self-efficacy in phishing threat avoidance behaviour.
[Anonymous].  2019.  PhishFarm: A Scalable Framework for Measuring the Effectiveness of Evasion Techniques against Browser Phishing Blacklists - IEEE Conference Publication.

Phishing attacks have reached record volumes in recent years. Simultaneously, modern phishing websites are growing in sophistication by employing diverse cloaking techniques to avoid detection by security infrastructure. In this paper, we present PhishFarm: a scalable framework for methodically testing the resilience of anti-phishing entities and browser blacklists to attackers' evasion efforts. We use PhishFarm to deploy 2,380 live phishing sites (on new, unique, and previously-unseen .com domains) each using one of six different HTTP request filters based on real phishing kits. We reported subsets of these sites to 10 distinct anti-phishing entities and measured both the occurrence and timeliness of native blacklisting in major web browsers to gauge the effectiveness of protection ultimately extended to victim users and organizations. Our experiments revealed shortcomings in current infrastructure, which allows some phishing sites to go unnoticed by the security community while remaining accessible to victims. We found that simple cloaking techniques representative of real-world attacks- including those based on geolocation, device type, or JavaScript- were effective in reducing the likelihood of blacklisting by over 55% on average. We also discovered that blacklisting did not function as intended in popular mobile browsers (Chrome, Safari, and Firefox), which left users of these browsers particularly vulnerable to phishing attacks. Following disclosure of our findings, anti-phishing entities are now better able to detect and mitigate several cloaking techniques (including those that target mobile users), and blacklisting has also become more consistent between desktop and mobile platforms- but work remains to be done by anti-phishing entities to ensure users are adequately protected. Our PhishFarm framework is designed for continuous monitoring of the ecosystem and can be extended to test future state-of-the-art evasion techniques used by malicious websites.

Chapla, Happy, Kotak, Riddhi, Joiser, Mittal.  2019.  A Machine Learning Approach for URL Based Web Phishing Using Fuzzy Logic as Classifier. 2019 International Conference on Communication and Electronics Systems (ICCES). :383—388.
Phishing is the major problem of the internet era. In this era of internet the security of our data in web is gaining an increasing importance. Phishing is one of the most harmful ways to unknowingly access the credential information like username, password or account number from the users. Users are not aware of this type of attack and later they will also become a part of the phishing attacks. It may be the losses of financial found, personal information, reputation of brand name or trust of brand. So the detection of phishing site is necessary. In this paper we design a framework of phishing detection using URL.
2020-03-09
Nathezhtha, T., Sangeetha, D., Vaidehi, V..  2019.  WC-PAD: Web Crawling based Phishing Attack Detection. 2019 International Carnahan Conference on Security Technology (ICCST). :1–6.
Phishing is a criminal offense which involves theft of user's sensitive data. The phishing websites target individuals, organizations, the cloud storage hosting sites and government websites. Currently, hardware based approaches for anti-phishing is widely used but due to the cost and operational factors software based approaches are preferred. The existing phishing detection approaches fails to provide solution to problem like zero-day phishing website attacks. To overcome these issues and precisely detect phishing occurrence a three phase attack detection named as Web Crawler based Phishing Attack Detector(WC-PAD) has been proposed. It takes the web traffics, web content and Uniform Resource Locator(URL) as input features, based on these features classification of phishing and non phishing websites are done. The experimental analysis of the proposed WC-PAD is done with datasets collected from real phishing cases. From the experimental results, it is found that the proposed WC-PAD gives 98.9% accuracy in both phishing and zero-day phishing attack detection.
2020-03-02
Dauterman, Emma, Corrigan-Gibbs, Henry, Mazières, David, Boneh, Dan, Rizzo, Dominic.  2019.  True2F: Backdoor-Resistant Authentication Tokens. 2019 IEEE Symposium on Security and Privacy (SP). :398–416.
We present True2F, a system for second-factor authentication that provides the benefits of conventional authentication tokens in the face of phishing and software compromise, while also providing strong protection against token faults and backdoors. To do so, we develop new lightweight two-party protocols for generating cryptographic keys and ECDSA signatures, and we implement new privacy defenses to prevent cross-origin token-fingerprinting attacks. To facilitate real-world deployment, our system is backwards-compatible with today's U2F-enabled web services and runs on commodity hardware tokens after a firmware modification. A True2F-protected authentication takes just 57ms to complete on the token, compared with 23ms for unprotected U2F.
Babkin, Sergey, Epishkina, Anna.  2019.  Authentication Protocols Based on One-Time Passwords. 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :1794–1798.
Nowadays one-time passwords are used in a lot of areas of information technologies including e-commerce. A few vulnerabilities in authentication protocols based on one-time passwords are widely known. In current work, we analyze authentication protocols based on one-time passwords and their vulnerabilities. Both simple and complicated protocols which are implementing cryptographic algorithms are reviewed. For example, an analysis of relatively old Lamport's hash-chain protocol is provided. At the same time, we examine HOTP and TOTP protocols which are actively used nowadays. The main result of the work are conclusions about the security of reviewed protocols based on one-time passwords.
2020-02-26
Bikov, T. D., Iliev, T. B., Mihaylov, Gr. Y., Stoyanov, I. S..  2019.  Phishing in Depth – Modern Methods of Detection and Risk Mitigation. 2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO). :447–450.

Nowadays, everyone is living in a digital world with various of virtual experiences and realities, but all of them may eventually cause real threats in our real world. Some of these threats have been born together with the first electronic mail service. Some of them might be considered as really basic and simple, compared to others that were developed and advanced in time to adapt themselves for the security defense mechanisms of the modern digital world. On a daily basis, more than 238.4 billion emails are sent worldwide, which makes more than 2.7 million emails per second, and these statistics are only from the publicly visible networks. Having that information and considering around 60% and above of all emails as threatening or not legitimate, is more than concerning. Unfortunately, even the modern security measures and systems are not capable to identify and prevent all the fraudulent content that is created and distributed every day. In this paper we will cover the most common attack vectors, involving the already mass email infrastructures, the required contra measures to minimize the impact over the corporate environments and what else should be developed to mitigate the modern sophisticated email attacks.

2020-02-17
Legg, Phil, Blackman, Tim.  2019.  Tools and Techniques for Improving Cyber Situational Awareness of Targeted Phishing Attacks. 2019 International Conference on Cyber Situational Awareness, Data Analytics And Assessment (Cyber SA). :1–4.
Phishing attacks continue to be one of the most common attack vectors used online today to deceive users, such that attackers can obtain unauthorised access or steal sensitive information. Phishing campaigns often vary in their level of sophistication, from mass distribution of generic content, such as delivery notifications, online purchase orders, and claims of winning the lottery, through to bespoke and highly-personalised messages that convincingly impersonate genuine communications (e.g., spearphishing attacks). There is a distinct trade-off here between the scale of an attack versus the effort required to curate content that is likely to convince an individual to carry out an action (typically, clicking a malicious hyperlink). In this short paper, we conduct a preliminary study on a recent realworld incident that strikes a balance between attacking at scale and personalised content. We adopt different visualisation tools and techniques for better assessing the scale and impact of the attack, that can be used both by security professionals to analyse the security incident, but could also be used to inform employees as a form of security awareness and training. We pitched the approach to IT professionals working in information security, who believe this may provide improved awareness of how targeted phishing campaigns can impact an organisation, and could contribute towards a pro-active step of how analysts will examine and mitigate the impact of future attacks across the organisation.
2020-02-10
Eshmawi, Ala', Nair, Suku.  2019.  The Roving Proxy Framewrok for SMS Spam and Phishing Detection. 2019 2nd International Conference on Computer Applications Information Security (ICCAIS). :1–6.

This paper presents the details of the roving proxy framework for SMS spam and SMS phishing (SMishing) detection. The framework aims to protect organizations and enterprises from the danger of SMishing attacks. Feasibility and functionality studies of the framework are presented along with an update process study to define the minimum requirements for the system to adapt with the latest spam and SMishing trends.

2020-01-20
Huang, Yongjie, Yang, Qiping, Qin, Jinghui, Wen, Wushao.  2019.  Phishing URL Detection via CNN and Attention-Based Hierarchical RNN. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :112–119.
Phishing websites have long been a serious threat to cyber security. For decades, many researchers have been devoted to developing novel techniques to detect phishing websites automatically. While state-of-the-art solutions can achieve superior performances, they require substantial manual feature engineering and are not adept at detecting newly emerging phishing attacks. Therefore, developing techniques that can detect phishing websites automatically and handle zero-day phishing attacks swiftly is still an open challenge in this area. In this work, we propose PhishingNet, a deep learning-based approach for timely detection of phishing Uniform Resource Locators (URLs). Specifically, we use a Convolutional Neural Network (CNN) module to extract character-level spatial feature representations of URLs; meanwhile, we employ an attention-based hierarchical Recurrent Neural Network(RNN) module to extract word-level temporal feature representations of URLs. We then fuse these feature representations via a three-layer CNN to build accurate feature representations of URLs, on which we train a phishing URL classifier. Extensive experiments on a verified dataset collected from the Internet demonstrate that the feature representations extracted automatically are conducive to the improvement of the generalization ability of our approach on newly emerging URLs, which makes our approach achieve competitive performance against other state-of-the-art approaches.
2019-11-26
Tian, Ke, Jan, Steve T. K., Hu, Hang, Yao, Danfeng, Wang, Gang.  2018.  Needle in a Haystack: Tracking Down Elite Phishing Domains in the Wild. Proceedings of the Internet Measurement Conference 2018. :429-442.

Today's phishing websites are constantly evolving to deceive users and evade the detection. In this paper, we perform a measurement study on squatting phishing domains where the websites impersonate trusted entities not only at the page content level but also at the web domain level. To search for squatting phishing pages, we scanned five types of squatting domains over 224 million DNS records and identified 657K domains that are likely impersonating 702 popular brands. Then we build a novel machine learning classifier to detect phishing pages from both the web and mobile pages under the squatting domains. A key novelty is that our classifier is built on a careful measurement of evasive behaviors of phishing pages in practice. We introduce new features from visual analysis and optical character recognition (OCR) to overcome the heavy content obfuscation from attackers. In total, we discovered and verified 1,175 squatting phishing pages. We show that these phishing pages are used for various targeted scams, and are highly effective to evade detection. More than 90% of them successfully evaded popular blacklists for at least a month.

Cuzzocrea, Alfredo, Martinelli, Fabio, Mercaldo, Francesco.  2018.  Applying Machine Learning Techniques to Detect and Analyze Web Phishing Attacks. Proceedings of the 20th International Conference on Information Integration and Web-Based Applications & Services. :355-359.

Phishing is a technique aimed to imitate an official websites of any company such as banks, institutes, etc. The purpose of phishing is to theft private and sensitive credentials of users such as password, username or PIN. Phishing detection is a technique to deal with this kind of malicious activity. In this paper we propose a method able to discriminate between web pages aimed to perform phishing attacks and legitimate ones. We exploit state of the art machine learning algorithms in order to build models using indicators that are able to detect phishing activities.

Vrban\v ci\v c, Grega, Fister, Jr., Iztok, Podgorelec, Vili.  2018.  Swarm Intelligence Approaches for Parameter Setting of Deep Learning Neural Network: Case Study on Phishing Websites Classification. Proceedings of the 8th International Conference on Web Intelligence, Mining and Semantics. :9:1-9:8.

In last decades, the web and online services have revolutionized the modern world. However, by increasing our dependence on online services, as a result, online security threats are also increasing rapidly. One of the most common online security threats is a so-called Phishing attack, the purpose of which is to mimic a legitimate website such as online banking, e-commerce or social networking website in order to obtain sensitive data such as user-names, passwords, financial and health-related information from potential victims. The problem of detecting phishing websites has been addressed many times using various methodologies from conventional classifiers to more complex hybrid methods. Recent advancements in deep learning approaches suggested that the classification of phishing websites using deep learning neural networks should outperform the traditional machine learning algorithms. However, the results of utilizing deep neural networks heavily depend on the setting of different learning parameters. In this paper, we propose a swarm intelligence based approach to parameter setting of deep learning neural network. By applying the proposed approach to the classification of phishing websites, we were able to improve their detection when compared to existing algorithms.

Hassanpour, Reza, Dogdu, Erdogan, Choupani, Roya, Goker, Onur, Nazli, Nazli.  2018.  Phishing E-Mail Detection by Using Deep Learning Algorithms. Proceedings of the ACMSE 2018 Conference. :45:1-45:1.

Phishing e-mails are considered as spam e-mails, which aim to collect sensitive personal information about the users via network. Since the main purpose of this behavior is mostly to harm users financially, it is vital to detect these phishing or spam e-mails immediately to prevent unauthorized access to users' vital information. To detect phishing e-mails, using a quicker and robust classification method is important. Considering the billions of e-mails on the Internet, this classification process is supposed to be done in a limited time to analyze the results. In this work, we present some of the early results on the classification of spam email using deep learning and machine methods. We utilize word2vec to represent emails instead of using the popular keyword or other rule-based methods. Vector representations are then fed into a neural network to create a learning model. We have tested our method on an open dataset and found over 96% accuracy levels with the deep learning classification methods in comparison to the standard machine learning algorithms.

Shirazi, Hossein, Bezawada, Bruhadeshwar, Ray, Indrakshi.  2018.  "Kn0W Thy Doma1N Name": Unbiased Phishing Detection Using Domain Name Based Features. Proceedings of the 23Nd ACM on Symposium on Access Control Models and Technologies. :69-75.

Phishing websites remain a persistent security threat. Thus far, machine learning approaches appear to have the best potential as defenses. But, there are two main concerns with existing machine learning approaches for phishing detection. The first is the large number of training features used and the lack of validating arguments for these feature choices. The second concern is the type of datasets used in the literature that are inadvertently biased with respect to the features based on the website URL or content. To address these concerns, we put forward the intuition that the domain name of phishing websites is the tell-tale sign of phishing and holds the key to successful phishing detection. Accordingly, we design features that model the relationships, visual as well as statistical, of the domain name to the key elements of a phishing website, which are used to snare the end-users. The main value of our feature design is that, to bypass detection, an attacker will find it very difficult to tamper with the visual content of the phishing website without arousing the suspicion of the end user. Our feature set ensures that there is minimal or no bias with respect to a dataset. Our learning model trains with only seven features and achieves a true positive rate of 98% and a classification accuracy of 97%, on sample dataset. Compared to the state-of-the-art work, our per data instance classification is 4 times faster for legitimate websites and 10 times faster for phishing websites. Importantly, we demonstrate the shortcomings of using features based on URLs as they are likely to be biased towards specific datasets. We show the robustness of our learning algorithm by testing on unknown live phishing URLs and achieve a high detection accuracy of \$99.7%\$.

Scheitle, Quirin, Gasser, Oliver, Nolte, Theodor, Amann, Johanna, Brent, Lexi, Carle, Georg, Holz, Ralph, Schmidt, Thomas C., Wählisch, Matthias.  2018.  The Rise of Certificate Transparency and Its Implications on the Internet Ecosystem. Proceedings of the Internet Measurement Conference 2018. :343-349.

In this paper, we analyze the evolution of Certificate Transparency (CT) over time and explore the implications of exposing certificate DNS names from the perspective of security and privacy. We find that certificates in CT logs have seen exponential growth. Website support for CT has also constantly increased, with now 33% of established connections supporting CT. With the increasing deployment of CT, there are also concerns of information leakage due to all certificates being visible in CT logs. To understand this threat, we introduce a CT honeypot and show that data from CT logs is being used to identify targets for scanning campaigns only minutes after certificate issuance. We present and evaluate a methodology to learn and validate new subdomains from the vast number of domains extracted from CT logged certificates.