Visible to the public Biblio

Filters: Author is Park, Noseong  [Clear All Filters]
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
K
Kang, Chanhyun, Park, Noseong, Prakash, B. Aditya, Serra, Edoardo, Subrahmanian, V. S..  2016.  Ensemble Models for Data-driven Prediction of Malware Infections. Proceedings of the Ninth ACM International Conference on Web Search and Data Mining. :583–592.

Given a history of detected malware attacks, can we predict the number of malware infections in a country? Can we do this for different malware and countries? This is an important question which has numerous implications for cyber security, right from designing better anti-virus software, to designing and implementing targeted patches to more accurately measuring the economic impact of breaches. This problem is compounded by the fact that, as externals, we can only detect a fraction of actual malware infections. In this paper we address this problem using data from Symantec covering more than 1.4 million hosts and 50 malware spread across 2 years and multiple countries. We first carefully design domain-based features from both malware and machine-hosts perspectives. Secondly, inspired by epidemiological and information diffusion models, we design a novel temporal non-linear model for malware spread and detection. Finally we present ESM, an ensemble-based approach which combines both these methods to construct a more accurate algorithm. Using extensive experiments spanning multiple malware and countries, we show that ESM can effectively predict malware infection ratios over time (both the actual number and trend) upto 4 times better compared to several baselines on various metrics. Furthermore, ESM's performance is stable and robust even when the number of detected infections is low.

Kang, Hyunjoong, Hong, Sanghyun, Lee, Kookjin, Park, Noseong, Kwon, Soonhyun.  2018.  On Integrating Knowledge Graph Embedding into SPARQL Query Processing. 2018 IEEE International Conference on Web Services (ICWS). :371—374.
SPARQL is a standard query language for knowledge graphs (KGs). However, it is hard to find correct answer if KGs are incomplete or incorrect. Knowledge graph embedding (KGE) enables answering queries on such KGs by inferring unknown knowledge and removing incorrect knowledge. Hence, our long-term goal in this line of research is to propose a new framework that integrates KGE and SPARQL, which opens various research problems to be addressed. In this paper, we solve one of the most critical problems, that is, optimizing the performance of nearest neighbor (NN) search. In our evaluations, we demonstrate that the search time of state-of-the-art NN search algorithms is improved by 40% without sacrificing answer accuracy.