Biblio

Filters: Keyword is personal information
2021-01-28
Kumar, B. S., Daniya, T., Sathya, N., Cristin, R..  2020.  Investigation on Privacy Preserving using K-Anonymity Techniques. 2020 International Conference on Computer Communication and Informatics (ICCCI). :1–7.

Data volumes, and the analysis of that data, grow day by day because of the pervasiveness of computing devices, yet people are reluctant to share their information on online portals or in surveys for fear of their safety: sensitive information such as credit card details, medical conditions and other personal information can endanger society if it falls into the wrong hands. Privacy preservation has therefore become an obstacle to storing data in a data repository. For that reason, the data in the repository should be made indistinguishable: it is encrypted when stored and decrypted later when it is needed for analysis in data mining. When storing the raw data of individuals, it is important to remove person-identifiable information such as name and employee ID. The other attributes pertaining to the person should also be encrypted, and methodologies are needed to implement this. These methodologies can make the data in the repository secure and make the PPDM task easier.

2020-12-28
Lee, H., Cho, S., Seong, J., Lee, S., Lee, W..  2020.  De-identification and Privacy Issues on Bigdata Transformation. 2020 IEEE International Conference on Big Data and Smart Computing (BigComp). :514–519.

As the amount of data in various industries and government sectors grows exponentially, the '7V' concept of big data aims to create new value by indiscriminately collecting and analyzing information from various fields. As the ecosystem of the ICT industry takes shape, big data utilization is threatened by privacy attacks, such as infringement made possible by the large amount of data. To manage and sustain a controllable privacy level, some recommended de-identification techniques are needed. This paper examines those de-identification processes and three types of commonly used privacy models. Furthermore, it presents use cases in which those kinds of technologies can be adopted, along with future development directions.

2020-07-09
Kassem, Ali, Ács, Gergely, Castelluccia, Claude, Palamidessi, Catuscia.  2019.  Differential Inference Testing: A Practical Approach to Evaluate Sanitizations of Datasets. 2019 IEEE Security and Privacy Workshops (SPW). :72–79.

In order to protect individuals' privacy, data have to be "well-sanitized" before sharing them, i.e. one has to remove any personal information before sharing data. However, it is not always clear when data shall be deemed well-sanitized. In this paper, we argue that the evaluation of sanitized data should be based on whether the data allows the inference of sensitive information that is specific to an individual, instead of being centered around the concept of re-identification. We propose a framework to evaluate the effectiveness of different sanitization techniques on a given dataset by measuring how much an individual's record from the sanitized dataset influences the inference of his/her own sensitive attribute. Our intent is not to accurately predict any sensitive attribute but rather to measure the impact of a single record on the inference of sensitive information. We demonstrate our approach by sanitizing two real datasets in different privacy models and evaluate/compare each sanitized dataset in our framework.

2020-06-19
Chandra, Yogesh, Jana, Antoreep.  2019.  Improvement in Phishing Websites Detection Using Meta Classifiers. 2019 6th International Conference on Computing for Sustainable Global Development (INDIACom). :637–641.

In the era of the ever-growing number of smart devices, fraudulent practices through phishing websites have become an increasingly severe threat to modern computers and internet security. These websites are designed to steal personal information from the user, and they spread over the internet without the knowledge of the user of the system. They give the user a false impression of genuineness by mirroring real, trusted web pages, which then leads to the loss of the user's important credentials. Detection of such fraudulent websites is therefore essential. In this paper, various classifiers were considered, and ensemble classifiers were found to predict with the highest efficiency. The underlying question was whether a combined classifier model performs better than a single classifier model, leading to better efficiency and accuracy. For the experiments, three meta classifiers, namely AdaBoostM1, Stacking, and Bagging, were compared on performance. It is found that a meta classifier built by combining simple classifiers outperforms the simple classifier on its own.
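
As a rough illustration of the "combined classifier" idea in this abstract, the following is a minimal sketch of Bagging only (not AdaBoostM1 or Stacking), written in plain Python. The one-dimensional threshold "stump" base learner, the toy data, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Return the threshold t that minimises errors of the rule 'predict 1 if x > t'."""
    values = sorted(set(xs))
    candidates = ([float("-inf")]
                  + [(a + b) / 2 for a, b in zip(values, values[1:])]
                  + [float("inf")])

    def errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)

def fit_bagging(xs, ys, n_models=11, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict(models, x):
    """Majority vote over the individual stump predictions."""
    votes = Counter(int(x > t) for t in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: one feature in (0, 1), label 1 when the feature exceeds 0.5.
xs = [(i + 0.5) / 20 for i in range(20)]
ys = [int(x > 0.5) for x in xs]
models = fit_bagging(xs, ys)
```

Each stump sees a slightly different resample, so the thresholds differ; the vote averages out that variation, which is the effect a bagged meta classifier relies on.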

2020-06-03
Cedillo, Priscila, Camacho, Jessica, Campos, Karina, Bermeo, Alexandra.  2019.  A Forensics Activity Logger to Extract User Activity from Mobile Devices. 2019 Sixth International Conference on eDemocracy eGovernment (ICEDEG). :286–290.

Nowadays, mobile devices have become one of the most popular instruments that people use in their daily lives, mainly due to the importance of their applications. In that context, mobile devices store the user's personal information and much more data, becoming a personal tracker of daily activities that provides important information about the user. Many tools are available to gather this information from mobile devices, with the restriction that each tool only provides isolated information about a specific application or activity. Therefore, the present work proposes a tool that allows investigators to obtain a complete report and timeline of the activities that were performed on the device. This report incorporates the information provided by many sources into a single set of data. The operation of the solution is also presented by means of an example, which shows the feasibility of using the tool and the way in which investigators have to apply it.

2020-04-20
Liu, Kai-Cheng, Kuo, Chuan-Wei, Liao, Wen-Chiuan, Wang, Pang-Chieh.  2018.  Optimized Data de-Identification Using Multidimensional k-Anonymity. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :1610–1614.
In the globalized knowledge economy, big data analytics have been widely applied in diverse areas. A critical issue in big data analysis of personal information is the possible leak of personal privacy. Therefore, it is necessary to have an anonymization-based de-identification method to avoid undesirable privacy leaks. Such a method can prevent published data from being traced back to individuals. Prior empirical research has provided approaches to reduce the risk of privacy leaks, e.g. Maximum Distance to Average Vector (MDAV), the Condensation Approach and Differential Privacy. However, previous methods inevitably generate synthetic data of different sizes and are thus unsuitable for general use. To satisfy the need for general use, k-anonymity can be chosen as the privacy protection mechanism in the de-identification process to ensure that the data are not distorted, because k-anonymity is strong in both protecting privacy and preserving data authenticity. Accordingly, this study proposes an optimized multidimensional method for anonymizing data based on both the priority weight-adjusted method and the mean difference recommending tree method (MDR tree method). The results of this study reveal that this new method generates more reliable anonymous data and reduces the information loss rate.
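
For readers unfamiliar with k-anonymity, the sketch below shows the basic property the abstract relies on: after generalization, every combination of quasi-identifier values must occur in at least k records. It is not the paper's priority weight-adjusted or MDR tree method; the record fields, the fixed age buckets, the ZIP truncation and the function names are all hypothetical choices made for illustration.

```python
from collections import Counter

def generalize(record, age_bucket=10, zip_digits=3):
    """Coarsen the quasi-identifiers: bucket the age, mask trailing ZIP digits."""
    lo = record["age"] // age_bucket * age_bucket
    masked = record["zip"][:zip_digits] + "*" * (len(record["zip"]) - zip_digits)
    return {
        "age": f"{lo}-{lo + age_bucket - 1}",
        "zip": masked,
        "diagnosis": record["diagnosis"],  # sensitive attribute, left unchanged
    }

def anonymity_level(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values (the k)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values(), default=0)

# Hypothetical toy table.
raw = [
    {"age": 23, "zip": "13053", "diagnosis": "flu"},
    {"age": 27, "zip": "13068", "diagnosis": "asthma"},
    {"age": 41, "zip": "14850", "diagnosis": "flu"},
    {"age": 45, "zip": "14853", "diagnosis": "diabetes"},
]
released = [generalize(r) for r in raw]
k_raw = anonymity_level(raw, ["age", "zip"])
k_released = anonymity_level(released, ["age", "zip"])
```

In this toy table every raw record is unique on (age, zip), while the generalized release groups the records in pairs. Coarser buckets raise k but lose more information, which is the trade-off an optimized method tries to balance.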
2020-03-18
Kumar Mangi, S.V.V. Satya Surya Sravan, Hussian S.K., Saddam, Leelavathy, N..  2019.  An Approach for Sending a Confidential Message to the Restricted Users in Defence Based Organization. 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN). :1–5.
After the creation of the internet, the file sharing process changed. Several third-party applications have come to life for sharing and chatting purposes. A spammer can profit from these applications in different ways, for example by harvesting large amounts of data or acquiring users' personal information. Moreover, files are often uploaded to untrusted cloud storage maintained by a third party, which creates a security problem. File transfer in a defense-based organization needs more security. So, we developed a secure application for group member communication over a secure medium. A user who belongs to a specific department of a specific group can access the data from the storage node and decrypt it. Every user in the group needs to register with the node to send or receive data. The Group Manager can restrict the access of users in a Defense Network by generating a user list; only users in that list can log in to the node and share or download files. We created a secure platform to upload files and share data with multiple users by using Dynamic Broadcasting Encryption. Only users in the list can download and decrypt the files from the storage node.
2019-12-30
Razaque, Abdul, Jinrui, Wang, Zancheng, Wang, Hani, Qassim Bani, Khaskheli, Murad Ali, Bhutto, Waseem Ahmed.  2018.  Integration of CPU and GPU to Accelerate RSA Modular Exponentiation Operation. 2018 IEEE Long Island Systems, Applications and Technology Conference (LISAT). :1-6.

Nowadays, the security of data is becoming more and more important, as people store a great deal of personal information on their phones. Stored information requires security and privacy. Encryption algorithms have become the main means of maintaining data security. Thus, algorithm complexity and encryption efficiency have become the main measures of whether an encryption algorithm is safe or not. With the development of hardware, we now have many tools to improve such algorithms. Modular exponentiation in the RSA algorithm can be divided mathematically into several parts. In this paper, we introduce an approach that divides the encryption process and maps the resulting model onto the graphics processing unit (GPU). By using the GPU's capacity for parallel computing, the core of RSA can be accelerated using the central processing unit (CPU) and GPU together. Compute Unified Device Architecture (CUDA) is a platform that combines the CPU and GPU to realize GPU parallel programming, and it is the tool we use to perform experiments on accelerating the RSA algorithm. This paper also builds a mathematical model to help understand the mechanism of the RSA encryption algorithm.
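
The abstract does not say how the exponentiation is divided, so the following is only a sketch of the operation being accelerated: textbook square-and-multiply modular exponentiation, applied to a batch of independent messages. The toy key (n = 3233, e = 17, d = 2753) and the message values are illustrative assumptions; real RSA uses far larger moduli and padding, and this Python loop runs sequentially rather than on a GPU.

```python
def mod_exp(base, exponent, modulus):
    """Right-to-left square-and-multiply: computes (base ** exponent) % modulus."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                 # current exponent bit is set: multiply in
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next bit
        exponent >>= 1
    return result

# Toy RSA parameters (p = 61, q = 53), for illustration only.
n, e, d = 3233, 17, 2753

# Each message is exponentiated independently of the others, so a batch like
# this is the kind of work that could be spread across parallel threads.
messages = [65, 123, 1000, 2024]
ciphertexts = [mod_exp(m, e, n) for m in messages]
recovered = [mod_exp(c, d, n) for c in ciphertexts]
```

The loop performs one squaring per exponent bit and one extra multiplication per set bit, so the cost grows with the bit length of the exponent rather than with its value.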

Morita, Kazunari, Yoshimura, Hiroki, Nishiyama, Masashi, Iwai, Yoshio.  2018.  Protecting Personal Information using Homomorphic Encryption for Person Re-identification. 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE). :166–167.
We investigate how to protect features corresponding to personal information using homomorphic encryption when matching people in several camera views. Homomorphic encryption can compute a distance between features without decryption. Thus, our method is able to use a computing server on a public network while protecting personal information. To apply homomorphic encryption, our method uses linear quantization to represent each element of the feature as integers. Experimental results show that there is no significant difference in the accuracy of person re-identification with or without homomorphic encryption and linear quantization.
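
The abstract does not name the homomorphic scheme, so the sketch below substitutes a toy Paillier cryptosystem (additively homomorphic) to show how a server could compute a squared Euclidean distance on quantized features without decrypting them. The tiny primes, the 256-level quantizer, the assumption that the server holds the second feature in plaintext, and all names are illustrative assumptions; the key size is far too small for real use.

```python
import math
import random

def quantize(vec, lo=0.0, hi=1.0, levels=256):
    """Linear quantization: map floats in [lo, hi] to integers 0..levels-1."""
    scale = (levels - 1) / (hi - lo)
    return [min(levels - 1, max(0, round((v - lo) * scale))) for v in vec]

class ToyPaillier:
    """Minimal Paillier with g = n + 1. Toy primes only, NOT secure."""

    def __init__(self, p=1009, q=1013):
        self.n = p * q
        self.n2 = self.n * self.n
        self.lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
        self.mu = pow(self.lam, -1, self.n)  # modular inverse (Python 3.8+)

    def encrypt(self, m):
        while True:
            r = random.randrange(1, self.n)
            if math.gcd(r, self.n) == 1:
                break
        return ((1 + m * self.n) * pow(r, self.n, self.n2)) % self.n2

    def decrypt(self, c):
        u = pow(c, self.lam, self.n2)
        return ((u - 1) // self.n * self.mu) % self.n

def encrypted_sq_distance(n, enc_x, enc_x_sq_sum, y):
    """Server side: Enc(sum x_i^2 - 2 * sum x_i*y_i + sum y_i^2), using only n."""
    n2 = n * n
    acc = enc_x_sq_sum
    for c, yi in zip(enc_x, y):
        # Raising a ciphertext to k multiplies its plaintext by k (mod n).
        acc = (acc * pow(c, (-2 * yi) % n, n2)) % n2
    y_sq = sum(v * v for v in y)
    # Multiplying by (1 + y_sq * n) adds the plaintext constant y_sq.
    return (acc * (1 + y_sq * n)) % n2

# Client quantizes and encrypts its feature; the server only sees ciphertexts.
keys = ToyPaillier()
qx = quantize([0.1, 0.5, 0.9, 0.3])
qy = quantize([0.2, 0.4, 1.0, 0.0])
enc_x = [keys.encrypt(v) for v in qx]
enc_x_sq_sum = keys.encrypt(sum(v * v for v in qx))
enc_dist = encrypted_sq_distance(keys.n, enc_x, enc_x_sq_sum, qy)
distance = keys.decrypt(enc_dist)
```

Quantizing to integers is what makes the features usable as Paillier plaintexts. The result is only correct while the true distance stays below n, which holds here because four 8-bit components give at most 4 * 255^2.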
2019-11-26
Zabihimayvan, Mahdieh, Doran, Derek.  2019.  Fuzzy Rough Set Feature Selection to Enhance Phishing Attack Detection. 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1-6.

Phishing, one of the most well-known cybercrime activities, is the deception of online users to steal their personal or confidential information by impersonating a legitimate website. Several machine learning-based strategies have been proposed to detect phishing websites. These techniques are dependent on the features extracted from the website samples. However, few studies have actually considered efficient feature selection for detecting phishing attacks. In this work, we investigate an agreement on the definitive features which should be used in phishing detection. We apply Fuzzy Rough Set (FRS) theory as a tool to select the most effective features from three benchmarked data sets. The selected features are fed into three often used classifiers for phishing detection. To evaluate the FRS feature selection in developing a generalizable phishing detection, the classifiers are trained by a separate out-of-sample data set of 14,000 website samples. The maximum F-measure gained by FRS feature selection is 95% using Random Forest classification. Also, there are 9 universal features selected by FRS over all the three data sets. The F-measure value using this universal feature set is approximately 93%, which is a comparable result in contrast to the FRS performance. Since the universal feature set contains no features from third-party services, this finding implies that with no inquiry from external sources, we can gain a faster phishing detection which is also robust toward zero-day attacks.

Patil, Srushti, Dhage, Sudhir.  2019.  A Methodical Overview on Phishing Detection along with an Organized Way to Construct an Anti-Phishing Framework. 2019 5th International Conference on Advanced Computing Communication Systems (ICACCS). :588-593.

Phishing is a security attack to acquire personal information like passwords, credit card details or other account details of a user by means of websites or emails. Phishing websites look similar to the legitimate ones which make it difficult for a layman to differentiate between them. As per the reports of Anti Phishing Working Group (APWG) published in December 2018, phishing against banking services and payment processor was high. Almost all the phishy URLs use HTTPS and use redirects to avoid getting detected. This paper presents a focused literature survey of methods available to detect phishing websites. A comparative study of the in-use anti-phishing tools was accomplished and their limitations were acknowledged. We analyzed the URL-based features used in the past to improve their definitions as per the current scenario which is our major contribution. Also, a step wise procedure of designing an anti-phishing model is discussed to construct an efficient framework which adds to our contribution. Observations made out of this study are stated along with recommendations on existing systems.

2019-09-26
Kodera, Y., Kuribayashi, M., Kusaka, T., Nogami, Y..  2018.  Advanced Searchable Encryption: Keyword Search for Matrix-Type Storage. 2018 Sixth International Symposium on Computing and Networking Workshops (CANDARW). :292-297.
With the recent development of IoT technologies and cloud storage, many types of information, including private information, have gradually been outsourced. In such a situation, new convenient functionalities such as arithmetic and keyword search on ciphertexts are required to allow users to retrieve information without leaking any information. In particular, searchable encryption has attracted much attention as a way to realize keyword search in the encrypted domain. In addition, searchable symmetric encryption (SSE) is a suitable and efficient architecture for data outsourcing. In this paper, we focus on an SSE scheme which employs a secure index for searching a keyword with optimal search time. Conventional studies have mostly considered schemes that search whether a queried keyword is contained in encrypted documents. In contrast, we additionally take into account the location of a queried keyword in documents by targeting a matrix-type data format. This enables a manager to search personal information listed per line or column in CSV-like data.
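
The idea of a keyed index that returns where a keyword sits in matrix-type data can be sketched as follows. This is a deliberately simplified illustration, not the scheme from the paper: keywords are replaced by HMAC-SHA256 tokens, the stored positions are left unencrypted, and the index leaks search and access patterns. The key, table contents and function names are hypothetical.

```python
import hashlib
import hmac

def token(key, keyword):
    """Deterministic search token: only holders of the key can derive it."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).hexdigest()

def build_index(key, table):
    """Map each cell's token to the (row, column) positions where it occurs."""
    index = {}
    for r, row in enumerate(table):
        for c, cell in enumerate(row):
            index.setdefault(token(key, cell), []).append((r, c))
    return index

def search(index, tok):
    """Server side: a single dictionary lookup, no keyword is ever seen."""
    return index.get(tok, [])

# Hypothetical CSV-like data held by the data owner.
key = b"demo-key-not-for-production"
table = [
    ["alice", "berlin"],
    ["bob", "paris"],
    ["carol", "berlin"],
]
index = build_index(key, table)
hits = search(index, token(key, "berlin"))
```

Because the lookup is one hash-table access, search time does not depend on the number of stored rows, and the returned positions identify the matching line and column.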
Yoshikawa, M., Ikezaki, Y., Nozaki, Y..  2018.  Implementation of Searchable Encryption System with Dedicated Hardware and Its Evaluation. 2018 9th IEEE Annual Ubiquitous Computing, Electronics Mobile Communication Conference (UEMCON). :218-221.
Recently, big data and artificial intelligence (AI) have been introduced into medical services. When personal information is stored in a shared database, that data must be encrypted, which, in turn, makes it difficult to extract only the necessary information. Searchable encryption has now been proposed to extract, or search, encrypted data without decrypting it. However, all previous studies regarding searchable encryption are software-based. This paper proposes a searchable encryption system embedded in dedicated hardware and evaluates its circuit size.
2019-07-01
Haşiloğlu, A., Bali, A..  2018.  Central Audit Logging Mechanism in Personal Data Web Services. 2018 6th International Symposium on Digital Forensic and Security (ISDFS). :1-3.

Personal data have been compiled and harnessed by a great number of establishments to execute their legal activities. Establishments are legally bound to maintain the confidentiality and security of personal data. Hence it is a requirement to provide access logs for the personal information. Depending on the needs and capacity, personal data can be opened to the users via platforms such as file system, database and web service. Web service platform is a popular alternative since it is autonomous and can isolate the data source from the user. In this paper, the way to log personal data accessed via web service method has been discussed. As an alternative to classical method in which logs were recorded and saved by client applications, a different mechanism of forming a central audit log with API manager has been investigated. By forging a model policy to exemplify central logging method, its advantages and disadvantages have been explored. It has been concluded in the end that this model could be employed in centrally recording audit logs.
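
One way to picture central logging at the gateway, as opposed to logging inside each client application, is a wrapper that records every call before it reaches the personal-data service. The sketch below is an assumption-laden illustration, not the model policy from the paper: the in-memory `AUDIT_LOG` list stands in for a central log store, and the `audited` decorator, `get_address` handler and field names are hypothetical.

```python
import functools
from datetime import datetime, timezone

AUDIT_LOG = []  # stand-in for a central, append-only log store

def audited(resource):
    """Wrap a service handler so that every access attempt is logged centrally."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(client_id, *args, **kwargs):
            entry = {
                "time": datetime.now(timezone.utc).isoformat(),
                "client": client_id,
                "resource": resource,
                "args": args,
                "status": "error",  # overwritten below if the call succeeds
            }
            try:
                result = handler(client_id, *args, **kwargs)
                entry["status"] = "ok"
                return result
            finally:
                AUDIT_LOG.append(entry)  # logged whether or not the call failed
        return wrapper
    return decorator

@audited("person.address")
def get_address(client_id, person_id):
    """Hypothetical personal-data web service operation."""
    return {"person_id": person_id, "address": "Example Street 1"}
```

Since the log entry is written by the wrapper rather than by the caller, a client cannot skip or alter the record of its own access.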

2019-03-28
Joo, M., Seo, J., Oh, J., Park, M., Lee, K..  2018.  Situational Awareness Framework for Cyber Crime Prevention Model in Cyber Physical System. 2018 Tenth International Conference on Ubiquitous and Future Networks (ICUFN). :837-842.

Recently, IoT, 5G mobile, big data, and artificial intelligence have been increasingly used in the real world. These technologies converge in the Cyber Physical System (CPS). CPS technology requires core technologies to ensure reliability, real-time operation, safety, autonomy, and security. A CPS is a system that connects cyberspace and physical space. Attacks in cyberspace carry over into the real world and cause a lot of damage. The personal information handled in CPS is highly confidential, so policies and techniques are needed to guard against attacks in advance. If there is an attack on the CPS, not only personal information but also national confidential data can be leaked. In order to prevent this, the risk is measured using the Factor Analysis of Information Risk (FAIR) model, which can measure risk by element for situational awareness in a CPS environment. To reduce risk by preventing attacks in CPS, this paper measures risk after applying the concept of Crime Prevention Through Environmental Design (CPTED).

2019-02-14
Schuette, J., Brost, G. S..  2018.  LUCON: Data Flow Control for Message-Based IoT Systems. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :289-299.

Today's emerging Industrial Internet of Things (IIoT) scenarios are characterized by the exchange of data between services across enterprises. Traditional access and usage control mechanisms are only able to determine if data may be used by a subject, but lack an understanding of how it may be used. The ability to control the way how data is processed is however crucial for enterprises to guarantee (and provide evidence of) compliant processing of critical data, as well as for users who need to control if their private data may be analyzed or linked with additional information - a major concern in IoT applications processing personal information. In this paper, we introduce LUCON, a data-centric security policy framework for distributed systems that considers data flows by controlling how messages may be routed across services and how they are combined and processed. LUCON policies prevent information leaks, bind data usage to obligations, and enforce data flows across services. Policy enforcement is based on a dynamic taint analysis at runtime and an upfront static verification of message routes against policies. We discuss the semantics of these two complementing enforcement models and illustrate how LUCON policies are compiled from a simple policy language into a first-order logic representation. We demonstrate the practical application of LUCON in a real-world IoT middleware and discuss its integration into Apache Camel. Finally, we evaluate the runtime impact of LUCON and discuss performance and scalability aspects.

2018-09-05
Mayle, A., Bidoki, N. H., Masnadi, S., Boeloeni, L., Turgut, D..  2017.  Investigating the Value of Privacy within the Internet of Things. GLOBECOM 2017 - 2017 IEEE Global Communications Conference. :1–6.

Many companies within the Internet of Things (IoT) sector rely on the personal data of users to deliver and monetize their services, creating a high demand for personal information. A user can be seen as making a series of transactions, each involving the exchange of personal data for a service. In this paper, we argue that privacy can be described quantitatively, using the game-theoretic concept of value of information (VoI), enabling us to assess whether each exchange is an advantageous one for the user. We introduce PrivacyGate, an extension to the Android operating system built for the purpose of studying privacy of IoT transactions. An example study, and its initial results, are provided to illustrate its capabilities.

2018-01-23
Hoel, Tore, Griffiths, Dai, Chen, Weiqin.  2017.  The Influence of Data Protection and Privacy Frameworks on the Design of Learning Analytics Systems. Proceedings of the Seventh International Learning Analytics & Knowledge Conference. :243–252.

Learning analytics open up a complex landscape of privacy and policy issues, which, in turn, influence how learning analytics systems and practices are designed. Research and development is governed by regulations for data storage and management, and by research ethics. Consequently, when moving solutions out of the research labs, implementers meet constraints defined in national laws and justified in privacy frameworks. This paper explores how the OECD, APEC and EU privacy frameworks seek to regulate data privacy, with significant implications for the discourse of learning, and ultimately, an impact on the design of tools, architectures and practices that now are on the drawing board. A detailed list of requirements for learning analytics systems is developed, based on the new legal requirements defined in the European General Data Protection Regulation, which from 2018 will be enforced as European law. The paper also gives an initial account of how the privacy discourse in Europe, Japan, South Korea and China is developing and reflects upon the possible impact of the different privacy frameworks on the design of LA privacy solutions in these countries. This research contributes to knowledge of how concerns about privacy and data protection related to educational data can drive a discourse on new approaches to privacy engineering based on the principles of Privacy by Design. For the LAK community, this study represents the first attempt to conceptualise the issues of privacy and learning analytics in a cross-cultural context. The paper concludes with a plan to follow up this research on privacy policies and learning analytics systems development with a new international study.

Joo, Moon-Ho, Yoon, Sang-Pil, Kim, Sahng-Yoon, Kwon, Hun-Yeong.  2017.  Research on Distribution of Responsibility for De-Identification Policy of Personal Information. Proceedings of the 18th Annual International Conference on Digital Government Research. :74–83.
With the coming of the age of big data, efforts to institutionalize de-identification of personal information to protect privacy but also at the same time, to allow the use of personal information, have been actively carried out and already, many countries are in the stage of implementing and establishing de-identification policies quite actively. But even with such efforts to protect and use personal information at the same time, the danger posed by re-identification based on de-identified information is real enough to warrant serious consideration for a management mechanism of such risks as well as a mechanism for distributing the responsibilities and liabilities that follow these risks in the event of accidents and incidents involving the invasion of privacy. So far, most countries implementing the de-identification policies are focusing on defining what de-identification is and the exemption requirements to allow free use of de-identified personal information; in fact, it seems that there is a lack of discussion and consideration on how to distribute the responsibility of the risks and liabilities involved in the process of de-identification of personal information. This study proposes to take a look at the various de-identification policies worldwide and contemplate on these policies in the perspective of risk-liability theory. Also, the constituencies of the de-identification policies will be identified in order to analyze the roles and responsibilities of each of these constituencies thereby providing the theoretical basis on which to initiate the discussions on the distribution of burden and responsibilities arising from the de-identification policies.
2017-11-13
Nakamura, Y., Louvel, M., Nishi, H..  2016.  Coordination middleware for secure wireless sensor networks. IECON 2016 - 42nd Annual Conference of the IEEE Industrial Electronics Society. :6931–6936.

Wireless sensor networks (WSNs) are implemented in various Internet-of-Things applications such as energy management systems. As the applications may involve personal information, they must be protected from attackers attempting to read information or control network devices. Research on WSN security is essential to protect WSNs from attacks. Studies in such research domains propose solutions against the attacks. However, they focus mainly on the security measures rather than on their ease in implementation in WSNs. In this paper, we propose a coordination middleware that provides an environment for constructing updatable WSNs for security. The middleware is based on LINC, a rule-based coordination middleware. The proposed approach allows the development of WSNs and attaches or detaches security modules when required. We implemented three security modules on LINC and on a real network, as case studies. Moreover, we evaluated the implementation costs while comparing the case studies.

2017-03-08
Alotaibi, S., Furnell, S., Clarke, N..  2015.  Transparent authentication systems for mobile device security: A review. 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST). :406–413.

Sensitive data such as text messages, contact lists, and personal information are stored on mobile devices. This makes authentication of paramount importance. More security is needed on mobile devices since, after point-of-entry authentication, the user can perform almost all tasks without having to re-authenticate. For this reason, many authentication methods have been suggested to improve the security of mobile devices in a transparent and continuous manner, providing a basis for convenient and secure user re-authentication. This paper presents a comprehensive analysis and literature review on transparent authentication systems for mobile device security. This review indicates a need to investigate when to authenticate the mobile user by focusing on the sensitivity level of the application, and understanding whether a certain application may require a protection or not.

2017-02-27
Chessa, M., Grossklags, J., Loiseau, P..  2015.  A Game-Theoretic Study on Non-monetary Incentives in Data Analytics Projects with Privacy Implications. 2015 IEEE 28th Computer Security Foundations Symposium. :90–104.

The amount of personal information contributed by individuals to digital repositories such as social network sites has grown substantially. The existence of this data offers unprecedented opportunities for data analytics research in various domains of societal importance including medicine and public policy. The results of these analyses can be considered a public good which benefits data contributors as well as individuals who are not making their data available. At the same time, the release of personal information carries perceived and actual privacy risks to the contributors. Our research addresses this problem area. In our work, we study a game-theoretic model in which individuals take control over participation in data analytics projects in two ways: 1) individuals can contribute data at a self-chosen level of precision, and 2) individuals can decide whether they want to contribute at all (or not). From the analyst's perspective, we investigate to which degree the research analyst has flexibility to set requirements for data precision, so that individuals are still willing to contribute to the project, and the quality of the estimation improves. We study this tradeoff scenario for populations of homogeneous and heterogeneous individuals, and determine Nash equilibria that reflect the optimal level of participation and precision of contributions. We further prove that the analyst can substantially increase the accuracy of the analysis by imposing a lower bound on the precision of the data that users can reveal.

2017-02-23
G. Kejela, C. Rong.  2015.  "Cross-Device Consumer Identification". 2015 IEEE International Conference on Data Mining Workshop (ICDMW). :1687-1689.

Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies always want to seamlessly reach consumers behind devices instead of the device itself. However, the identity of consumers becomes fragmented as they switch from one device to another. A naive attempt is to use deterministic features such as user name, telephone number and email address. However consumers might refrain from giving away their personal information because of privacy and security reasons. The challenge in ICDM2015 contest is to develop an accurate probabilistic model for predicting cross-device consumer identity without using the deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forest. Our final solution ranks 9th both on the public and private LB with F0.5 score of 0.855.
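
The F0.5 score quoted above is an instance of the general F-beta measure, which weights precision more heavily than recall when beta is below 1. The sketch below shows the standard formula; the function name and the example counts are illustrative and are not taken from the contest data.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from true positives, false positives and false negatives.

    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R),
    where P = tp / (tp + fp) is precision and R = tp / (tp + fn) is recall.
    """
    if tp == 0:
        return 0.0  # no correct positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: 8 correct matches, 2 wrong matches, 4 missed matches.
f1 = f_beta(8, 2, 4, beta=1.0)
f05 = f_beta(8, 2, 4, beta=0.5)
```

With these counts precision (0.8) exceeds recall (about 0.67), so F0.5 comes out higher than F1, reflecting the extra weight on precision.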

2017-02-14
J. Choi, C. Choi, H. M. Lynn, P. Kim.  2015.  "Ontology Based APT Attack Behavior Analysis in Cloud Computing". 2015 10th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA). :375-379.

Recently, the leakage of personal and confidential information and the economic damage caused by APT attacks have become a serious social problem, and a great deal of research has been done to solve it. APT attacks combine traditional hacking techniques with sophisticated techniques, such as exploiting zero-day vulnerabilities, in order to increase their success rate and to evade detection techniques and state-of-the-art security. In this paper, we propose a method that detects APT attacks based on an ontology of the attack behavior that malicious code exhibits while operating on the target system; inference rules are defined so that the malicious behavior of intelligent APT attacks can be inferred and detected.