Title: "Cross-Device Consumer Identification"
Publication Type: Conference Paper
Year of Publication: 2015
Authors: G. Kejela, C. Rong
Conference Name: 2015 IEEE International Conference on Data Mining Workshop (ICDMW)
Date Published: Nov. 2015
ISBN Number: 978-1-4673-8493-3
Accession Number: 15774351
Keywords: advertising companies, advertising data processing, Computational modeling, Computers, consumer behaviour, consumer identity, cross-device consumer identification, Data models, Decision trees, Deep Learning, Ensemble, GBDT, GBM, gradient boosting decision trees, ICDM2015 contest, Internet, IP networks, learning (artificial intelligence), performance evaluation, personal information, Predictive models, pubcrawl170105, Random Forest, random forest algorithm, Training, Xgboost

Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies want to seamlessly reach the consumers behind the devices rather than the devices themselves. However, the identity of consumers becomes fragmented as they switch from one device to another. A naive approach is to use deterministic features such as user name, telephone number and email address. However, consumers might refrain from giving away their personal information for privacy and security reasons. The challenge in the ICDM2015 contest is to develop an accurate probabilistic model for predicting cross-device consumer identity without using deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forest. Our final solution ranked 9th on both the public and private leaderboards (LB) with an F0.5 score of 0.855.
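
The F0.5 score quoted above is an F-beta measure with beta = 0.5, which weights precision more heavily than recall. The following is a minimal sketch of the standard F-beta definition, not the contest's official scoring code; the function names and the example identifiers (`c1`, `c2`, ...) are hypothetical, and how the contest aggregated scores across devices is not stated in this record.

```python
def precision_recall(predicted: set, actual: set) -> tuple:
    """Precision and recall of a predicted set of matches against the true set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favours precision over recall."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical example: three predicted matches for one device, two of them
# correct, out of four true matches.
predicted = {"c1", "c2", "c3"}
actual = {"c1", "c2", "c4", "c5"}
p, r = precision_recall(predicted, actual)
score = f_beta(p, r, beta=0.5)  # precision 2/3, recall 1/2
```

With beta = 0.5 a false positive costs more than a missed match, so a model tuned for this metric would tend to prefer fewer, more confident predictions.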

Citation Key: 7395888