Visible to the public Biblio

Filters: Keyword is probabilistic framework  [Clear All Filters]
Conference Paper
Ya Zhang, Yi Wei, Jianbiao Ren.  2014.  Multi-touch Attribution in Online Advertising with Survival Theory. Data Mining (ICDM), 2014 IEEE International Conference on. :687-696.

Multi-touch attribution, which allows distributing the credit to all related advertisements based on their corresponding contributions, has recently become an important research topic in digital advertising. Traditionally, rule-based attribution models have been used in practice. The drawback of such rule-based models lies in the fact that the rules are not derived form the data but only based on simple intuition. With the ever enhanced capability to tracking advertisement and users' interaction with the advertisement, data-driven multi-touch attribution models, which attempt to infer the contribution from user interaction data, become an important research direction. We here propose a new data-driven attribution model based on survival theory. By adopting a probabilistic framework, one key advantage of the proposed model is that it is able to remove the presentation biases inherit to most of the other attribution models. In addition to model the attribution, the proposed model is also able to predict user's 'conversion' probability. We validate the proposed method with a real-world data set obtained from a operational commercial advertising monitoring company. Experiment results have shown that the proposed method is quite promising in both conversion prediction and attribution.

Karmaker Santu, Shubhra Kanti, Bindschadler, Vincent, Zhai, ChengXiang, Gunter, Carl A..  2018.  NRF: A Naive Re-Identification Framework. Proceedings of the 2018 Workshop on Privacy in the Electronic Society. :121-132.

The promise of big data relies on the release and aggregation of data sets. When these data sets contain sensitive information about individuals, it has been scalable and convenient to protect the privacy of these individuals by de-identification. However, studies show that the combination of de-identified data sets with other data sets risks re-identification of some records. Some studies have shown how to measure this risk in specific contexts where certain types of public data sets (such as voter roles) are assumed to be available to attackers. To the extent that it can be accomplished, such analyses enable the threat of compromises to be balanced against the benefits of sharing data. For example, a study that might save lives by enabling medical research may be enabled in light of a sufficiently low probability of compromise from sharing de-identified data. In this paper, we introduce a general probabilistic re-identification framework that can be instantiated in specific contexts to estimate the probability of compromises based on explicit assumptions. We further propose a baseline of such assumptions that enable a first-cut estimate of risk for practical case studies. We refer to the framework with these assumptions as the Naive Re-identification Framework (NRF). As a case study, we show how we can apply NRF to analyze and quantify the risk of re-identification arising from releasing de-identified medical data in the context of publicly-available social media data. The results of this case study show that NRF can be used to obtain meaningful quantification of the re-identification risk, compare the risk of different social media, and assess risks of combinations of various demographic attributes and medical conditions that individuals may voluntarily disclose on social media.