Visible to the public Process-Driven Data PrivacyConflict Detection Enabled

TitleProcess-Driven Data Privacy
Publication TypeConference Paper
Year of Publication2015
AuthorsXia, Weiyi, Kantarcioglu, Murat, Wan, Zhiyu, Heatherly, Raymond, Vorobeychik, Yevgeniy, Malin, Bradley
Conference NameProceedings of the 24th ACM International on Conference on Information and Knowledge Management
Conference LocationMelbourne, Australia
ISBN Number978-1-4503-3794-6
KeywordsFoundations, Hierarchical Coordination and Control, Planning, privacy, re-identification, Resilient Systems, Science of decentralized security, science of security, SURE Project

The quantity of personal data gathered by service providers via our daily activities continues to grow at a rapid pace. The sharing, and the subsequent analysis of, such data can support a wide range of activities, but concerns around privacy often prompt an organization to transform the data to meet certain protection models (e.g., k-anonymity or E-differential privacy). These models, however, are based on simplistic adversarial frameworks, which can lead to both under- and over-protection. For instance, such models often assume that an adversary attacks a protected record exactly once. We introduce a principled approach to explicitly model the attack process as a series of steps. Specically, we engineer a factored Markov decision process (FMDP) to optimally plan an attack from the adversary's perspective and assess the privacy risk accordingly. The FMDP captures the uncertainty in the adversary's belief (e.g., the number of identied individuals that match the de-identified data) and enables the analysis of various real world deterrence mechanisms beyond a traditional protection model, such as a penalty for committing an attack. We present an algorithm to solve the FMDP and illustrate its efficiency by simulating an attack on publicly accessible U.S. census records against a real identied resource of over 500,000 individuals in a voter registry. Our results demonstrate that while traditional privacy models commonly expect an adversary to attack exactly once per record, an optimal attack in our model may involve exploiting none, one, or more indiviuals in the pool of candidates, depending on context.

Citation KeyXia:2015:PDP:2806416.2806580