Generalizing Text to Protect Privacy

Textual data can contain highly sensitive and identifying information. Redaction is difficult to do well, and it can leave text unreadable and useless for many purposes. This poster describes an alternative: using ontologies to generalize words, that is, replacing a sensitive term with a broader term from the ontology. The resulting text is less sensitive but still preserves meaning in a way that redacted text does not.
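
To make the contrast with redaction concrete, here is a minimal sketch of ontology-based generalization. It is not the poster's method: the ontology is a tiny hand-made dictionary, and the names `TOY_ONTOLOGY`, `generalize_term`, `generalize_text`, and `redact_text`, as well as the sample sentence, are all hypothetical and chosen only for illustration. A real system would draw on a full ontology and would need a principled way to decide which terms are sensitive and how far to generalize them.

```python
import re

# Hypothetical toy ontology: each lowercase term maps to its broader parent.
# This stands in for a real ontology and is an assumption of this sketch.
TOY_ONTOLOGY = {
    "leukemia": "cancer",
    "cancer": "disease",
    "oncologist": "physician",
    "physician": "person",
    "indianapolis": "Indiana",
    "indiana": "United States",
}


def generalize_term(term, levels=1, ontology=TOY_ONTOLOGY):
    """Walk up to `levels` steps toward broader terms; stop at the top."""
    current = term
    for _ in range(levels):
        parent = ontology.get(current.lower())
        if parent is None:
            break  # no broader term known, keep what we have
        current = parent
    return current


def generalize_text(text, sensitive_terms, levels=1, ontology=TOY_ONTOLOGY):
    """Replace each sensitive word in `text` with a broader ontology term."""
    targets = {t.lower() for t in sensitive_terms}

    def swap(match):
        word = match.group(0)
        if word.lower() in targets:
            return generalize_term(word, levels, ontology)
        return word

    return re.sub(r"[A-Za-z]+", swap, text)


def redact_text(text, sensitive_terms):
    """Baseline for comparison: blank out each sensitive word entirely."""
    targets = {t.lower() for t in sensitive_terms}

    def blank(match):
        word = match.group(0)
        return "[REDACTED]" if word.lower() in targets else word

    return re.sub(r"[A-Za-z]+", blank, text)


sample = "The oncologist in Indianapolis treated her leukemia."
sensitive = ["oncologist", "Indianapolis", "leukemia"]

print(redact_text(sample, sensitive))
print(generalize_text(sample, sensitive, levels=1))
print(generalize_text(sample, sensitive, levels=2))
```

Raising `levels` trades utility for privacy: one step keeps the sentence informative (a physician, a state, a kind of disease), while further steps approach the uninformative result that redaction produces immediately.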

This material is based upon work supported by the National Science Foundation under Grant No. 1012208, "TC: Large: Collaborative Research: Anonymizing Textual Data and its Impact on Utility". The poster reflects work with Prof. Wei Jiang (Missouri University of Science and Technology), Dr. Mummoorthy Murugesan (Teradata Corp.), Balamurugan Anandan (Purdue), and Pedro Pastrana-Camacho (Purdue). Co-PIs on the overall project also include Karen Chang (Purdue), Raquel Hill (Indiana University), Victor Raskin (Purdue), Stephanie Sanders (Indiana University), and Luo Si (Purdue).

License: CC-2.5

Submitted by Katie Dey on Mon, 11/26/2012 - 22:27