Large-Scale Readability Analysis of Privacy Policies

Title: Large-Scale Readability Analysis of Privacy Policies
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Fabian, Benjamin; Ermakova, Tatiana; Lentz, Tino
Conference Name: Proceedings of the International Conference on Web Intelligence
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4951-2
Keywords: Human Behavior, policy, privacy, Privacy Policies, pubcrawl, readability, Scalability, user experience

Online privacy policies inform users of a Website how their personal information is collected, processed, and stored. Against the background of rising privacy concerns, privacy policies appear to be an influential instrument for increasing customer trust and loyalty. In practice, however, consumers seem to read privacy policies only rarely, possibly reflecting the common assumption that policies are hard to comprehend. By designing and implementing an automated extraction and readability analysis toolset that embodies a diversity of established readability measures, we present the first large-scale study that provides current empirical evidence on the readability of nearly 50,000 privacy policies of popular English-speaking Websites. The results empirically confirm that, on average, current privacy policies are still hard to read. Furthermore, this study presents new theoretical insights for readability research, in particular the extent to which practical readability measures are correlated. Specifically, it shows the redundancy of several well-established readability metrics such as SMOG, RIX, LIX, GFI, FKG, ARI, and FRES, thus easing the future choice of metrics and comparisons between readability studies, and calling for research towards a framework of readability measures. Moreover, a more sophisticated privacy policy extractor and analyzer as well as a solid policy text corpus for further research are provided.
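
As an illustration of the kind of measures named in the abstract, the following minimal Python sketch computes three of them (LIX, RIX, and ARI) from their commonly cited formulas. It is not the authors' toolset: the function names, the regex-based tokenization, and the sentence splitting on ".", "!" and "?" are simplifying assumptions made here for illustration only. The three metrics were chosen because they need no syllable counting.

```python
import re


def text_stats(text):
    """Return (sentences, words, characters, long_words) for a text.

    Assumptions (illustrative, not from the paper): sentences end at
    '.', '!' or '?'; words are runs of ASCII letters and digits; a
    "long word" has more than six characters.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    chars = sum(len(w) for w in words)
    long_words = sum(1 for w in words if len(w) > 6)
    return len(sentences), len(words), chars, long_words


def lix(text):
    """LIX: average sentence length plus percentage of long words."""
    s, w, _, lw = text_stats(text)
    return w / s + 100.0 * lw / w


def rix(text):
    """RIX: long words per sentence."""
    s, _, _, lw = text_stats(text)
    return lw / s


def ari(text):
    """Automated Readability Index from characters, words, and sentences."""
    s, w, c, _ = text_stats(text)
    return 4.71 * (c / w) + 0.5 * (w / s) - 21.43


sample = "We collect personal information. We store it securely."
print(lix(sample))  # 54.0
print(rix(sample))  # 2.0
print(ari(sample))
```

All three scores are built from the same few counts (sentence length, word length, share of long words), which gives some intuition for why such metrics can turn out to be strongly correlated on a large corpus.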

Citation Key: fabian_large-scale_2017