Making the Pedigree to Your Big Data Repository: Innovative Methods, Solutions, and Algorithms for Supporting Big Data Privacy in Distributed Settings via Data-Driven Paradigms

Title: Making the Pedigree to Your Big Data Repository: Innovative Methods, Solutions, and Algorithms for Supporting Big Data Privacy in Distributed Settings via Data-Driven Paradigms
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Cuzzocrea, Alfredo; Damiani, Ernesto
Conference Name: 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC)
Keywords: Big Data, big data privacy, cloud computing, cloud settings, data aggregation, Data analysis, Data models, data privacy, data-driven aggregate-provenance privacy-preserving big multidimensional data, Distributed databases, distributed settings, distributed-big-data, DRIPROM framework, human factors, Metrics, privacy-of-big-data, Proposals, protocol checking, Protocols, pubcrawl, Resiliency, Scalability, security of data, security-inspired protocols
Abstract: Starting from our previous research where we introduced a general framework for supporting data-driven privacy-preserving big data management in distributed environments, such as emerging Cloud settings, in this paper we further and significantly extend our past research contributions, and provide several novel contributions that complement our previous work in the investigated research field. Our proposed framework can be viewed as an alternative to classical approaches where the privacy of big data is ensured via security-inspired protocols that check several (protocol) layers in order to achieve the desired privacy. Unfortunately, this injects considerable computational overheads in the overall process, thus introducing relevant challenges to be considered. Our approach instead tries to recognize the “pedigree” of suitable summary data representatives computed on top of the target big data repositories, hence avoiding computational overheads due to protocol checking. We also provide a relevant realization of the framework above, the so-called Data-dRIven aggregate-PROvenance privacy-preserving big Multidimensional data (DRIPROM) framework, which specifically considers multidimensional data as the case of interest. Extensions and discussion on main motivations and principles of our proposed research, two relevant case studies that clearly state the need-for and covered (related) properties of supporting privacy-preserving management and analytics of big data in modern distributed systems, and an experimental assessment and analysis of our proposed DRIPROM framework are the major results of this paper.
DOI: 10.1109/COMPSAC.2019.10257
Citation Key: cuzzocrea_making_2019