Securing Big Data in the Age of AI

Title: Securing Big Data in the Age of AI
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Kantarcioglu, Murat; Shaon, Fahad
Conference Name: 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA)
ISBN Number: 978-1-7281-6741-1
Keywords: artificial intelligence, Big Data, Collaboration, composability, compositionality, Data models, data privacy, Data security, Human Behavior, Intelligent Data and Security, Intelligent Data Security, machine learning, Metrics, NoSQL databases, Policy Based Governance, privacy, pubcrawl, relational database security, Resiliency, Scalability, security

Increasingly, organizations are collecting ever larger amounts of data to build complex data analytics, machine learning, and AI models. The data needed to build such models may be unstructured (e.g., text, image, and video), so it may be stored in different data management systems, ranging from relational databases to newer NoSQL databases tailored for unstructured data. Data scientists are also increasingly using programming languages such as Python and R to process data with many existing libraries. In some cases, the developed code is automatically executed by the NoSQL system on the stored data. These developments indicate the need for a data security and privacy solution that can uniformly protect data stored in many different data management systems and enforce security policies even when sensitive data is processed by a complex program submitted by a data scientist. In this paper, we introduce our vision for building such a solution for protecting big data. Specifically, our proposed system allows organizations to 1) enforce policies that control access to sensitive data, 2) automatically keep the audit logs needed for data governance and regulatory compliance, 3) sanitize and redact sensitive data on the fly based on data sensitivity and AI model needs, 4) detect potentially unauthorized or anomalous access to sensitive data, and 5) automatically create attribute-based access control policies based on data sensitivity and data type.
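As a rough illustration of how capabilities 1) through 3) above might fit together, the following Python sketch applies attribute-based rules to a record, redacts fields the caller's role may not see, and appends an audit entry for every field access. This is not the authors' system: the names `Policy`, `Guard`, the `[REDACTED]` marker, the role names, and the default-allow behavior for fields without a policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    # Hypothetical attribute-based rule: a sensitivity label plus the
    # roles that may read the field without redaction.
    sensitivity: str
    allowed_roles: frozenset


@dataclass
class Guard:
    # Maps a field name to its Policy. Fields with no policy are
    # treated as non-sensitive (an assumption of this sketch).
    policies: dict
    audit_log: list = field(default_factory=list)

    def read(self, user: str, role: str, record: dict) -> dict:
        """Return a copy of record with disallowed fields redacted."""
        view = {}
        for name, value in record.items():
            policy = self.policies.get(name)
            allowed = policy is None or role in policy.allowed_roles
            view[name] = value if allowed else "[REDACTED]"
            # One audit entry per field access, whether allowed or redacted.
            decision = "allow" if allowed else "redact"
            self.audit_log.append((user, role, name, decision))
        return view


guard = Guard(policies={"ssn": Policy("high", frozenset({"compliance"}))})
record = {"name": "Alice", "ssn": "123-45-6789"}

analyst_view = guard.read("bob", "analyst", record)
compliance_view = guard.read("carol", "compliance", record)
```

Here the analyst receives the record with `ssn` redacted, the compliance role sees it in the clear, and `guard.audit_log` holds one entry per field read. A real deployment would have to enforce such rules inside or in front of each data management system rather than in application code, which is the harder problem the abstract describes.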

Citation Key: kantarcioglu_securing_2019