Design and implementation of HDFS data encryption scheme using ARIA algorithm on Hadoop

Title: Design and implementation of HDFS data encryption scheme using ARIA algorithm on Hadoop
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Song, Youngho; Shin, Young-sung; Jang, Miyoung; Chang, Jae-Woo
Conference Name: 2017 IEEE International Conference on Big Data and Smart Computing (BigComp)
ISBN Number: 978-1-5090-3015-6
Keywords: AES algorithms, AES international standard data encryption algorithm, Algorithm design and analysis, ARIA algorithm, ARIA Encryption Algorithm, ARIA/AES encryption, Big Data, Codecs, composability, cryptography, cyber physical systems, data block, data encryption, Data processing, distributed data processing platform, Distributed databases, domestic usages, efficient encryption, Encryption, Hadoop distributed computing, Hadoop Encryption Codec, Hadoop Security, HDFS block-splitting component, HDFS Data Encryption, HDFS data encryption scheme, hierarchical clustering, k-means, Korean government, pubcrawl, resilience, Resiliency, Sorting, Standards, variable-length data processing, word counting

Hadoop was developed as a distributed data processing platform for analyzing big data. Enterprises can use Hadoop to analyze big data containing users' sensitive information and apply the results to their marketing. Consequently, data encryption has been widely studied as a way to prevent leakage of sensitive data stored in Hadoop. However, existing work supports only AES, the international standard data encryption algorithm. Meanwhile, the Korean government has selected the ARIA algorithm as its standard data encryption scheme for domestic use. In this paper, we propose an HDFS data encryption scheme that supports both the ARIA and AES algorithms on Hadoop. First, the proposed scheme provides an HDFS block-splitting component that performs ARIA/AES encryption and decryption in the Hadoop distributed computing environment. Second, it provides a variable-length data processing component that performs encryption and decryption by adding dummy data when the last data block does not contain a full 128 bits of data. Finally, our performance analysis shows that the proposed scheme is efficient for various applications, such as word counting, sorting, k-means, and hierarchical clustering.
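The variable-length handling described in the abstract can be illustrated with a small sketch. The abstract does not specify how the dummy data is laid out, so the example below assumes PKCS#7-style padding, a common convention for 128-bit block ciphers, and should not be read as the authors' scheme. The names `BLOCK_SIZE`, `pad`, `unpad`, and `split_blocks` are hypothetical, and no actual ARIA or AES encryption is performed.

```python
BLOCK_SIZE = 16  # 128 bits, the block size used by both ARIA and AES


def pad(data: bytes) -> bytes:
    """Append 1 to 16 dummy bytes so the length is a multiple of BLOCK_SIZE.

    Assumption: PKCS#7-style padding, where each dummy byte stores the
    number of dummy bytes added. The paper's own layout may differ.
    """
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n


def unpad(padded: bytes) -> bytes:
    """Strip the dummy bytes added by pad(), validating them first."""
    if not padded or len(padded) % BLOCK_SIZE != 0:
        raise ValueError("input is not a whole number of 128-bit blocks")
    n = padded[-1]
    if n < 1 or n > BLOCK_SIZE or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


def split_blocks(padded: bytes) -> list:
    """Split padded data into 128-bit blocks ready for a block cipher."""
    return [padded[i:i + BLOCK_SIZE] for i in range(0, len(padded), BLOCK_SIZE)]
```

With this convention, input that is already a multiple of 128 bits still gains one full block of dummy data, which is what lets `unpad` recover the original length unambiguously after decryption.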

Citation Key: song_design_2017