Kubestorage: A Cloud Native Storage Engine for Massive Small Files

Title: Kubestorage: A Cloud Native Storage Engine for Massive Small Files
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Liu, F., Li, J., Wang, Y., Li, L.
Conference Name: 2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC)
Date Published: October
Keywords: application program interfaces, cloud computing, Cloud Native, Cloud Native applications, Cloud Native infrastructure, cloud native storage engine, cloud platform, cloud storage, compositionality, computing instances, Container Orchestration, Docker, emerging computing infrastructure, file system store metadata, Haystack, Haystack storage engine, high-frequency file storage needs, information retrieval, Kubernetes, large frequency file storage needs, meta data, Metadata Discovery Problem, network storage server, Object Storage, object storage model, Operating systems, orchestration system, pubcrawl, resilience, Resiliency, retrieving files, Scalability, storage management, traditional file system, traditional object storage solution
Abstract: Cloud Native, an emerging computing infrastructure, has become a new trend in cloud computing, especially since the development of containerization technologies such as Docker and LXD and of orchestration systems for them such as Kubernetes and Swarm. With the growing popularity of Cloud Native, the following problems have arisen: (i) Most Cloud Native applications were designed to make full use of the cloud platform, but their file storage has not been fully optimized for it. (ii) The traditional file system is designed as a utility for storing and retrieving files and is usually built into the operating system kernel. Placed under large-scale conditions, such as a network storage server that is shared by thousands of computing instances and stores millions of files, it becomes slow and even unstable. (iii) Most storage solutions use metadata to track files faster, but the metadata itself takes up a lot of space, and the capacity available for it is usually limited. If the file system stores metadata directly on the hard disk without caching, tracking massive numbers of small files becomes much slower. (iv) Traditional object storage solutions lack features, such as caching and automatic replication, that would make them more practical in the cloud. This paper proposes a new storage engine based on the well-known Haystack storage engine, optimized for service discovery and automated fault tolerance to make it more suitable for Cloud Native infrastructure, deployment, and applications. We use the object storage model to meet large-scale, high-frequency file storage needs, offering a simple and unified set of APIs for applications to access. We also take advantage of Kubernetes' sophisticated and automated toolchains to make cloud storage easier to deploy, more flexible to scale, and more stable to run.
Citation Key: liu_kubestorage_2019
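
Illustrative note: the abstract builds on the Haystack idea of packing many small files into one large volume file and keeping the per-file metadata in memory, so that a read costs one lookup and one seek. The following is a minimal sketch of that general idea only, not the paper's implementation. The class name `Volume`, its methods, and the file name `vol.dat` are hypothetical, and the sketch leaves out the service discovery, replication, and fault tolerance that the paper addresses.

```python
import os
import tempfile


class Volume:
    """Append-only volume file with an in-memory index (hypothetical sketch)."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (offset, size); metadata cached in RAM
        # Create the volume file if it does not exist yet.
        open(path, "ab").close()

    def put(self, key, data):
        # Append the payload to the end of the volume and record its location.
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        self.index[key] = (offset, len(data))

    def get(self, key):
        # One dictionary lookup, then a single seek and read on the volume.
        offset, size = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        vol = Volume(os.path.join(d, "vol.dat"))
        vol.put("a.txt", b"first tiny file")
        vol.put("b.txt", b"second tiny file")
        # Both objects live in one file on disk, so the underlying file
        # system tracks a single inode instead of one per small file.
        print(vol.get("b.txt"))
```

In this sketch the index is lost when the process exits; a real engine would need to persist or rebuild it, which is outside the scope of this illustration.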