Zhang, Feng; Zhai, Jidong; Shen, Xipeng; Mutlu, Onur; Chen, Wenguang. 2018. Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data. Proceedings of the 2018 International Conference on Supercomputing, pp. 195-206.
Today's rapidly growing document volumes pose pressing challenges to modern document analytics frameworks, in both space usage and processing time. Recently, a promising method, called text analytics directly on compressed data (TADOC), was proposed for improving both the time and space efficiency of text analytics. The main idea of the technique is to enable direct document analytics on compressed data. This paper focuses on the programming challenges for developing efficient TADOC programs. It presents Zwift, the first programming framework for TADOC, which consists of a Domain Specific Language, a compiler and runtime, and a utility library. Experiments show that Zwift significantly improves programming productivity, while effectively unleashing the power of TADOC, producing code that reduces storage usage by 90.8% and execution time by 41.0% on six text analytics problems.
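
To make the idea of analytics directly on compressed data concrete, here is a minimal Python sketch of a word count computed without decompressing the text. It rests on assumptions the abstract does not state. The compressed form is taken to be a small grammar in which repeated phrases are stored once as rules, and the `grammar` layout and the `word_counts` and `expand` helpers are hypothetical names for illustration only. This is not Zwift's DSL, compiler, or library API.

```python
from collections import Counter

# Hypothetical compressed document: each rule is a list of symbols.
# A plain string is a word; a tuple ("R", n) is a reference to rule n.
# Rule 0 is assumed to be the root that represents the whole text.
grammar = {
    0: [("R", 1), "and", ("R", 1), "again"],
    1: [("R", 2), "on", ("R", 2)],
    2: ["text", "analytics"],
}


def word_counts(grammar, root=0):
    """Count words by visiting each rule once, reusing per-rule results."""
    memo = {}

    def counts(rule_id):
        if rule_id in memo:
            return memo[rule_id]
        total = Counter()
        for symbol in grammar[rule_id]:
            if isinstance(symbol, tuple):
                # Add the cached counts of the referenced rule.
                total.update(counts(symbol[1]))
            else:
                total[symbol] += 1
        memo[rule_id] = total
        return total

    return counts(root)


def expand(grammar, rule_id=0):
    """Fully decompress a rule into its word list (used only as a check)."""
    words = []
    for symbol in grammar[rule_id]:
        if isinstance(symbol, tuple):
            words.extend(expand(grammar, symbol[1]))
        else:
            words.append(symbol)
    return words


compressed_result = word_counts(grammar)
baseline_result = Counter(expand(grammar))
print(compressed_result == baseline_result)
```

In this sketch each rule is processed once no matter how often it is referenced, which is the intuition behind saving both space and time. The abstract's reported savings come from the authors' own system and experiments, not from this toy example.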