Visible to the public Biblio

Filters: Author is Qi, H.  [Clear All Filters]
2018-06-11
Yang, C., Li, Z., Qu, W., Liu, Z., Qi, H..  2017.  Grid-Based Indexing and Search Algorithms for Large-Scale and High-Dimensional Data. 2017 14th International Symposium on Pervasive Systems, Algorithms and Networks 2017 11th International Conference on Frontier of Computer Science and Technology 2017 Third International Symposium of Creative Computing (ISPAN-FCST-ISCC). :46–51.

The rapid development of Internet has resulted in massive information overloading recently. These information is usually represented by high-dimensional feature vectors in many related applications such as recognition, classification and retrieval. These applications usually need efficient indexing and search methods for such large-scale and high-dimensional database, which typically is a challenging task. Some efforts have been made and solved this problem to some extent. However, most of them are implemented in a single machine, which is not suitable to handle large-scale database.In this paper, we present a novel data index structure and nearest neighbor search algorithm implemented on Apache Spark. We impose a grid on the database and index data by non-empty grid cells. This grid-based index structure is simple and easy to be implemented in parallel. Moreover, we propose to build a scalable KNN graph on the grids, which increase the efficiency of this index structure by a low cost in parallel implementation. Finally, experiments are conducted in both public databases and synthetic databases, showing that the proposed methods achieve overall high performance in both efficiency and accuracy.

2020-12-01
Li, W., Guo, D., Li, K., Qi, H., Zhang, J..  2018.  iDaaS: Inter-Datacenter Network as a Service. IEEE Transactions on Parallel and Distributed Systems. 29:1515—1529.

Increasing number of Internet-scale applications, such as video streaming, incur huge amount of wide area traffic. Such traffic over the unreliable Internet without bandwidth guarantee suffers unpredictable network performance. This result, however, is unappealing to the application providers. Fortunately, Internet giants like Google and Microsoft are increasingly deploying their private wide area networks (WANs) to connect their global datacenters. Such high-speed private WANs are reliable, and can provide predictable network performance. In this paper, we propose a new type of service-inter-datacenter network as a service (iDaaS), where traditional application providers can reserve bandwidth from those Internet giants to guarantee their wide area traffic. Specifically, we design a bandwidth trading market among multiple iDaaS providers and application providers, and concentrate on the essential bandwidth pricing problem. The involved challenging issue is that the bandwidth price of each iDaaS provider is not only influenced by other iDaaS providers, but also affected by the application providers. To address this issue, we characterize the interaction between iDaaS providers and application providers using a Stackelberg game model, and analyze the existence and uniqueness of the equilibrium. We further present an efficient bandwidth pricing algorithm by blending the advantage of a geometrical Nash bargaining solution and the demand segmentation method. For comparison, we present two bandwidth reservation algorithms, where each iDaaS provider's bandwidth is reserved in a weighted fair manner and a max-min fair manner, respectively. Finally, we conduct comprehensive trace-driven experiments. The evaluation results show that our proposed algorithms not only ensure the revenue of iDaaS providers, but also provide bandwidth guarantee for application providers with lower bandwidth price per unit.

2021-01-15
Li, Y., Yang, X., Sun, P., Qi, H., Lyu, S..  2020.  Celeb-DF: A Large-Scale Challenging Dataset for DeepFake Forensics. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :3204—3213.
AI-synthesized face-swapping videos, commonly known as DeepFakes, is an emerging problem threatening the trustworthiness of online information. The need to develop and evaluate DeepFake detection algorithms calls for datasets of DeepFake videos. However, current DeepFake datasets suffer from low visual quality and do not resemble DeepFake videos circulated on the Internet. We present a new large-scale challenging DeepFake video dataset, Celeb-DF, which contains 5,639 high-quality DeepFake videos of celebrities generated using improved synthesis process. We conduct a comprehensive evaluation of DeepFake detection methods and datasets to demonstrate the escalated level of challenges posed by Celeb-DF.