Visible to the public Biblio

Filters: Author is Shen, Heng Tao  [Clear All Filters]
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
D
Shen, Fumin, Gao, Xin, Liu, Li, Yang, Yang, Shen, Heng Tao.  2017.  Deep Asymmetric Pairwise Hashing. Proceedings of the 2017 ACM on Multimedia Conference. :1522–1530.
Recently, deep neural networks based hashing methods have greatly improved the multimedia retrieval performance by simultaneously learning feature representations and binary hash functions. Inspired by the latest advance in the asymmetric hashing scheme, in this work, we propose a novel Deep Asymmetric Pairwise Hashing approach (DAPH) for supervised hashing. The core idea is that two deep convolutional models are jointly trained such that their output codes for a pair of images can well reveal the similarity indicated by their semantic labels. A pairwise loss is elaborately designed to preserve the pairwise similarities between images as well as incorporating the independence and balance hash code learning criteria. By taking advantage of the flexibility of asymmetric hash functions, we devise an efficient alternating algorithm to optimize the asymmetric deep hash functions and high-quality binary code jointly. Experiments on three image benchmarks show that DAPH achieves the state-of-the-art performance on large-scale image retrieval.
Xu, Xing, Shen, Fumin, Yang, Yang, Shen, Heng Tao.  2016.  Discriminant Cross-modal Hashing. Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval. :305–308.

Hashing based methods have attracted considerable attention for efficient cross-modal retrieval on large-scale multimedia data. The core problem of cross-modal hashing is how to effectively integrate heterogeneous features from different modalities to learn hash functions using available supervising information, e.g., class labels. Existing hashing based methods generally project heterogeneous features to a common space for hash codes generation, and the supervising information is incrementally used for improving performance. However, these methods may produce ineffective hash codes, due to the failure to explore the discriminative property of supervising information and to effectively bridge the semantic gap between different modalities. To address these challenges, we propose a novel hashing based method in a linear classification framework, in which the proposed method learns modality-specific hash functions for generating unified binary codes, and these binary codes are viewed as representative features for discriminative classification with class labels. An effective optimization algorithm is developed for the proposed method to jointly learn the modality-specific hash function, the unified binary codes and a linear classifier. Extensive experiments on three benchmark datasets highlight the advantage of the proposed method and show that it achieves the state-of-the-art performance.

V
Ouyang, Deqiang, Shao, Jie, Zhang, Yonghui, Yang, Yang, Shen, Heng Tao.  2018.  Video-Based Person Re-Identification via Self-Paced Learning and Deep Reinforcement Learning Framework. Proceedings of the 26th ACM International Conference on Multimedia. :1562–1570.

Person re-identification is an important task in video surveillance, focusing on finding the same person across different cameras. However, most existing methods of video-based person re-identification still have some limitations (e.g., the lack of effective deep learning framework, the robustness of the model, and the same treatment for all video frames) which make them unable to achieve better recognition performance. In this paper, we propose a novel self-paced learning algorithm for video-based person re-identification, which could gradually learn from simple to complex samples for a mature and stable model. Self-paced learning is employed to enhance video-based person re-identification based on deep neural network, so that deep neural network and self-paced learning are unified into one frame. Then, based on the trained self-paced learning, we propose to employ deep reinforcement learning to discard misleading and confounding frames and find the most representative frames from video pairs. With the advantage of deep reinforcement learning, our method can learn strategies to select the optimal frame groups. Experiments show that the proposed framework outperforms the existing methods on the iLIDS-VID, PRID-2011 and MARS datasets.

Z
Yang, Yang, Luo, Yadan, Chen, Weilun, Shen, Fumin, Shao, Jie, Shen, Heng Tao.  2016.  Zero-Shot Hashing via Transferring Supervised Knowledge. Proceedings of the 2016 ACM on Multimedia Conference. :1286–1295.

Hashing has shown its efficiency and effectiveness in facilitating large-scale multimedia applications. Supervised knowledge (\textbackslashemph\e.g.\, semantic labels or pair-wise relationship) associated to data is capable of significantly improving the quality of hash codes and hash functions. However, confronted with the rapid growth of newly-emerging concepts and multimedia data on the Web, existing supervised hashing approaches may easily suffer from the scarcity and validity of supervised information due to the expensive cost of manual labelling. In this paper, we propose a novel hashing scheme, termed \textbackslashemph\zero-shot hashing\ (ZSH), which compresses images of "unseen" categories to binary codes with hash functions learned from limited training data of "seen" categories. Specifically, we project independent data labels (i.e., 0/1-form label vectors) into semantic embedding space, where semantic relationships among all the labels can be precisely characterized and thus seen supervised knowledge can be transferred to unseen classes. Moreover, in order to cope with the semantic shift problem, we rotate the embedded space to more suitably align the embedded semantics with the low-level visual feature space, thereby alleviating the influence of semantic gap. In the meantime, to exert positive effects on learning high-quality hash functions, we further propose to preserve local structural property and discrete nature in binary codes. Besides, we develop an efficient alternating algorithm to solve the ZSH model. Extensive experiments conducted on various real-life datasets show the superior zero-shot image retrieval performance of ZSH as compared to several state-of-the-art hashing methods.