Fast Cross-Modal Hashing With Global and Local Similarity Embedding
ABSTRACT:
Recently, supervised cross-modal hashing has attracted much attention and achieved promising performance. To learn hash functions and binary codes, most methods exploit the supervised information globally, for example, by preserving pairwise similarity in hash codes or by reconstructing the label matrix with binary codes. However, due to the hardness of the discrete optimization problem, they are usually time-consuming on large-scale datasets. In addition, they neglect the class correlation in the supervised information. Furthermore, they explore only the global similarity of the data and overlook the local similarity hidden in the data distribution. To address these issues, we present an efficient supervised cross-modal hashing method, namely fast cross-modal hashing (FCMH). It leverages not only global similarity information but also the local similarity within groups. Specifically, the training samples are partitioned into groups, and the local similarity within each group is then extracted. Moreover, the class correlation in the labels is exploited and embedded into the learning of the binary codes. To solve the discrete optimization problem, we further propose an efficient discrete optimization algorithm with a well-designed group updating scheme, making its computational complexity linear in the size of the training set; FCMH is therefore more efficient and scalable to large-scale datasets. Extensive experiments on three benchmark datasets demonstrate that FCMH outperforms several state-of-the-art cross-modal hashing approaches in terms of both retrieval accuracy and learning efficiency.
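As a concrete illustration of the global-plus-local similarity idea described above, the following Python fragment builds a label-based global similarity and a block-diagonal local similarity computed only within groups of training samples. The paper's exact FCMH formulation is not reproduced here; the use of k-means for grouping, cosine similarity within groups, and all function names are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans

def global_similarity(labels):
    # Global semantic similarity: S_ij = 1 if two samples share any class.
    # labels: (n, c) binary label matrix.
    return (labels @ labels.T > 0).astype(np.float32)

def local_group_similarity(features, n_groups=10, seed=0):
    # Partition samples into groups, then compute similarity only within
    # each group, yielding a block-diagonal local similarity matrix.
    n = features.shape[0]
    groups = KMeans(n_clusters=n_groups, n_init=10,
                    random_state=seed).fit_predict(features)
    normed = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    S_local = np.zeros((n, n), dtype=np.float32)
    for g in range(n_groups):
        idx = np.where(groups == g)[0]
        S_local[np.ix_(idx, idx)] = normed[idx] @ normed[idx].T  # cosine
    return S_local, groups

Restricting similarity computation to within-group blocks is what keeps the cost roughly linear in the training-set size, since each block is small.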
EXISTING SYSTEM:
• Most existing real-valued cross-modal retrieval techniques rely on brute-force linear search, which is time-consuming on large-scale data (see the sketch after this list).
• Existing hashing methods can be categorized into uni-modal hashing, multi-view hashing, and cross-modal hashing.
• The hash functions learned by most existing cross-modal hashing methods are linear; to capture more complex structure in multimodal data, nonlinear hash function learning has recently been studied.
• Unlike most existing cross-modality similarity learning approaches, the hash functions are not limited to linear projections.
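To make the contrast with brute-force linear search concrete, below is a minimal Python sketch of real-valued search versus Hamming ranking over packed binary codes; the function names and the packed uint8 code layout are assumptions for illustration, not part of any particular method.

import numpy as np

def linear_search(query, database):
    # Brute-force real-valued search: one Euclidean distance per item,
    # O(n * d) floating-point work per query.
    dist = np.linalg.norm(database - query, axis=1)
    return np.argsort(dist)

def hamming_search(query_code, db_codes):
    # Hamming ranking over binary codes packed into uint8 arrays
    # (shape (n, n_bits // 8)); XOR plus bit counting makes each
    # comparison a handful of cheap integer operations.
    xor = np.bitwise_xor(db_codes, query_code)
    dist = np.unpackbits(xor, axis=1).sum(axis=1)
    return np.argsort(dist, kind="stable")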
DISADVANTAGE:
• Directly preserving the manifold structure of the data in hash codes remains an open problem; this paper proposes a novel unsupervised hashing method to cope with it.
• Unsupervised multimodal hashing generally needs to solve two basic problems: how to preserve the geometric structure among data points with hash codes, and how to simultaneously select discriminative features for multiple modalities.
• Although many unsupervised hashing methods have been developed, the above problems are not well addressed simultaneously.
• Fortunately, unsupervised cross-modal hashing methods can handle this problem effectively.
PROPOSED SYSTEM:
• To speed up cross-modal retrieval, a number of binary representation learning methods have been proposed to map different modalities of data into a common Hamming space.
• The proposed method is supervised, and the correlation between the two modalities can be built from their shared ground-truth probability vectors.
• Furthermore, a sequential learning method (SCM-Seq) is proposed to learn the hash functions bit by bit without imposing orthogonality constraints (see the sketch after this list).
• Accordingly, a new iterative algorithm is proposed to solve the modified objective function, and a proof of its convergence is given.
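The following is a hedged Python sketch of sequential, bit-by-bit hash learning in the spirit of SCM-Seq: each new bit is fit to the residual similarity left unexplained by the previous bits. It is a simplification under stated assumptions (a sign relaxation solved by eigen-decomposition of the residual), not the paper's exact solver, and all names are illustrative.

import numpy as np

def sequential_hash_learning(X, S, n_bits):
    # X: (n, d) feature matrix; S: (n, n) symmetric similarity in [-1, 1].
    # Returns W of shape (d, n_bits) so that sign(X @ W) approximates S.
    n, d = X.shape
    W = np.zeros((d, n_bits))
    residual = n_bits * S                      # total similarity to explain
    for b in range(n_bits):
        # Under a sign relaxation, the top eigenvector of X^T R X gives
        # the best rank-one fit of the current residual R.
        M = X.T @ residual @ X
        _, vecs = np.linalg.eigh(M)
        w = vecs[:, -1]                        # eigenvector of the largest eigenvalue
        W[:, b] = w
        h = np.sign(X @ w)                     # codes produced by this bit
        residual = residual - np.outer(h, h)   # subtract the explained part
    return W

Because each bit only needs the residual from its predecessors, no orthogonality constraint across bits has to be imposed explicitly.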
ADVANTAGE:
• Our model jointly performs multi-modal graph embedding and discriminative feature learning, which further improves performance.
• Semi-supervised NMF (CPSNMF) uses a constraint propagation approach to obtain more supervised information, which can greatly improve retrieval performance.
• Although supervised hashing methods have achieved promising performance, they depend heavily on massive amounts of labeled data.
• Our method outperforms all comparison methods in terms of average performance on the two retrieval tasks across all three datasets. As the hash code length increases, retrieval performance on Task 1 and Task 2 improves further (the sketch after this list shows how such numbers are typically computed).
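Retrieval accuracy on such cross-modal tasks (e.g. image-to-text and text-to-image) is commonly reported as mean average precision (mAP) over a Hamming ranking, where a retrieved item counts as relevant if it shares at least one class with the query. The helper below is a minimal sketch of that metric; its name and signature are illustrative assumptions, not the paper's evaluation code.

import numpy as np

def mean_average_precision(query_codes, db_codes, q_labels, db_labels, topk=None):
    # Rank database items by Hamming distance for each query; an item is
    # relevant if it shares at least one class label with the query.
    aps = []
    for qc, ql in zip(query_codes, q_labels):
        dist = np.count_nonzero(db_codes != qc, axis=1)    # Hamming distance
        order = np.argsort(dist, kind="stable")
        rel = (db_labels[order] @ ql) > 0                  # shared class?
        if topk is not None:
            rel = rel[:topk]
        if not rel.any():
            aps.append(0.0)
            continue
        cum = np.cumsum(rel)
        ranks = np.flatnonzero(rel) + 1                    # 1-based rank of each hit
        aps.append(float(np.mean(cum[rel] / ranks)))
    return float(np.mean(aps))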