Deep-LIFT: Deep Label-Specific Feature Learning for Image Annotation

      

ABSTRACT :

Image annotation aims to jointly predict multiple tags for an image. Although significant progress has been achieved, existing approaches usually overlook the alignment between specific labels and their corresponding regions because the supervision is weak (i.e., a "bag of labels" for all regions), and thus fail to explicitly exploit the discriminative information of different classes. In this article, we propose the deep label-specific feature (Deep-LIFT) learning model to build an explicit and exact correspondence between each label and its local visual region, which improves the effectiveness of feature learning and enhances the interpretability of the model itself. Deep-LIFT extracts features for each label by aligning the label with its region. Specifically, Deep-LIFTs are obtained by learning multiple correlation maps between image convolutional features and label embeddings. Moreover, we construct two variant graph convolutional networks (GCNs) to further capture the interdependency among labels. Empirical studies on benchmark datasets validate that the proposed model achieves superior performance on multilabel classification over existing state-of-the-art methods.
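
The idea of learning correlation maps between convolutional features and label embeddings can be illustrated with a minimal sketch. It assumes an attention-style formulation that the abstract does not spell out: each correlation map is a softmax over dot products between a label embedding and the region vectors, and the label-specific feature is the map-weighted sum of the regions. The function names, the toy vectors, and the choice of dot product plus softmax are illustrative assumptions, not the authors' exact method.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def label_specific_features(regions, label_embeddings):
    """Hypothetical helper (not from the paper).

    regions: N region vectors (a flattened H*W grid), each of dimension D.
    label_embeddings: C label vectors, each of dimension D.
    Returns C correlation maps (one weight per region) and
    C pooled label-specific feature vectors of dimension D.
    """
    maps, features = [], []
    for emb in label_embeddings:
        # Correlation of this label with every region, normalised over regions.
        weights = softmax([dot(r, emb) for r in regions])
        # Label-specific feature: regions averaged with the map as weights.
        pooled = [sum(w * r[d] for w, r in zip(weights, regions))
                  for d in range(len(emb))]
        maps.append(weights)
        features.append(pooled)
    return maps, features

# Toy input: 4 regions with 2-D features, 2 labels with 2-D embeddings.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
label_embeddings = [[4.0, 0.0], [0.0, 4.0]]
maps, feats = label_specific_features(regions, label_embeddings)
```

In this toy case the first label attends mostly to the first region and the second label to the second region, which is the kind of label-to-region alignment the abstract describes.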

EXISTING SYSTEM :

• Existing methods on image annotation focus on solving the multi-label classification problem.
• However, one fundamental limitation of these existing approaches is that the original training labels are orderless whilst the RNN model requires a certain output label order for training.
• Although multi-scale representation learning has never been attempted for image annotations, there are existing efforts on designing CNN architectures that enable multi-scale feature fusion.
• Compared to existing models, the main difference is the multi-scale feature learning architecture designed to extract and fuse features at different layers suitable for representing visual concepts of different levels of abstraction.

DISADVANTAGE :

• The multi-label recognition problem can be transformed into a set of binary classification tasks and equipped with powerful feature representations learned by deep Convolutional Neural Networks (CNNs) from raw images.
• Compared to predicting a single class label for an image, the multi-label annotation problem is more difficult due to the combinatorial nature of the output label space.
• These models ignore the variable label number problem and simply predict the top k most probable labels per image.
• These solutions treat the image annotation problem as an image-to-text translation problem and solve it using an encoder-decoder model.
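
The variable label number problem mentioned above can be seen in a small sketch. It assumes per-label logits from independent binary classifiers and compares fixed top-k selection with sigmoid thresholding; the label names, logits, and the 0.5 threshold are made up for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_top_k(logits, k):
    # Always returns exactly k label indices, however confident the model is.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return sorted(ranked[:k])

def predict_threshold(logits, threshold=0.5):
    # Returns a variable number of labels: every label whose
    # independent binary probability reaches the threshold.
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]

# Hypothetical labels and logits for one image.
labels = ["sky", "beach", "dog", "car", "tree"]
logits = [3.2, 2.1, -1.5, -2.0, -0.3]

top3 = [labels[i] for i in predict_top_k(logits, 3)]
thresholded = [labels[i] for i in predict_threshold(logits)]
```

With k = 3 the top-k rule is forced to emit "tree" even though its probability is below 0.5, whereas thresholding returns only the two confident labels.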

PROPOSED SYSTEM :

• A novel multi-scale deep CNN architecture is proposed which is capable of effectively extracting and fusing features at different scales corresponding to visual concepts of different levels of abstraction.
• A multi-scale CNN sub-network is proposed to extract visual features from raw image pixels, and a multi-layer perceptron sub-network is applied to extract textual features from noisy user-provided tags.
• We have proposed a novel multi-modal multi-scale deep learning model for large-scale image annotation.
• Extensive experiments are carried out to demonstrate that the proposed model outperforms the state-of-the-art methods.
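
One simple way to fuse features from different layers is sketched below. The text does not state which fusion operator is used, so this sketch assumes global average pooling of each feature map followed by concatenation; the function names and toy feature maps are hypothetical.

```python
def global_average_pool(feature_map):
    """feature_map: list of channels, each a 2-D grid (list of rows).
    Returns one averaged value per channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

def fuse_multi_scale(feature_maps):
    # Pool every scale to a fixed-length vector, then concatenate, so
    # maps with different spatial sizes can be combined into one descriptor.
    fused = []
    for fmap in feature_maps:
        fused.extend(global_average_pool(fmap))
    return fused

# Toy maps: a shallow layer (2 channels, 4x4) and a deep layer (3 channels, 2x2).
shallow = [
    [[1.0] * 4 for _ in range(4)],
    [[0.0, 2.0, 0.0, 2.0] for _ in range(4)],
]
deep = [
    [[3.0, 5.0], [7.0, 9.0]],
    [[0.0, 0.0], [0.0, 4.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
fused = fuse_multi_scale([shallow, deep])
```

The fused vector has one entry per channel across all scales (five here), regardless of the spatial resolution of each layer.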

ADVANTAGE :

• These metrics evaluate the performance of a multi-label predictor from diverse aspects.
• Moreover, the attention mechanism has been proven to be beneficial for improving the performance of multi-label classification.
• Our Deep-LIFT is able to simultaneously learn label-specific feature representations and complex label correlations, leading to promising performance improvement in image annotation.
• It is well recognized that capturing the correlations among different labels can improve multi-label image annotation performance.
• We tune the parameters of all compared methods as suggested in their respective papers to obtain their best performance.
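
How a graph convolution lets correlated labels share information can be sketched with one propagation step. This assumes the standard GCN rule H' = ReLU(Â H W) with Â = D^(-1/2) (A + I) D^(-1/2), not the two GCN variants of Deep-LIFT, whose details are not given here. The co-occurrence graph, label features, and weight matrix are toy values.

```python
import math

def normalize_adjacency(adj):
    # Add self-loops, then apply symmetric degree normalisation.
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(adj_norm, features, weight):
    # One propagation step: mix neighbouring label features, project, ReLU.
    out = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy label graph: labels 0 and 1 co-occur, label 2 is isolated.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one row per label
weight = [[1.0, 0.0], [0.0, 1.0]]                # identity, for readability

updated = gcn_layer(normalize_adjacency(adj), features, weight)
```

After one step the two connected labels end up with the same averaged representation, while the isolated label keeps its own features.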
