Efficient and Effective Multi-Modal Queries through Heterogeneous Network Embedding

Abstract

The heterogeneity of today’s Web sources requires information retrieval (IR) systems to handle multi-modal queries. Such queries express a user’s information needs through different data modalities, such as keywords, hashtags, user profiles, and other media. Recent IR systems answer a multi-modal query by treating it as a set of separate uni-modal queries. However, depending on the chosen operationalisation, such an approach is either inefficient or ineffective: it requires multiple passes over the data, or it leads to inaccuracies because the relations between data modalities are neglected in the relevance assessment. To mitigate these challenges, we present an IR system that has been designed to answer genuine multi-modal queries. It relies on a heterogeneous network embedding, so that features from diverse modalities can be incorporated when representing both a query and the data over which it shall be evaluated. By embedding a query and the data in the same vector space, the relations across modalities are made explicit and exploited for more accurate query evaluation. At the same time, multi-modal queries are answered with a single pass over the data. An experimental evaluation using diverse real-world and synthetic datasets illustrates that our approach returns twice the amount of relevant information compared to baseline techniques, while scaling to large multi-modal databases.
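The single-pass idea in the abstract can be illustrated with a minimal sketch. It assumes that embeddings for features and tuples have already been learned and live in one shared vector space; the toy vectors, the averaging of feature embeddings to form the query vector, the cosine scoring, and the names `embed_query` and `top_k` are all illustrative assumptions, not the system's actual method.

```python
import numpy as np

def embed_query(feature_ids, feature_vecs):
    """Represent a multi-modal query as the mean of its feature embeddings.

    Averaging is an assumption made for this sketch only.
    """
    return feature_vecs[feature_ids].mean(axis=0)

def top_k(query_vec, tuple_vecs, k):
    """Rank all tuples by cosine similarity in one pass over the data."""
    q = query_vec / np.linalg.norm(query_vec)
    t = tuple_vecs / np.linalg.norm(tuple_vecs, axis=1, keepdims=True)
    scores = t @ q                      # a single matrix-vector product
    order = np.argsort(-scores)[:k]     # indices of the k best tuples
    return order, scores[order]

# Hypothetical embeddings: row 0 = a keyword, row 1 = a hashtag, row 2 = a user.
feature_vecs = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0],
                         [0.0, 0.0, 1.0]])
# Hypothetical embeddings of three data tuples in the same space.
tuple_vecs = np.array([[0.9, 0.8, 0.1],
                       [0.0, 0.1, 1.0],
                       [1.0, 0.0, 0.0]])

query_vec = embed_query([0, 1], feature_vecs)   # query = keyword + hashtag
ranked, scores = top_k(query_vec, tuple_vecs, k=2)
print(ranked, scores)
```

Because the keyword and the hashtag are folded into one query vector, a tuple that matches both modalities ranks above a tuple that matches only one, and no per-modality pass is needed.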

Existing System

• In this work, we aim to provide a unified framework to deeply summarize and evaluate existing research on heterogeneous network embedding (HNE), which includes but goes beyond a normal survey.
• Particularly, based on a uniform taxonomy that clearly categorizes and summarizes the existing models (and likely future models), we propose a generic objective function of network smoothness, and reformulate all existing models into this uniform paradigm while highlighting their individual novel contributions.
• We envision this paradigm to be helpful in guiding the development of future novel HNE algorithms, and in the meantime facilitate their conceptual contrast towards existing ones.
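The text above does not spell out the generic smoothness objective. One common way to write such an objective is sketched below; the symbols are assumptions introduced for illustration: $V$ is the node set, $e_u$ the embedding of node $u$, $w_{uv}$ a proximity weight between nodes $u$ and $v$, $d(\cdot,\cdot)$ a distance function in the embedding space, and $\mathcal{J}_R$ any additional terms such as regularizers.

```latex
\mathcal{J} \;=\; \sum_{u, v \in V} w_{uv}\, d(e_u, e_v) \;+\; \mathcal{J}_R
```

Minimizing the first term pulls the embeddings of strongly connected nodes together, which is how the homophily assumption enters the objective. Under this reading, individual models differ mainly in how they choose $w_{uv}$, $d$, and $\mathcal{J}_R$.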

Disadvantages

• We consider the problem of identifying tuples that are most relevant for a multi-modal query.
• We evaluated the proposed approach with a set of diverse real-world and synthetic datasets.
• Our techniques turn out to be both efficient, scaling linearly to hundreds of thousands of data elements, and effective, retrieving twice the amount of relevant tuples compared to baseline techniques.
• The remainder of the paper is organized as follows. Next, we motivate and formulate the addressed problem in §2. Then, §3 gives an overview of our approach.

Proposed System

• To capture and exploit such node and link heterogeneity, heterogeneous networks have been proposed and widely used in many real-world network mining scenarios, such as meta-path based similarity search, node classification and clustering, knowledge base completion, and recommendations.
• In this work, we stress that one of the most important principles underlying HNE (as well as most other scenarios of network modeling and mining) is homophily.
• An incompatibility measure is proposed to select appropriate aspects for embedding learn

Advantages

• Due to the need to pass over the data multiple times, these systems show performance issues.
• Other systems employ early fusion, which computes different representations for the modalities and embeds the multi-modal queries into these vector spaces.
• This dataset is intended to evaluate the efficiency of our approach to network embedding that employs a partitioning scheme, as proposed in §6.
• We evaluate efficiency in terms of the retrieval time needed to answer a multi-modal query, and the training time required to construct embeddings.
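The multi-pass behaviour mentioned in the first point can be made concrete with a small sketch of a baseline that answers one uni-modal query per modality and then merges the scores. The per-modality vectors, the dot-product scoring, the summation of scores, and the name `multi_pass_scores` are hypothetical choices for illustration and are not taken from any of the systems discussed.

```python
import numpy as np

def multi_pass_scores(query_by_modality, data_by_modality):
    """Score the data once per modality, then sum the per-modality scores.

    Returns the merged scores and the number of passes over the data.
    """
    total = None
    passes = 0
    for modality, q in query_by_modality.items():
        vecs = data_by_modality[modality]   # a full pass for this modality
        scores = vecs @ q
        total = scores if total is None else total + scores
        passes += 1
    return total, passes

# Hypothetical per-modality representations of two data tuples.
data_by_modality = {
    "keyword": np.array([[1.0, 0.0], [0.0, 1.0]]),
    "hashtag": np.array([[0.0, 1.0], [1.0, 0.0]]),
}
# Hypothetical uni-modal query vectors, one per modality.
query_by_modality = {
    "keyword": np.array([1.0, 0.0]),
    "hashtag": np.array([1.0, 0.0]),
}

merged, passes = multi_pass_scores(query_by_modality, data_by_modality)
print(merged, passes)
```

The number of passes grows with the number of modalities in the query, and each modality is scored in isolation, so relations between modalities never influence the ranking.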
