A web knowledge-driven multimodal retrieval method in computational social systems: unsupervised and robust graph convolutional hashing

Duan, Youxiang, Chen, Ning, Bashir, Ali Kashif ORCID: https://orcid.org/0000-0001-7595-2522, Alshehri, Mohammad Dahman, Liu, Lei, Zhang, Peiying and Yu, Keping (2024) A web knowledge-driven multimodal retrieval method in computational social systems: unsupervised and robust graph convolutional hashing. IEEE Transactions on Computational Social Systems, 11 (3). pp. 3146-3156. ISSN 2329-924X

Accepted Version
Available under License In Copyright.

Official URL: https://doi.org/10.1109/TCSS.2022.3216621

Abstract

Multimodal retrieval has received widespread attention because it can supply large volumes of related data to support the development of computational social systems (CSSs). However, existing works still face the following challenges: 1) they rely on a tedious manual labeling process when extended to CSSs, which not only introduces subjective errors but also consumes considerable time and labor; 2) they train only on strongly aligned data and neglect adjacency information, which leads to poor robustness and makes the semantic heterogeneity gap difficult to bridge effectively; and 3) they map features into real-valued representations, which results in high storage costs and low retrieval efficiency. To address these issues in turn, we design a web-knowledge-driven multimodal retrieval framework called unsupervised and robust graph convolutional hashing (URGCH). The specific implementation is as follows: first, a “secondary semantic self-fusion” approach is proposed, which mainly extracts semantically rich features through pretrained neural networks, constructs a joint semantic matrix through semantic fusion, and eliminates the need for manual labeling; second, an “adaptive computing” approach is designed, which constructs enhanced semantic graph features by infusing knowledge from neighborhoods and uses graph convolutional networks for knowledge fusion coding, enabling URGCH to bridge the semantic modality gap while obtaining robust features; third, combined with hash learning, the multimodal data are mapped into binary codes, which reduces storage requirements and improves retrieval efficiency. Finally, we perform extensive experiments on the web dataset. The results show that URGCH outperforms the other baselines by about 1% to 3.7% in mean average precision (MAP), displays superior performance in all aspects, and can provide meaningful multimodal data retrieval services for CSSs.
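
The last two ideas in the abstract, mapping features to binary codes and scoring retrieval by MAP, can be illustrated with a minimal sketch. This is not the URGCH implementation: it assumes a plain sign threshold in place of learned hash functions, ranks by Hamming distance, and uses invented toy features and labels. All function names here are hypothetical.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Assumed stand-in for hash learning: threshold each value at zero."""
    return [1 if x >= 0 else 0 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query: Sequence[int], db: Sequence[Sequence[int]]) -> List[int]:
    """Database indices sorted by increasing Hamming distance to the query."""
    return sorted(range(len(db)), key=lambda i: hamming(query, db[i]))


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision over a ranked 0/1 relevance list."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(
    query_codes: Sequence[Sequence[int]],
    query_labels: Sequence[int],
    db_codes: Sequence[Sequence[int]],
    db_labels: Sequence[int],
) -> float:
    """Mean of per-query average precision; an item is relevant if labels match."""
    aps = []
    for code, label in zip(query_codes, query_labels):
        order = rank_by_hamming(code, db_codes)
        aps.append(average_precision([int(db_labels[i] == label) for i in order]))
    return sum(aps) / len(aps)


# Toy data (invented): queries from one modality, database from another.
db_feats = [[0.9, -0.2, 0.4, -0.7], [-0.5, 0.3, -0.1, 0.8],
            [0.6, -0.4, 0.2, -0.1], [-0.3, 0.7, -0.6, 0.5]]
db_labels = [0, 1, 0, 1]
query_feats = [[0.8, -0.1, 0.3, -0.9], [-0.2, 0.6, -0.4, 0.1]]
query_labels = [0, 1]

db_codes = [binarize(f) for f in db_feats]
query_codes = [binarize(f) for f in query_feats]
map_score = mean_average_precision(query_codes, query_labels, db_codes, db_labels)
print(f"MAP on toy data: {map_score:.3f}")
```

Each 4-bit code here replaces four floating-point values, which is the source of the storage saving the abstract mentions, and comparing codes needs only bit comparisons rather than floating-point distance computations.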

Item Type:	Article
Peer-reviewed:	Yes
Date Deposited:	03 Jan 2023 09:43
Publisher:	Institute of Electrical and Electronics Engineers
Additional Information:	© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Divisions:	Faculties > Science and Engineering
URI:	https://e-space.mmu.ac.uk/id/eprint/631058
DOI:	https://doi.org/10.1109/TCSS.2022.3216621
ISSN:	2329-924X
e-ISSN:	2329-924X
