Manchester Metropolitan University's Research Repository

    A web knowledge-driven multimodal retrieval method in computational social systems: unsupervised and robust graph convolutional hashing

    Duan, Youxiang, Chen, Ning, Bashir, Ali Kashif (ORCID: https://orcid.org/0000-0001-7595-2522), Alshehri, Mohammad Dahman, Liu, Lei, Zhang, Peiying and Yu, Keping (2022) A web knowledge-driven multimodal retrieval method in computational social systems: unsupervised and robust graph convolutional hashing. IEEE Transactions on Computational Social Systems. ISSN 2329-924X

    Accepted Version


    Multimodal retrieval has received widespread attention because it can supply large volumes of related data to support the development of computational social systems (CSSs). However, existing works still face the following challenges: 1) they rely on a tedious manual labeling process when extended to CSSs, which introduces subjective errors and consumes considerable time and labor; 2) they train only on strongly aligned data and ignore adjacency information, which leads to poor robustness and makes the semantic heterogeneity gap difficult to bridge; and 3) they map features into real-valued forms, which results in high storage costs and low retrieval efficiency. To address these issues in turn, we design a web-knowledge-driven multimodal retrieval framework called unsupervised and robust graph convolutional hashing (URGCH). The specific implementations are as follows. First, a “secondary semantic self-fusion” approach is proposed, which extracts semantically rich features through pretrained neural networks and constructs a joint semantic matrix through semantic fusion, eliminating the manual labeling process. Second, an “adaptive computing” approach is designed, which constructs enhanced semantic graph features by infusing neighborhood knowledge and uses graph convolutional networks for knowledge fusion coding, enabling URGCH to bridge the semantic modality gap while remaining robust. Third, combined with hash learning, the multimodal data are mapped into binary codes, which reduces storage requirements and improves retrieval efficiency. Finally, we perform extensive experiments on the web dataset. The results show that URGCH exceeds other baselines by about 1%–3.7% in mean average precision (MAP), displays superior performance in all aspects, and can provide meaningful multimodal data retrieval services to CSSs.
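
    The abstract names three building blocks: a joint semantic matrix fused from per-modality features, binary hash codes, and evaluation by mean average precision. The following is a minimal sketch of those ideas in plain Python, not the URGCH implementation. The function names, the fusion weight `alpha`, the sign-threshold binarization, and all feature values are assumptions made for illustration, and the graph convolutional encoder is omitted entirely.

```python
import math

def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between row vectors (assumes no all-zero rows)."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in features]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv)
         for v, nv in zip(features, norms)]
        for u, nu in zip(features, norms)
    ]

def fuse(sim_image, sim_text, alpha=0.5):
    """Weighted average of two similarity matrices.

    alpha is a hypothetical weight; the paper's actual fusion rule may differ.
    """
    return [
        [alpha * a + (1 - alpha) * b for a, b in zip(row_i, row_t)]
        for row_i, row_t in zip(sim_image, sim_text)
    ]

def binarize(vector):
    """Map a real-valued vector to a binary code by thresholding at zero."""
    return [1 if x >= 0 else 0 for x in vector]

def hamming(code_a, code_b):
    """Number of positions where two equal-length binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))

def rank_by_hamming(query_code, database_codes):
    """Database indices sorted by Hamming distance to the query (ties keep index order)."""
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming(query_code, database_codes[i]))

def average_precision(ranked_relevance):
    """Average precision of one ranked list of True/False relevance flags."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Made-up features for three image-text pairs (stand-ins for pretrained network outputs).
image_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
text_features = [[0.8, 0.2], [1.0, 0.0], [0.1, 0.9]]
joint = fuse(cosine_similarity_matrix(image_features),
             cosine_similarity_matrix(text_features))

# Made-up real-valued embeddings, binarized and ranked by Hamming distance.
database = [[0.7, -0.2, 0.4, -0.9], [-0.5, 0.3, 0.1, -0.2], [0.6, -0.1, -0.3, -0.8]]
query = [0.9, -0.4, 0.2, -0.5]
database_codes = [binarize(v) for v in database]
ranking = rank_by_hamming(binarize(query), database_codes)

# Hypothetical ground truth: items 0 and 1 are relevant to the query.
relevant_items = {0, 1}
ap = average_precision([i in relevant_items for i in ranking])
print(ranking, round(ap, 4))
```

    Averaging `average_precision` over many queries gives MAP. The binary codes illustrate the storage and efficiency argument: each item needs one bit per dimension, and comparing two codes only requires counting differing bits.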
