Hybrid context enriched deep learning model for fine-grained sentiment analysis in textual and visual semiotic modality social data

Kumar, A ORCID: https://orcid.org/0000-0003-4263-7168, Srinivasan, K, Cheng, WH and Zomaya, AY (2020) Hybrid context enriched deep learning model for fine-grained sentiment analysis in textual and visual semiotic modality social data. Information Processing and Management, 57 (1). p. 102141. ISSN 0306-4573


Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://www.sciencedirect.com/science/article/pii/...

Abstract

Detecting sentiments in natural language is tricky even for humans, which makes automated detection more complicated still. This research proffers a hybrid deep learning model for fine-grained sentiment prediction in real-time multimodal data. It reinforces the strengths of deep learning nets in combination with machine learning to deal with two specific semiotic systems, namely the textual (written text) and the visual (still images), and their combination within online content, using decision-level multimodal fusion. The proposed contextual ConvNet-SVMBoVW model has four modules: discretization, text analytics, image analytics, and decision. The input to the model is multimodal text, m ∈ {text, image, info-graphic}. The discretization module uses Google Lens to separate the text from the image; the two are then processed as discrete entities and sent to the text analytics and image analytics modules, respectively. The text analytics module determines the sentiment using a hybrid of a convolutional neural network (ConvNet) enriched with the contextual semantics of SentiCircle. An aggregation scheme is introduced to compute the hybrid polarity. A support vector machine (SVM) classifier trained using bag-of-visual-words (BoVW) predicts the sentiment of the visual content. A Boolean decision module with a logical OR operation is added to the architecture; it validates and categorizes the output on the basis of five fine-grained sentiment categories (truth values), namely ‘highly positive,’ ‘positive,’ ‘neutral,’ ‘negative’ and ‘highly negative.’ The accuracy achieved by the proposed model is nearly 91%, which is an improvement over the accuracy obtained by the text and image modules individually.
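To make the decision-level fusion step concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each analytics module emits a polarity score in [-1, 1], or nothing when its modality is absent. The function names `to_label` and `fuse`, the category thresholds, and the averaging rule used when both modalities are present are all hypothetical; the abstract says only that a logical OR is applied and that the output falls into five categories.

```python
from typing import Optional


def to_label(score: float) -> str:
    """Map a polarity score in [-1, 1] to one of the five categories.

    The cut-off values are illustrative assumptions, not taken from the paper.
    """
    if score <= -0.6:
        return "highly negative"
    if score < -0.2:
        return "negative"
    if score <= 0.2:
        return "neutral"
    if score < 0.6:
        return "positive"
    return "highly positive"


def fuse(text_score: Optional[float], image_score: Optional[float]) -> str:
    """OR-style decision-level fusion: either modality alone is sufficient.

    When both scores are present they are averaged; this tie-breaking rule
    is an assumption made for the sketch.
    """
    available = [s for s in (text_score, image_score) if s is not None]
    if not available:
        raise ValueError("at least one modality must yield a score")
    return to_label(sum(available) / len(available))
```

Under these assumptions, a text-only post, an image-only post, and an info-graphic (both modalities) all pass through the same `fuse` call, which is the practical appeal of fusing at the decision level rather than at the feature level.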

Item Type:	Article
Peer-reviewed:	Yes
Date Deposited:	29 Apr 2022 08:22
Publisher:	Elsevier
Additional Information:	This is an Accepted Manuscript of an article which appeared in Information Processing and Management, published by Elsevier
Divisions:	Faculties > Science and Engineering
Subject terms:	Science & Technology, Technology, Computer Science, Information Systems, Information Science & Library Science, Computer Science, Multimodal, Sentiment analysis, Deep learning, Context, BoVW, EMOTION RECOGNITION, STRENGTH DETECTION, AUDIO, 0804 Data Format, 0806 Information Systems, 0807 Library and Information Studies, Information & Library Sciences
URI:	https://e-space.mmu.ac.uk/id/eprint/629630
DOI:	https://doi.org/10.1016/j.ipm.2019.102141
ISSN:	0306-4573

