Sentiment Analysis Using XLM-R Transformer and Zero-shot Transfer Learning on Resource-poor Indian Language

Kumar, A ORCID: https://orcid.org/0000-0003-4263-7168 and Albuquerque, VHC (2021) Sentiment Analysis Using XLM-R Transformer and Zero-shot Transfer Learning on Resource-poor Indian Language. ACM Transactions on Asian and Low-Resource Language Information Processing, 20 (5). pp. 1-13. ISSN 2375-4699

Accepted Version
Available under License In Copyright.

Official URL: https://dl.acm.org/doi/10.1145/3461764

Abstract

Sentiment analysis on social media relies on comprehending natural language and on a robust machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Cultural diversity, geographically limited trending-topic hashtags, access to native-language keyboards, and conversational comfort in the native language compound the linguistic challenges of sentiment analysis. This research evaluates the performance of cross-lingual contextual word embeddings and zero-shot transfer learning in projecting predictions from resource-rich English to the resource-poor Hindi language. The cross-lingual XLM-RoBERTa classification model is trained and fine-tuned on the English-language benchmark SemEval 2017 Task 4A dataset; zero-shot transfer learning is then used to evaluate the classification model on two Hindi sentence-level sentiment analysis datasets, namely the IITP-Movie and IITP-Product review datasets. The proposed model compares favorably to state-of-the-art approaches, achieves an average accuracy of 60.93 across the two Hindi datasets, and gives an effective solution to sentence-level (tweet-level) sentiment analysis in a resource-poor scenario.
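
The zero-shot evaluation protocol described in the abstract can be sketched roughly as follows. This is an illustration only, not the authors' code: `stub_predict` is a hypothetical stand-in for a classifier fine-tuned on English data alone, the two toy sets are made-up placeholders rather than samples from IITP-Movie or IITP-Product, and the "average" is assumed to be the unweighted mean of the per-dataset accuracies.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (text, gold label)


def accuracy(predict: Callable[[str], str], examples: List[Example]) -> float:
    """Fraction of examples whose predicted label equals the gold label."""
    if not examples:
        raise ValueError("empty evaluation set")
    correct = sum(1 for text, gold in examples if predict(text) == gold)
    return correct / len(examples)


def zero_shot_evaluate(
    predict: Callable[[str], str],
    target_sets: Dict[str, List[Example]],
) -> Dict[str, float]:
    """Score a source-language classifier on target-language sets.

    No target-language example is used for training; the classifier is
    only applied to the target sets, which is what makes this zero-shot.
    """
    scores = {name: accuracy(predict, ex) for name, ex in target_sets.items()}
    # Assumption: "average" means the unweighted mean over datasets.
    scores["average"] = sum(scores.values()) / len(target_sets)
    return scores


def stub_predict(text: str) -> str:
    """Hypothetical placeholder for a model fine-tuned on English only."""
    if "अच्छ" in text:
        return "positive"
    if "खराब" in text:
        return "negative"
    return "neutral"


# Toy placeholder data, not taken from the actual Hindi datasets.
toy_sets = {
    "toy_movie": [("फिल्म बहुत अच्छी थी", "positive"), ("कहानी खराब थी", "negative")],
    "toy_product": [("फोन अच्छा है", "positive"), ("कैमरा ठीक है", "positive")],
}

scores = zero_shot_evaluate(stub_predict, toy_sets)
print(scores)
```

In a real setup the stub would be replaced by a multilingual model such as XLM-RoBERTa after fine-tuning on the English training data, with the evaluation function left unchanged.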

Item Type:	Article
Peer-reviewed:	Yes
Date Deposited:	04 Apr 2022 15:00
Publisher:	ACM
Additional Information:	This is an Author Accepted Manuscript of an article published in ACM Transactions on Asian and Low-Resource Language Information Processing by ACM.
Divisions:	Faculties > Science and Engineering
URI:	https://e-space.mmu.ac.uk/id/eprint/629491
DOI:	https://doi.org/10.1145/3461764
ISSN:	2375-4699
e-ISSN:	2375-4702
