e-space
Manchester Metropolitan University's Research Repository

    One emoji, many meanings: A corpus for the prediction and disambiguation of emoji sense

    Shardlow, Matthew (ORCID: https://orcid.org/0000-0003-1129-2750), Gerber, Luciano (ORCID: https://orcid.org/0000-0002-8423-4642) and Nawaz, Raheel (ORCID: https://orcid.org/0000-0001-9588-0052) (2022) One emoji, many meanings: A corpus for the prediction and disambiguation of emoji sense. Expert Systems with Applications, 198. ISSN 0957-4174

    Published Version
    Available under License Creative Commons Attribution.


    Abstract

    In this work, we uncover a hidden linguistic property of emoji, namely that they are polysemous and can be used to form a semantic network of emoji meanings. Our key contributions to this direction of study are as follows: (1) We have developed a new corpus to help in the task of emoji sense prediction. This corpus contains tweets with single emojis, where each emoji has been labelled with an appropriate sense identifier from WordNet. (2) We present experiments demonstrating that the sense of an emoji can be predicted from our corpus to a reasonable level of accuracy: our best emoji sense prediction algorithm achieves an average path-similarity score of 0.4146. (3) We further show that emoji sense is a useful feature in the emoji prediction task, where we report an accuracy of 58.8816 and a macro-F1 score of 46.6640, beating reasonable baselines in this task. Our work demonstrates the importance of considering the meaning behind emoji, rather than ignoring them or simply treating them as extra wordforms.
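
    As a rough illustration of the average path-similarity metric mentioned in the abstract, the sketch below assumes the usual WordNet-style definition, 1 / (1 + d), where d is the shortest path length between two senses in the hypernym taxonomy. The taxonomy, the sense names, and all function names are hypothetical stand-ins and are not taken from the paper or its corpus; the authors' exact evaluation procedure may differ.

```python
from collections import deque

# Hypothetical toy taxonomy (child -> parent), standing in for WordNet
# hypernym links. These sense names are illustrative only.
TOY_HYPERNYMS = {
    "joy": "emotion",
    "love": "emotion",
    "emotion": "feeling",
    "feeling": "entity",
    "fire": "phenomenon",
    "phenomenon": "entity",
}


def build_graph(hypernyms):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, parent in hypernyms.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the edge count, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def path_similarity(graph, sense_a, sense_b):
    """Assumed definition: 1 / (1 + shortest path length); 0.0 if no path."""
    dist = shortest_path_length(graph, sense_a, sense_b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def average_path_similarity(graph, gold, predicted):
    """Mean path similarity over aligned gold and predicted sense labels."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and aligned")
    scores = [path_similarity(graph, g, p) for g, p in zip(gold, predicted)]
    return sum(scores) / len(scores)
```

    Under this definition an exact sense match scores 1.0 and more distant senses score progressively lower, so an average such as the reported 0.4146 reflects how close predicted senses are to the labelled ones rather than a plain hit rate.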

