e-space
Manchester Metropolitan University's Research Repository

    Lexical complexity prediction: an overview

    North, Kai, Zampieri, Marcos and Shardlow, Matthew (ORCID: https://orcid.org/0000-0003-1129-2750) (2023) Lexical complexity prediction: an overview. ACM Computing Surveys, 55 (9). p. 179. ISSN 0360-0300

    Accepted Version

    Abstract

    The occurrence of unknown words in texts significantly hinders reading comprehension. To improve accessibility for specific target populations, computational modeling has been applied to identify complex words in texts and replace them with simpler alternatives. In this article, we present an overview of computational approaches to lexical complexity prediction, focusing on work carried out on English data. We survey relevant approaches to this problem, which include traditional machine learning classifiers (e.g., SVMs, logistic regression) and deep neural networks, together with a variety of features, such as word frequency, word length, and features inspired by the psycholinguistics literature, among many others. Furthermore, we introduce readers to past competitions and available datasets on this topic. Finally, we include brief sections on applications of lexical complexity prediction, such as readability and text simplification, together with related studies on languages other than English.
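
As a rough illustration of the feature-based approaches the abstract mentions, the sketch below scores a word from two of the named features, word length and word frequency, using a logistic function. It is a minimal sketch and not the authors' method: the toy corpus, the names `features` and `complexity_score`, and the hand-picked weights are all assumptions made for illustration. A real system would learn the weights from annotated data and draw frequencies from a large reference corpus.

```python
import math
from collections import Counter

# Toy reference corpus (hypothetical); real systems use large frequency lists.
CORPUS = (
    "the cat sat on the mat and the dog sat on the rug "
    "the cat and the dog like the mat"
).split()

FREQ = Counter(CORPUS)
TOTAL = sum(FREQ.values())


def features(word: str) -> tuple[int, float]:
    """Return (word length, log relative frequency with add-one smoothing)."""
    w = word.lower()
    # Unseen words get a small non-zero probability instead of log(0).
    rel_freq = (FREQ[w] + 1) / (TOTAL + len(FREQ) + 1)
    return len(w), math.log(rel_freq)


def complexity_score(
    word: str, w_len: float = 0.3, w_freq: float = -1.0, bias: float = -4.0
) -> float:
    """Logistic score in (0, 1); the weights are illustrative, not learned."""
    length, log_freq = features(word)
    # Longer and rarer words push the score toward 1 (more complex).
    z = bias + w_len * length + w_freq * log_freq
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    for w in ["the", "cat", "perspicacious"]:
        print(w, round(complexity_score(w), 3))
```

With these assumed weights, a short, frequent word such as "the" scores below 0.5, while a long word absent from the toy corpus scores above it. Thresholding the score gives a binary complex/non-complex decision, and keeping it continuous gives a graded complexity estimate.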

