e-space
Manchester Metropolitan University's Research Repository

    UPON: Urdu Poetry Generation Using Deep Learning: A Novel Approach and Evaluation

    Tabassam, Muhammad Rauf (ORCID: https://orcid.org/0009-0007-7423-3033), Waheed, Hajra (ORCID: https://orcid.org/0000-0003-0168-0063), Safder, Iqra (ORCID: https://orcid.org/0000-0001-9818-4693), Sarwar, Raheem (ORCID: https://orcid.org/0000-0002-0640-807X), Aljohani, Naif Radi (ORCID: https://orcid.org/0000-0001-9153-1293), Nawaz, Raheel (ORCID: https://orcid.org/0000-0001-9588-0052), Hassan, Saeed-Ul (ORCID: https://orcid.org/0000-0002-6509-9190), Zaman, Farooq (ORCID: https://orcid.org/0000-0002-9861-4013) and Ahsan, Ahtazaz (ORCID: https://orcid.org/0000-0001-7772-5462) (2024) UPON: Urdu Poetry Generation Using Deep Learning: A Novel Approach and Evaluation. ACM Transactions on Asian and Low-Resource Language Information Processing. ISSN 2375-4702

    Accepted Version
    Available under License Creative Commons Attribution.


    Abstract

    Poetry is among the oldest and most esteemed literary forms, allowing poets to convey ideas while attending to meaning, coherence, poetic quality, and fluency. Good poetry also requires attention to rhyme and meter. Advances in artificial intelligence (AI) have greatly improved automatic text generation, primarily for languages such as English and Chinese. Generating Urdu poetry, however, remains difficult because of the language's inherent ambiguity, its cultural and historical nuances, and the creativity the task demands. The existing literature has only marginally explored Urdu prose and has almost entirely overlooked Urdu poetry generation, primarily because comprehensive training data are scarce. This research addresses that gap. It first introduces a specialized Urdu poetry dataset restricted to a single meter, 'behr-e-khafeef', comprising approximately 20,000 couplets from the Rekhta repository. It then proposes a character-based encoding that converts each couplet into a numerical representation by assigning a distinct identifier to each character. Generation begins with a character-level LSTM that produces the first verse; a machine translation technique, specifically sequence-to-sequence learning, then produces the second verse conditioned on the first. The generated poetry is evaluated with automatic metrics, including BLEU scores. In addition, an expert panel of Urdu poets assesses the generated couplets for meaning, coherence, poetic quality, and fluency. Compared with existing poetry generation systems, the approach advances the state of the art, as evidenced by a BLEU score of 0.23. The paper concludes with directions for future work, intended to encourage further research on poetry generation.
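
The character-based encoding mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_vocab`, `encode`, `decode`), the reserved padding identifier `0`, the fixed length of 16, and the two sample strings are all assumptions made for illustration. The only detail taken from the abstract is that each distinct character receives its own numeric identifier.

```python
# Illustrative sketch only; names, padding scheme, and sample text are
# assumptions, not details taken from the paper.
PAD = 0  # assumed: identifier 0 is reserved for padding


def build_vocab(verses):
    """Assign a distinct integer identifier (starting at 1) to each character."""
    chars = sorted({ch for verse in verses for ch in verse})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(verse, vocab, max_len):
    """Map a verse to a fixed-length list of character identifiers."""
    ids = [vocab[ch] for ch in verse][:max_len]
    return ids + [PAD] * (max_len - len(ids))


def decode(ids, vocab):
    """Invert encode(), dropping padding identifiers."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in ids if i != PAD)


# Placeholder Urdu strings, not drawn from the paper's dataset.
verses = ["دل ہی تو ہے", "نہ سنگ و خشت"]
vocab = build_vocab(verses)
encoded = [encode(v, vocab, max_len=16) for v in verses]
```

Sequences encoded this way could serve as input to a character-level model; the space character is treated as an ordinary symbol so that word boundaries survive the round trip through `decode`.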
