Sarwar, Raheem ORCID: https://orcid.org/0000-0002-0640-807X, Ahmad, Bilal, Teh, Pin Shen ORCID: https://orcid.org/0000-0002-0607-2617, Tuarob, Suppawong ORCID: https://orcid.org/0000-0002-5201-5699, Thaipisutikul, Tipajin ORCID: https://orcid.org/0000-0002-2538-1108, Zaman, Farooq, Aljohani, Naif R ORCID: https://orcid.org/0000-0001-9153-1293, Zhu, Jia, Hassan, Saeed-Ul ORCID: https://orcid.org/0000-0002-6509-9190, Nawaz, Raheel ORCID: https://orcid.org/0000-0001-9588-0052, Ansari, Ali R ORCID: https://orcid.org/0000-0001-5090-7813 and Fayyaz, Muhammad AB ORCID: https://orcid.org/0000-0002-1794-3000 (2024) HybridEval: An Improved Novel Hybrid Metric for Evaluation of Text Summarization. Journal of Informatics and Web Engineering, 3 (3). pp. 233-255.
Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
The present work re-evaluates the evaluation methods used for text summarization tasks. Two state-of-the-art assessment measures, Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and Bilingual Evaluation Understudy (BLEU), are discussed along with their limitations before a novel evaluation metric is presented. Their evaluation scores vary significantly with the length and vocabulary of the sentences, which suggests that their primary limitation is an inability to preserve the semantics and meaning of sentences or to distribute weight consistently over the whole sentence. To address this, the present work organizes phrases into six different groups and proposes a new hybrid approach (HybridEval) for evaluating text summarization. Our approach uses a weighted sum of cosine scores from InferSent's SentEval embeddings combined with the original scores, achieving high accuracy. HybridEval outperforms existing state-of-the-art models by 10-15% in evaluation scores.
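The abstract's core idea, combining an embedding-based cosine similarity with a lexical n-gram score via a weighted sum, can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the `alpha` weight, the `unigram_overlap` stand-in for ROUGE, and the plain vectors standing in for InferSent/SentEval sentence embeddings are all assumptions for demonstration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sentence-embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def unigram_overlap(candidate, reference):
    """A ROUGE-1-style recall: fraction of distinct reference unigrams
    that appear in the candidate summary (toy stand-in for ROUGE)."""
    cand_tokens = set(candidate.lower().split())
    ref_tokens = set(reference.lower().split())
    return len(ref_tokens & cand_tokens) / len(ref_tokens)


def hybrid_score(semantic_sim, lexical_score, alpha=0.5):
    """Weighted sum of a semantic (cosine) score and a lexical (n-gram)
    score; alpha is a hypothetical mixing weight."""
    return alpha * semantic_sim + (1 - alpha) * lexical_score


# In the real pipeline the vectors would come from InferSent embeddings
# of the candidate and reference summaries; these are placeholders.
sem = cosine_similarity([0.2, 0.5, 0.1], [0.1, 0.6, 0.2])
lex = unigram_overlap("the model summarizes text", "the model compresses text")
score = hybrid_score(sem, lex, alpha=0.5)
```

The weighted sum lets the metric reward semantic closeness even when surface vocabulary differs, which is exactly the failure mode of pure n-gram metrics that the abstract describes.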