e-space
Manchester Metropolitan University's Research Repository

Deep context of citations using machine‑learning models in scholarly full‑text articles

Hassan, Saeed-Ul and Imran, Mubashir and Iqbal, Sehrish and Aljohani, Naif Radi and Nawaz, Raheel (2018) Deep context of citations using machine‑learning models in scholarly full‑text articles. Scientometrics, 117 (3). ISSN 0138-9130

Abstract

Information retrieval systems for scholarly literature rely heavily not only on text matching but on semantic- and context-based features. Readers nowadays are deeply interested in how important an article is, its purpose and how influential it is in follow-up research work. Numerous techniques to tap the power of machine learning and artificial intelligence have been developed to enhance retrieval of the most influential scientific literature. In this paper, we compare and improve on four existing state-of-the-art techniques designed to identify influential citations. We consider 450 citations from the Association for Computational Linguistics corpus, classified by experts as either important or unimportant, and further extract 64 features based on the methodology of four state-of-the-art techniques. We apply the Extra-Trees classifier to select 29 best features and apply the Random Forest and Support Vector Machine classifiers to all selected techniques. Using the Random Forest classifier, our supervised model improves on the state-of-the-art method by 11.25%, with 89% Precision-Recall area under the curve. Finally, we present our deep-learning model, the Long Short-Term Memory network, that uses all 64 features to distinguish important and unimportant citations with 92.57% accuracy.
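The abstract reports classifier quality as Precision-Recall area under the curve. As a minimal sketch of how such a figure can be obtained, the Python below computes average precision, one common step-wise estimate of that area, for a ranked list of citations. The function name, the toy labels (1 = important, 0 = unimportant) and the scores are hypothetical illustrations and are not taken from the paper; the authors may have computed the area differently.

```python
def average_precision(labels, scores):
    """Average precision for binary labels (1 = important, 0 = unimportant).

    Citations are ranked by descending score; the result is the mean of the
    precision values observed at each rank where an important citation
    appears. Tied scores keep their input order (no special tie handling).
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank citations from the highest classifier score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos


# Hypothetical toy data: six citations with made-up classifier scores.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)

# A ranking that places every important citation first scores exactly 1.0.
perfect_ap = average_precision([1, 1, 0], [0.9, 0.8, 0.1])
```

In the toy ranking the important citations sit at ranks 1, 3 and 4, so the value is the mean of 1/1, 2/3 and 3/4. A score near 1 means important citations are consistently ranked above unimportant ones.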
