Reporting quality of studies using machine learning models for medical diagnosis: a systematic review

Yusuf, Mohamed ORCID: https://orcid.org/0000-0002-9339-4613, Atal, Ignacio, Li, Jacques, Smith, Philip ORCID: https://orcid.org/0000-0001-7719-6951, Ravaud, Philippe, Fergie, Martin, Callaghan, Michael and Selfe, James (2020) Reporting quality of studies using machine learning models for medical diagnosis: a systematic review. BMJ Open, 10 (3). e034568. ISSN 2044-6055

Published Version
Available under License Creative Commons Attribution Non-commercial.

Official URL: https://bmjopen.bmj.com/content/10/3/e034568

Abstract

Aims: We conducted a systematic review assessing the reporting quality of studies validating models based on machine learning (ML) for clinical diagnosis, with a specific focus on the reporting of information about the participants on whom the diagnostic task was evaluated.

Method: Medline Core Clinical Journals were searched for studies published between July 2015 and July 2018. Two reviewers independently screened the retrieved articles, and a third reviewer resolved any discrepancies. An extraction list was developed from the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guideline. Two reviewers independently extracted the data from the eligible articles. Third and fourth reviewers checked and verified the extracted data and resolved any discrepancies between the reviewers.

Results: The search yielded 161 papers, of which 28 met the eligibility criteria. Details of the data source were reported in 24 of the 28 papers. For all of the papers, the set of patients on which the ML-based diagnostic system was evaluated was partitioned from a larger dataset, and the method for deriving this set was always reported. Information on the diagnostic/non-diagnostic classification was reported well (23/28). The least reported items were the use of a reporting guideline (0/28), distribution of disease severity (8/28), patient flow diagram (10/28) and distribution of alternative diagnosis (10/28). A large proportion of studies (23/28) had a delay between the conduct of the reference standard and ML tests, while one study did not and four studies were unclear. For 15 studies, it was unclear whether the evaluation group corresponded to the setting in which the ML test will be applied.

Conclusion: All studies in this review failed to use reporting guidelines, and a large proportion of them lacked adequate detail on participants, making it difficult to replicate, assess and interpret study findings.

Item Type:	Article
Peer-reviewed:	Yes
Date Deposited:	07 Apr 2020 09:58
Publisher:	BMJ
Additional Information:	This is an Open Access article published in BMJ Open, published by BMJ, copyright The Author(s).
Divisions:	Organisation > Health and Education
Subject terms:	1103 Clinical Sciences, 1117 Public Health and Health Services, 1199 Other Medical and Health Sciences
URI:	https://e-space.mmu.ac.uk/id/eprint/625504
DOI:	https://doi.org/10.1136/bmjopen-2019-034568
ISSN:	2044-6055
e-ISSN:	2044-6055

