Hybrid multi-label classification model for medical applications based on adaptive synthetic data and ensemble learning

Priyadharshini, M, Banu, A Faritha, Sharma, Bhisham, Chowdhury, Subrata, Rabie, Khaled ORCID: https://orcid.org/0000-0002-9784-3703 and Shongwe, Thokozani (2023) Hybrid multi-label classification model for medical applications based on adaptive synthetic data and ensemble learning. Sensors, 23 (15). 6836. ISSN 1424-8220

Published Version
Available under License Creative Commons Attribution.

Official URL: https://doi.org/10.3390/s23156836

Abstract

In recent years, both machine learning and computer vision have seen growth in the use of multi-label categorization. Existing research uses SMOTE to balance data, but SMOTE does not consider that neighboring examples may belong to different classes when it produces synthetic samples. As a result, it can increase class overlap and noise. To avoid this problem, this work presents a technique called Adaptive Synthetic Data-Based Multi-label Classification (ASDMLC). Adaptive Synthetic (ADASYN) sampling is a sampling strategy for learning from unbalanced data sets. ADASYN weights minority class instances by learning difficulty, and synthetic data are created for the hard-to-learn minority class cases. Numerical variables are normalized with the Min-Max technique to standardize the magnitude of each variable’s impact on the outcomes; in this work, attribute values are rescaled to a new range, from 0 to 1. To raise the accuracy of multi-label classification, Velocity-Equalized Particle Swarm Optimization (VPSO) is used for feature selection. In the proposed approach, standard PSO is improved by equalizing the velocity with each dimension of the problem, which addresses the premature convergence problem. To expose the inherent label dependencies, a multi-label classification ensemble of an Adaptive Neuro-Fuzzy Inference System (ANFIS), a Probabilistic Neural Network (PNN), and Clustering-Based Decision tree methods is combined using an averaging method. Performance is assessed using precision, recall, accuracy, and error rate. The suggested model’s multi-label classification accuracy is 90.88%, better than the earlier techniques PCT, HOMER, and ML-Forest, which achieve 65.57%, 70.66%, and 82.29%, respectively.
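
Two of the steps named in the abstract, Min-Max normalization to the range 0 to 1 and combining ensemble members by averaging, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the handling of constant columns, and the 0.5 decision threshold are assumptions introduced here for illustration, and the three probability lists merely stand in for the outputs of ANFIS, PNN, and the clustering-based decision tree.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no spread, so map it to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def average_ensemble(prob_lists, threshold=0.5):
    """Average per-label probabilities from several models, then threshold.

    prob_lists holds one list of per-label probabilities per model.
    The threshold of 0.5 is an illustrative assumption.
    """
    n_models = len(prob_lists)
    n_labels = len(prob_lists[0])
    mean = [sum(p[j] for p in prob_lists) / n_models for j in range(n_labels)]
    labels = [1 if m >= threshold else 0 for m in mean]
    return mean, labels


# Hypothetical per-label probabilities from three ensemble members.
scaled = min_max_normalize([2, 4, 6])
mean_probs, predicted = average_ensemble([[0.9, 0.2], [0.7, 0.4], [0.8, 0.3]])
```

Averaging probabilities rather than hard votes lets a confident member outweigh two uncertain ones, which is one common reason to prefer it for small ensembles.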

Item Type:	Article (Article)
Peer-reviewed:	Yes
Date Deposited:	01 Nov 2023 12:08
Publisher:	MDPI
Additional Information:	This is an open access article which originally appeared in Sensors, published by MDPI
Divisions:	Organisation > Science and Engineering
Subject terms:	adaptive neuro-fuzzy inference system, adaptive synthetic data, imbalanced data, improved particle swarm optimization, multi-class classification, probabilistic neural network, 0301 Analytical Chemistry, 0502 Environmental Science and Management, 0602 Ecology, 0805 Distributed Computing, 0906 Electrical and Electronic Engineering, Analytical Chemistry
Data Access Statement:	There are no available data.
URI:	https://e-space.mmu.ac.uk/id/eprint/632919
DOI:	https://doi.org/10.3390/s23156836
ISSN:	1424-8220
e-ISSN:	1424-8220
