e-space
Manchester Metropolitan University's Research Repository

    A Nature-Inspired Partial Distance-Based Clustering Algorithm

    El Habib Kahla, Mohammed, Beggas, Mounir, Laouid, Abdelkader (ORCID: https://orcid.org/0000-0002-8175-8467) and Hammoudeh, Mohammad (ORCID: https://orcid.org/0000-0003-1058-0996) (2024) A Nature-Inspired Partial Distance-Based Clustering Algorithm. Journal of Sensor and Actuator Networks, 13 (4). 36.

    Published Version
    Available under License Creative Commons Attribution.


    Abstract

    Clustering plays a critical role in artificial intelligence and big data, where it is essential for extracting meaningful insights and patterns from large, intricate datasets. Traditional clustering techniques handle diverse data types and sizes effectively, but they are challenged by the increasing volume and dimensionality of data and by the complex structures inherent in high-dimensional spaces. Conventional methods are also sensitive to initial centroids, depend on prior knowledge of the number of clusters, and scale poorly, particularly on large datasets and in Internet of Things implementations. In response to these challenges, we propose K-level, a clustering algorithm inspired by the collective behavior of fish locomotion. K-level introduces a novel clustering approach based on greedy, distance-driven merging carried out in stages. This iterative process builds hierarchical structures efficiently, without the need for exhaustive computations. K-level gives users control over computational complexity by letting them specify the number of clusters merged simultaneously. This flexibility supports accurate and efficient hierarchical clustering across diverse data types and offers a scalable solution for processing extensive datasets within a reasonable timeframe. Internal validation metrics, namely the Silhouette Score, Davies–Bouldin Index, and Calinski–Harabasz Index, are used to evaluate K-level on various types of datasets, and the algorithm is compared with existing methods from the literature: UPGMA, CLINK, UPGMC, SLINK, and K-means. The experiments and analyses show that the proposed algorithm overcomes many of the limitations of existing clustering methods, providing scalable and adaptable clustering as data challenges evolve.
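
The abstract describes staged, greedy, distance-driven merging in which the user chooses how many clusters are merged at once. The sketch below illustrates that general idea only; it is not the authors' K-level algorithm. The function name `staged_greedy_merge`, the parameter `merges_per_stage`, the use of centroid distance, and the stopping rule are all assumptions made for illustration. In particular, this naive version recomputes every pairwise centroid distance at each stage, so it does not reproduce the partial-distance savings the paper claims.

```python
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def staged_greedy_merge(points, target_clusters, merges_per_stage=2):
    """Hypothetical sketch: at each stage, merge up to `merges_per_stage`
    of the closest disjoint cluster pairs (by centroid distance) until
    `target_clusters` clusters remain.

    Returns the final clusters and the cluster count after each stage.
    """
    if target_clusters < 1 or merges_per_stage < 1:
        raise ValueError("target_clusters and merges_per_stage must be >= 1")

    clusters = [[tuple(p)] for p in points]  # start with singleton clusters
    history = [len(clusters)]

    while len(clusters) > target_clusters:
        cents = [centroid(c) for c in clusters]
        # Assumption: all pairs are ranked here; a partial-distance
        # scheme would avoid this exhaustive step.
        pairs = sorted(
            (dist(cents[i], cents[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Never merge past the requested number of clusters.
        budget = min(merges_per_stage, len(clusters) - target_clusters)
        used, merged = set(), []
        for _, i, j in pairs:
            if budget == 0:
                break
            if i in used or j in used:
                continue  # each cluster takes part in one merge per stage
            merged.append(clusters[i] + clusters[j])
            used.update((i, j))
            budget -= 1
        clusters = merged + [c for k, c in enumerate(clusters) if k not in used]
        history.append(len(clusters))

    return clusters, history


# Usage with made-up sample points (three well-separated pairs).
sample = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
final_clusters, counts = staged_greedy_merge(sample, target_clusters=3)
```

Raising `merges_per_stage` reduces the number of stages at the cost of coarser greedy decisions, which is one plausible reading of the complexity control the abstract mentions.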
