Tripathi, Ashish Kumar, Sharma, Kapil, Bala, Manju, Kumar, Akshi, Menon, Varun G and Bashir, Ali Kashif ORCID: https://orcid.org/0000-0001-7595-2522 (2021) A Parallel Military Dog based Algorithm for Clustering Big data in Cognitive Industrial Internet of Things. IEEE Transactions on Industrial Informatics, 17 (3). pp. 2134-2142. ISSN 1551-3203
Accepted Version
Available under License In Copyright.
Abstract
With the advancement of wireless communication, the internet of things (IoT), and big data, high-performance data analytic tools and algorithms are required. Data clustering, a promising analytic technique, is widely used to solve IoT and big data based problems, since it does not require labeled datasets. Recently, meta-heuristic algorithms have been used efficiently to solve various clustering problems. However, when handling the big datasets produced by IoT devices, these algorithms fail to respond within the desired time due to their high computation cost. This paper presents a new meta-heuristic based clustering method that solves big data problems by leveraging the strength of MapReduce. The proposed method leverages the searching potential of a military dog squad to find the optimal centroids and the MapReduce architecture to handle the big datasets. The optimization efficacy of the proposed method is validated against 17 benchmark functions, and the results are compared with 5 other recent algorithms, namely bat, particle swarm optimization, artificial bee colony, multiverse optimization, and whale optimization algorithm. Further, a parallel version of the proposed method is introduced using MapReduce (MR-MDBO) for clustering the big datasets produced from industrial IoT. Moreover, the performance of MR-MDBO is studied on 2 benchmark UCI datasets and 3 real IoT based datasets produced from industry. The F-measure and computation time of MR-MDBO are compared with those of 6 other state-of-the-art methods. The experimental results show that the proposed MR-MDBO based clustering outperforms the other considered algorithms in terms of clustering accuracy and computation time.
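As a rough illustration of how a MapReduce-style pass can score a candidate set of centroids, the sketch below maps each point to its nearest centroid and then sums the squared distances per cluster. It is not the authors' MR-MDBO implementation: the function names (`map_point`, `reduce_fitness`, `fitness`), the squared-Euclidean objective, and the toy data are all assumptions made for illustration, and the map and reduce steps run in a single process rather than on a cluster.

```python
from collections import defaultdict


def map_point(point, centroids):
    """Map step (assumed form): emit (nearest centroid index, squared distance)."""
    dists = [
        sum((p - c) ** 2 for p, c in zip(point, centroid))
        for centroid in centroids
    ]
    best = min(range(len(centroids)), key=dists.__getitem__)
    return best, dists[best]


def reduce_fitness(pairs):
    """Reduce step (assumed form): sum the squared distances per cluster key."""
    totals = defaultdict(float)
    for key, dist in pairs:
        totals[key] += dist
    return dict(totals)


def fitness(points, centroids):
    """Total intra-cluster squared distance; lower means tighter clusters."""
    pairs = [map_point(p, centroids) for p in points]
    return sum(reduce_fitness(pairs).values())


# Hypothetical toy data: two obvious groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = [(0.0, 0.5), (10.0, 10.5)]  # centroids placed inside each group
bad = [(5.0, 5.0), (6.0, 6.0)]     # centroids placed between the groups

print(fitness(points, good), fitness(points, bad))
```

A meta-heuristic optimizer would call a scoring function of this kind for every candidate solution, which is why distributing the map step across workers matters for large datasets.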