Cluster Analysis of Twitter Data: A Review of Algorithms

Crockett, KA, Mclean, D, Latham, A and Alnajran, N (2017) Cluster Analysis of Twitter Data: A Review of Algorithms. In: Proceedings of the 9th International Conference on Agents and Artificial Intelligence, pp. 239-249. Presented at 9th International Conference on Agents and Artificial Intelligence (ICAART), 24 February 2017 - 26 February 2017, Portugal.

Available under License: See the attached licence file.

Official URL: http://www.icaart.org/

Abstract

Twitter, a microblogging online social network (OSN), has quickly gained prominence as it provides people with the opportunity to communicate and share posts and topics. Tremendous value lies in automatically analysing and reasoning about such data in order to derive meaningful insights, which carries potential opportunities for businesses, users, and consumers. However, the sheer volume, noise, and dynamism of Twitter impose challenges that hinder the efficacy of observing clusters with high intra-cluster similarity (i.e. minimum variance) and low inter-cluster similarity. This review focuses on research that has used various clustering algorithms to analyse Twitter data streams and identify hidden patterns in tweets, where text is highly unstructured. This paper performs a comparative analysis of unsupervised learning approaches in order to determine whether empirical findings support the enhancement of decision support and pattern recognition applications. A review of the literature identified 13 studies that implemented different clustering methods. A comparison covering clustering methods, algorithms, number of clusters, dataset size, distance measure, clustering features, evaluation methods, and results was conducted. The conclusion reports that the use of unsupervised learning in mining social media data has several weaknesses. Success criteria and future directions for research and practice are discussed for the research community.

Item Type:	Conference or Workshop Item (Paper)
Published Proceedings:	Proceedings of the 9th International Conference on Agents and Artificial Intelligence
Volume:	2
Peer-reviewed:	Yes
Date Deposited:	14 Mar 2017 09:32
Publisher:	Science and Technology Publications (SCITEPRESS)/Springer Books
Additional Information:	This is an Author Accepted Manuscript of a paper accepted for publication in Proceedings of the 9th International Conference on Agents and Artificial Intelligence, copyright SCITEPRESS – Science and Technology Publications.
Divisions:	Faculties > Science and Engineering
Subject terms:	Twitter, Social Network Analysis, Data Mining, Machine Learning
URI:	https://e-space.mmu.ac.uk/id/eprint/617901
DOI:	https://doi.org/10.5220/0006202802390249
