Ali, Adnan, Li, Jinlong, Chen, Huanhuan and Bashir, Ali Kashif ORCID: https://orcid.org/0000-0001-7595-2522 (2022) Temporal pattern mining from user generated content. Digital Communications and Networks, 8 (6). pp. 1027-1039. ISSN 2352-8648
Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Faster internet, IoT, and social media have transformed the conventional web into a collaborative web, resulting in enormous volumes of user-generated content. Several studies examine such content; however, they mainly focus on textual data and thus underestimate the importance of metadata. To address this gap, we provide a temporal pattern mining framework to model and utilize the metadata of user-generated content. First, we scrape 2.1 million tweets from Twitter between November 2020 and September 2021 about 100 hashtag keywords and represent these tweets as 100 User-Tweet-Hashtag (UTH) dynamic graphs. Second, we extract and identify four time series in three timespans (Day, Hour, and Minute) from the UTH dynamic graphs. Lastly, we model these four time series with three machine learning algorithms to mine temporal patterns with accuracies of 95.89%, 93.17%, 90.97%, and 93.73%, respectively. We demonstrate that the metadata of user-generated content contains valuable information that helps in understanding users' collective behavior and can be beneficial for business and research. The dataset and code are publicly available; the link is given in the dataset section.
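As a rough illustration of what extracting a time series at the Day, Hour, and Minute timespans can look like, the sketch below counts tweets per time bucket using only the Python standard library. It is not the authors' code: the abstract does not say which four time series are extracted, so tweet counts are an assumed stand-in, and the sample timestamps, the `BUCKET_FORMATS` table, and the `count_series` helper are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample of tweet timestamps (not taken from the paper's dataset).
timestamps = [
    datetime(2021, 1, 5, 9, 15, 10),
    datetime(2021, 1, 5, 9, 15, 48),
    datetime(2021, 1, 5, 9, 40, 2),
    datetime(2021, 1, 5, 17, 3, 30),
    datetime(2021, 1, 6, 8, 0, 0),
]

# strftime patterns that truncate a timestamp to each timespan.
# The timespan names follow the abstract; the mapping itself is an assumption.
BUCKET_FORMATS = {
    "Day": "%Y-%m-%d",
    "Hour": "%Y-%m-%d %H",
    "Minute": "%Y-%m-%d %H:%M",
}


def count_series(stamps, timespan):
    """Return {bucket label: tweet count}, ordered by bucket, for one timespan."""
    fmt = BUCKET_FORMATS[timespan]
    counts = Counter(ts.strftime(fmt) for ts in stamps)
    return dict(sorted(counts.items()))


# One tweet-count series per timespan.
series = {span: count_series(timestamps, span) for span in BUCKET_FORMATS}

for span, values in series.items():
    print(span, values)
```

Buckets with no tweets are simply absent from each series here; a modelling step would likely need them filled with zeros first.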