Unveiling group activity recognition: Leveraging Local–Global Context-Aware Graph Reasoning for enhanced actor–scene interactions

Jiang, X, Qing, L ORCID: https://orcid.org/0000-0003-3555-0005, Huang, J, Guo, L ORCID: https://orcid.org/0000-0003-1272-8480 and Peng, Y (2024) Unveiling group activity recognition: Leveraging Local–Global Context-Aware Graph Reasoning for enhanced actor–scene interactions. Engineering Applications of Artificial Intelligence, 133. 108412. ISSN 0952-1976

Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: http://dx.doi.org/10.1016/j.engappai.2024.108412

Abstract

Group activity recognition aims to recognize the holistic activity in a multi-person scene, which requires considering the interactions between actors and their surroundings. It has various applications, such as public surveillance and video analysis. Nonetheless, existing works merely extract scene features as a supplementary component of activity features, failing to adequately explore the interplay between the scene and actors. To address this limitation, this paper proposes a Local–Global Context-Aware Graph Reasoning Model (LG-CAGR), which leverages and reasons through local and global context features to gain deeper insights into group activity within the scene. In particular, we present an innovative feature extraction strategy that harnesses local location features and global scene attributes, complementing actor features by capturing spatial group topology and determining relative positions between actors. Subsequently, we exploit these features by devising a local–global group reasoning module that deduces pair-wise interactions between actors and scenes within a Graph Convolutional Network, elucidating correlations between the overall scene and local individuals to construct group-level features. Multiple graphs are constructed with actor features and scene features as nodes and interactions as edges. A self-attention graph pooling network is introduced to automatically integrate key actor features and form rich group-level features for recognizing group activity. On the Collective Activity Dataset, Collective Activity Extended Dataset, Volleyball Dataset and Public Life in Public Space dataset, the method reaches 94.0%, 97.7%, 92.7% and 56.1%, respectively. Compared with existing methods using the same backbone, these results are higher by 1%, 2.1%, 0.3%, and 14.9%, respectively, affirming the superiority of the proposed method compared with state-of-the-art methods.
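
The pipeline described in the abstract (actor and scene features as graph nodes, pair-wise interactions as edges, graph convolution, then self-attention pooling into one group-level feature) can be illustrated with a small sketch. This is not the authors' LG-CAGR implementation: the dot-product adjacency, the single-layer convolution, the top-k gated readout, and every name, dimension, and random weight below are assumptions chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_adjacency(nodes):
    """Edge weights from scaled dot-product similarity, normalised per row (assumed form)."""
    dim = len(nodes[0])
    scores = [
        [sum(a * b for a, b in zip(u, v)) / math.sqrt(dim) for v in nodes]
        for u in nodes
    ]
    return [softmax(row) for row in scores]


def graph_conv(adj, nodes, weight):
    """One graph convolution step: ReLU(A X W)."""
    out = matmul(matmul(adj, nodes), weight)
    return [[max(0.0, v) for v in row] for row in out]


def attention_pool(adj, nodes, score_vec, keep):
    """Score nodes with tanh(A X p), keep the top `keep`, average their gated features."""
    aggregated = matmul(adj, nodes)
    scores = [math.tanh(sum(x * w for x, w in zip(row, score_vec))) for row in aggregated]
    top = sorted(range(len(nodes)), key=lambda i: scores[i], reverse=True)[:keep]
    dim = len(nodes[0])
    return [sum(nodes[i][j] * scores[i] for i in top) / keep for j in range(dim)]


random.seed(0)
feature_dim, hidden_dim = 4, 3

# Hypothetical inputs: three actor feature vectors plus one global scene vector.
actors = [[random.uniform(-1, 1) for _ in range(feature_dim)] for _ in range(3)]
scene = [random.uniform(-1, 1) for _ in range(feature_dim)]
nodes = actors + [scene]

# Random stand-ins for parameters that would normally be learned.
weight = [[random.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(feature_dim)]
score_vec = [random.uniform(-1, 1) for _ in range(hidden_dim)]

adj = pairwise_adjacency(nodes)
hidden = graph_conv(adj, nodes, weight)
group_feature = attention_pool(adj, hidden, score_vec, keep=2)
print(len(group_feature))
```

In a trained model the group-level feature would feed a classifier over activity labels; here the weights are random, so only the shapes and the data flow are meaningful.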

Item Type:	Article (Article)
Peer-reviewed:	Yes
Date Deposited:	10 Dec 2024 16:07
Publisher:	Elsevier
Additional Information:	This is an author accepted manuscript of an article published in Engineering Applications of Artificial Intelligence by Elsevier.
Divisions:	Organisation > Science and Engineering; Organisation > Science and Engineering > Department of Computing and Maths
Subject terms:	08 Information and Computing Sciences, 09 Engineering, Artificial Intelligence & Image Processing, 40 Engineering, 46 Information and computing sciences
Data Access Statement:	The authors do not have permission to share data.
URI:	https://e-space.mmu.ac.uk/id/eprint/637188
DOI:	https://doi.org/10.1016/j.engappai.2024.108412
ISSN:	0952-1976
