e-space
Manchester Metropolitan University's Research Repository

    Contour-Guided Context Learning for Scene Text Recognition

    Hsieh, WC, Hsu, GS (ORCID: https://orcid.org/0000-0003-2631-0448), Chen, JY, Yap, MH (ORCID: https://orcid.org/0000-0001-7681-4287) and Chao, ZC (2025) Contour-Guided Context Learning for Scene Text Recognition. In: Pattern Recognition: 27th International Conference, ICPR 2024, Kolkata, India, December 1–5, 2024, Proceedings, Part XX, pp. 103-117. Presented at 27th International Conference, ICPR 2024, 1 December 2024 – 5 December 2024, Kolkata, India.

    Accepted Version
    File will be available on: 5 December 2025.
    Available under License In Copyright.


    Abstract

    We propose contour-guided context learning (CCL) for bilingual scene text recognition (STR). The CCL framework consists of three parts: a Contour-Guided Transformer (CGT), a Contextual Learning Transformer (CLT), and a Multimodal Transformer (MMT) for fusion. The CGT embeds a CLIP image encoder, exploiting CLIP's pre-training to capture contour features from input images, while the CLT embeds a CLIP text encoder to correct contextual errors. The fusion network incorporates attention features extracted by the Transformer to enhance text recognition performance. Unlike most STR methods, which target English only, the proposed CCL is designed to handle both English and Chinese, including irregularly shaped scene text. We conduct a comprehensive evaluation on Chinese and English benchmark datasets to validate the performance of our approach against state-of-the-art methods.
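    The abstract describes an attention-based fusion of visual contour features (from the CGT branch) and textual context features (from the CLT branch). Since the accepted manuscript is under embargo, the paper's exact fusion network is not reproduced here; the following is a minimal NumPy sketch of one plausible mechanism, cross-attention with a residual connection, in which all tensor shapes, the random features, and the fusion rule are illustrative assumptions rather than the authors' implementation.

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax over the given axis.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def cross_attention(query, key, value):
        # Scaled dot-product attention: each query token attends
        # over the key/value tokens of the other modality.
        d = query.shape[-1]
        scores = query @ key.T / np.sqrt(d)
        return softmax(scores, axis=-1) @ value

    # Hypothetical stand-ins for the two CCL branches (shapes are illustrative):
    rng = np.random.default_rng(0)
    contour_feats = rng.standard_normal((25, 64))   # CGT: visual/contour tokens
    context_feats = rng.standard_normal((10, 64))   # CLT: textual context tokens

    # MMT-style fusion sketch: contour tokens attend to the contextual tokens,
    # and the attended context is added back as a residual.
    fused = contour_feats + cross_attention(contour_feats, context_feats, context_feats)
    print(fused.shape)  # (25, 64)
    ```

    In a trained model the query/key/value streams would pass through learned projections and multiple heads; the sketch above only illustrates how a multimodal transformer can let one modality condition the other.
    
    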
