e-space
Manchester Metropolitan University's Research Repository

Extracting multiword expressions with a semantic tagger

Piao, S and Rayson, P and Archer, DE and Wilson, A and McEnery, T (2003) Extracting multiword expressions with a semantic tagger. In: ACL 2003, 41st Annual Meeting of the Association for Computational Linguistics, 07 July 2003 - 12 July 2003, Sapporo, Japan.

Abstract

Automatic extraction of multiword expressions (MWE) presents a tough challenge for the NLP community and corpus linguistics. Although various statistically driven or knowledge-based approaches have been proposed and tested, efficient MWE extraction still remains an unsolved issue. In this paper, we present our research work in which we tested approaching the MWE issue using a semantic field annotator. We use an English semantic tagger (USAS) developed at Lancaster University to identify multiword units which depict single semantic concepts. The Meter Corpus (Gaizauskas et al., 2001; Clough et al., 2002) built in Sheffield was used to evaluate our approach. In our evaluation, this approach extracted a total of 4,195 MWE candidates, of which, after manual checking, 3,792 were accepted as valid MWEs, producing a precision of 90.39% and an estimated recall of 39.38%. Of the accepted MWEs, 68.22% or 2,587 are low frequency terms, occurring only once or twice in the corpus. These results show that our approach provides a practical solution to MWE extraction.
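
The precision and low-frequency figures quoted in the abstract follow directly from the reported counts. The short sketch below reproduces that arithmetic as a sanity check. The helper name `percentage` and the variable names are illustrative assumptions, not taken from the paper. The estimated recall is not recomputed, because the abstract does not state the reference count it was measured against.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100.0 * part / whole, 2)


# Counts as reported in the abstract.
candidates = 4195      # MWE candidates extracted by the tagger
accepted = 3792        # candidates accepted as valid after manual checking
low_frequency = 2587   # accepted MWEs occurring only once or twice

# Precision: share of extracted candidates that were accepted.
precision = percentage(accepted, candidates)

# Share of accepted MWEs that are low-frequency terms.
low_frequency_share = percentage(low_frequency, accepted)

print(f"precision: {precision}%")
print(f"low-frequency share: {low_frequency_share}%")
```

Both values agree with the abstract's 90.39% and 68.22%.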
