Larner, S (2014) A preliminary investigation into the use of fixed formulaic sequences as a marker of authorship. International Journal of Speech Language and the Law, 21 (1). ISSN 1748-8885
Abstract
This research unites the theory of formulaic language (prefabricated sequences of words believed to be stored as holistic units) and the practice of forensic authorship attribution, with a view to developing a new marker of authorship. It stands to reason that, since formulaic sequences are holistically processed as single lexical items, they are likely to elude a writer’s attempts to disguise their style. Furthermore, evidence suggests that individuals have different stores of formulaic sequences. Therefore, research into differences in formulaic language usage may assist in the development of new tools for authorship attribution. To test this assertion, a reference list containing 13,412 formulaic sequences was compiled from multiple online sources. This list was then used to identify formulaic sequences in a 20-author corpus containing 100 personal narratives. After exploring the types of formulaic sequences used by authors, statistical tests were used to determine whether the count of formulaic words was sufficient to establish variation between authors and to attribute a Questioned Text to its author.
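As a rough illustration of what counting formulaic words against a reference list could look like, the Python sketch below matches a list of sequences against a text and reports how many words fall inside a match. It is not the procedure used in the paper: the lowercase tokenization, the greedy longest-match strategy, and the names `tokenize` and `count_formulaic_words` are all assumptions made for this example, and the two-item reference list is a toy stand-in for the 13,412-item list.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", text.lower())


def count_formulaic_words(text, reference_list):
    """Return (formulaic_word_count, total_word_count) for a text.

    Hypothetical helper: at each position it takes the longest sequence
    from the reference list that matches, so matched spans never overlap.
    """
    tokens = tokenize(text)
    sequences = {tuple(tokenize(s)) for s in reference_list}
    sequences.discard(())  # ignore entries that contain no word tokens
    max_len = max((len(s) for s in sequences), default=0)

    formulaic = 0
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in sequences:
                matched = n
                break
        if matched:
            formulaic += matched
            i += matched
        else:
            i += 1
    return formulaic, len(tokens)


# Toy reference list and text, invented for illustration only.
reference = ["at the end of the day", "by and large"]
sample = "At the end of the day, it was by and large fine."
print(count_formulaic_words(sample, reference))  # (9, 12)
```

Per-text counts of this kind, normalised by text length, are the sort of quantity that could then be compared across authors with a statistical test; which test is appropriate is left open here.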