Manchester Metropolitan University's Research Repository

    A framework for applying short text semantic similarity in goal-oriented conversational agents

    O'Shea, James (ORCID: https://orcid.org/0000-0001-5645-2370) (2010) A framework for applying short text semantic similarity in goal-oriented conversational agents. Doctoral thesis (PhD), Manchester Metropolitan University.




    Existing Conversational Agents (CAs) have several disadvantages. The most serious is that the CAs that humans find most coherent and intelligent are based on pattern matching, a technique which is labour-intensive and results in CAs that are difficult to maintain. The main alternative technique, Natural Language Processing, produces CAs which have a high computational complexity and are unlikely to scale well when used by large numbers of people. These limitations have prevented CAs from realising their huge potential in practical applications. This thesis concerns a framework for the development of a new generation of CAs. The key component is Short Text Semantic Similarity (STSS). Replacing pattern matching rules with a measurement of the similarity between user utterances and prototype statements results in CAs which are simple to develop and maintain, and which are also computationally efficient. STSS algorithms are a recent development, and a method is required to evaluate and compare the stream of newly emerging algorithms before they are incorporated into CAs. This thesis investigated the development of benchmark datasets for the evaluation of such algorithms. A second strand of work concerned the development of a new model of STSS, taking account of Dialogue Acts and Valence, two factors which have not been considered in previous models. The benefits and achievements of this work include identification of the best methodology for obtaining ground truth similarity from human raters, the production of two gold standard benchmark datasets for evaluation of STSS measures, the proposal of a factor-based model of STSS, and the development of a set of computationally efficient classifiers for the question dialogue act.
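
    The idea of matching a user utterance against prototype statements, rather than against hand-written patterns, can be illustrated with a minimal sketch. It is not the thesis's method: the similarity function here is plain word overlap (Jaccard), a crude stand-in for a real STSS measure, and the function names, prototype statements, and threshold are all hypothetical.

```python
import re


def tokens(text):
    """Lower-case a short text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def similarity(a, b):
    """Jaccard word overlap in [0, 1]; a placeholder for an STSS measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_prototype(utterance, prototypes, threshold=0.3):
    """Return (label, score) of the closest prototype statement,
    or (None, score) if nothing reaches the (assumed) threshold."""
    best, score = None, 0.0
    for label, statement in prototypes.items():
        s = similarity(utterance, statement)
        if s > score:
            best, score = label, s
    return (best, score) if score >= threshold else (None, score)


# Hypothetical prototype statements, one per response the agent can give.
PROTOTYPES = {
    "check_balance": "I want to know my account balance",
    "report_lost_card": "I have lost my bank card",
}

label, score = best_prototype("I lost my card", PROTOTYPES)
print(label, round(score, 2))
```

    In such a design each rule needs only one prototype statement instead of many patterns, and swapping in a better similarity measure would leave the rest of the agent unchanged.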
