On the Creation of a Fuzzy Dataset for the Evaluation of Fuzzy Semantic Similarity Measures

Chandran, D, Crockett, K and Mclean, D (2014) On the Creation of a Fuzzy Dataset for the Evaluation of Fuzzy Semantic Similarity Measures. In: 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE).

Available under License In Copyright.

Abstract

Short text semantic similarity (STSS) measures are algorithms designed to compare short texts and return a level of similarity between them. Until recently, however, such measures have ignored perception-based or fuzzy words (e.g. very hot, cold, less cold) when calculating both word and sentence similarity. Such measures are usually evaluated using benchmark datasets comprising rigorously collected sentence pairs that have been rated by human participants. A weakness of these datasets is that the sentence pairs include few, if any, fuzzy words, which makes them impractical for evaluating fuzzy sentence similarity measures. In this paper, a method is presented for the creation of a new benchmark dataset known as SFWD (Single Fuzzy Word Dataset). After creation, the dataset is used in the evaluation of FAST, an ontology-based fuzzy algorithm for semantic similarity testing that uses concepts of fuzzy logic and computing with words to allow for the accurate representation of fuzzy words. The SFWD is then used to undertake a comparative analysis of other established STSS measures.
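
The benchmark-style evaluation described in the abstract can be sketched roughly as follows: score each sentence pair with a similarity measure, then correlate those scores with the human ratings. This is only an illustrative sketch, not the paper's method. The use of Pearson correlation is an assumption (the abstract does not name a statistic), `word_overlap` is a toy stand-in rather than FAST or any published STSS measure, and the sentence pairs and ratings are invented for illustration and are not taken from SFWD.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def word_overlap(a, b):
    """Toy stand-in measure (hypothetical): Jaccard overlap of word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def evaluate(measure, pairs, human_ratings):
    """Correlate a measure's scores with human similarity ratings."""
    scores = [measure(a, b) for a, b in pairs]
    return pearson(scores, human_ratings)


# Hypothetical sentence pairs and ratings, invented for illustration only.
pairs = [
    ("the tea is very hot", "the tea is very hot"),
    ("the tea is very hot", "the tea is hot"),
    ("the tea is very hot", "the room is cold"),
]
human_ratings = [1.0, 0.8, 0.1]

r = evaluate(word_overlap, pairs, human_ratings)
print(f"correlation with human ratings: {r:.3f}")
```

A purely lexical measure like this one treats "very hot" and "hot" as differing by a single token and has no notion of how far apart "hot" and "cold" are, which is the kind of gap a fuzzy-word benchmark is meant to expose.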

Item Type:	Conference or Workshop Item
Peer-reviewed:	No
Date Deposited:	18 May 2016 11:56
Publisher:	IEEE
Additional Information:	This is an Author Final Copy of a paper published in the 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), published by and copyright IEEE.
Divisions:	Faculties > Science and Engineering
URI:	https://e-space.mmu.ac.uk/id/eprint/609599
ISSN:	1098-7584
