Manchester Metropolitan University's Research Repository

The False-Positive Rate of Automated Plagiarism Detection for SQL Assessments

Kleerekoper, Anthony (ORCID: https://orcid.org/0000-0002-3621-8568) and Schofield, Andrew (2019) The False-Positive Rate of Automated Plagiarism Detection for SQL Assessments. In: UK & Ireland Computing Education Research (UKICER), 05 September 2019 - 06 September 2019, Canterbury, England.

Accepted Version


Automated assessment is becoming increasingly common in Computer Science and with it automated plagiarism detection is also common. However, little attention has been paid to SQL assessment where submissions are much shorter and must be less varied than in imperative languages. This brings the challenge of avoiding high false-positive rates that require manual inspection and undermine the usefulness of automated detection. In this paper we investigate the false-positive rate of various automated plagiarism detection algorithms. We find that there is a significant false-positive rate of between 15% and 64%. These results call into question the usefulness of automated detection for SQL since they imply that a lot of manual inspection will still be needed. However, our results suggest that the false-positive rate may be restricted to shorter queries (e.g. under 200 characters). Further research is needed because our datasets consist mostly of short queries and the results for longer queries are based on a small subset of the data.
