Williams, Ashley ORCID: https://orcid.org/0000-0002-6888-0521 and Shardlow, Matthew (2022) Extending a corpus for assessing the credibility of software practitioner blog articles using meta-knowledge. In: EASE 2022: The International Conference on Evaluation and Assessment in Software Engineering, 13 June 2022 - 15 June 2022, Gothenburg, Sweden.
Accepted Version. Available under License: In Copyright.
Abstract
Practitioner-written grey literature, such as blog articles, has value in software engineering research. Such articles provide insight into practice that is often not visible to researchers. However, high quantity and varying quality are two major challenges in utilising such material. Quality is defined as an aggregate of a document's relevance to the consumer and its credibility. Credibility is often assessed through a series of conceptual criteria that are specific to a particular user group. For researchers, previous work has found 'argumentation' and 'evidence' to be two important criteria. In this paper, we extend a previously developed corpus by annotating at a broader granularity. We then investigate whether the original annotations (sentence level) can infer these new annotations (article level). Our preliminary results show that sentence-level annotations infer the overall credibility of an article with an F1 score of 91%. These results indicate that the corpus can help future studies in detecting the credibility of practitioner-written grey literature.
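The abstract describes inferring article-level credibility from sentence-level annotations and evaluating the result with an F1 score. The sketch below is a minimal illustration of that general idea, not the authors' pipeline: it aggregates hypothetical per-sentence 'argumentation' and 'evidence' labels into article-level features and scores a simple classifier with F1. The synthetic data, the article_features helper, and the logistic-regression model are all assumptions made for demonstration.

```python
# Illustrative sketch only: aggregating hypothetical sentence-level
# 'argumentation'/'evidence' annotations into article-level features,
# then scoring article-level credibility predictions with F1.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def article_features(sentence_labels):
    """Aggregate per-sentence binary labels (argumentation, evidence)
    into simple article-level features: proportions and raw counts."""
    labels = np.asarray(sentence_labels)          # shape: (n_sentences, 2)
    return np.concatenate([labels.mean(axis=0),   # proportion of sentences flagged
                           labels.sum(axis=0)])   # absolute counts per criterion

# Synthetic stand-in corpus: 200 articles, each a variable-length set of
# sentence annotations; 'credible' articles are given more flagged sentences.
articles, credible = [], []
for _ in range(200):
    n_sentences = rng.integers(5, 40)
    is_credible = rng.random() < 0.5
    flag_rate = 0.4 if is_credible else 0.1
    sentences = rng.random((n_sentences, 2)) < flag_rate
    articles.append(article_features(sentences))
    credible.append(int(is_credible))

X, y = np.array(articles), np.array(credible)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression().fit(X_tr, y_tr)
print(f"F1: {f1_score(y_te, clf.predict(X_te)):.2f}")
```

In a real study the synthetic annotations would be replaced by the corpus labels; the point here is only the shape of the sentence-to-article aggregation and the F1 evaluation mentioned in the abstract.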