e-space
Manchester Metropolitan University's Research Repository

    Foundations of Expected Points in Rugby Union: A Methodological Approach

    Martinez-Arastey, Guillermo, Datson, Naomi (ORCID: https://orcid.org/0000-0002-5507-9540), Smith, Neal and Robins, Matthew (2025) Foundations of Expected Points in Rugby Union: A Methodological Approach. Journal of Sports Analytics. ISSN 2215-020X (In Press)

    Accepted Version
    File not available for download.
    Available under License Creative Commons Attribution.


    Abstract

    This study explores the feasibility of an Expected Points metric for rugby union, aiming to shift performance analysis from descriptive indicators to a predictive metric of possession quality. Notational analysis was conducted on 132 Premiership Rugby matches, producing a dataset of 35,199 unique phases of play containing variables such as team in possession, pitch location, play type, score differences, time remaining and scoring outcomes. Four machine learning algorithms were explored to predict scoring outcomes: multinomial logistic regression, random forest, support vector machine and k-nearest neighbors. After extensive feature engineering and hyperparameter optimisation, the best-performing model achieved 39.7% accuracy, below a literature-derived baseline for practical usability (44.3%), making it unsuitable for applied contexts. A key challenge was predicting minority scoring outcomes due to severe class imbalance. SMOTE was explored to address this imbalance; it lowered accuracy to 35.7% but improved the F1-score to 34.4%. This study highlights the limitations of modelling scoring outcomes in open-play team sports, challenging the predominant positivist paradigm in sports performance analysis. The methodology provides critical foundational groundwork and a benchmark for future research to build upon. The study recommends exploring advanced samplers for minority classes, expanded feature sets and alternative modelling techniques, such as recurrent neural networks.
