Manchester Metropolitan University's Research Repository

    Explainable YouTube video identification using sufficient input subsets

    Afandi, Waleed, Bukhari, Syed Muhammad Ammar Hassan, Khan, Muhammad US, Maqsood, Tahir, Fayyaz, Muhammad AB, Ansari, Ali R and Nawaz, Raheel (2023) Explainable YouTube video identification using sufficient input subsets. IEEE Access, 11. pp. 33178-33188. ISSN 2169-3536

    Published Version
    Available under License Creative Commons Attribution Non-commercial No Derivatives.

    Neural network models are black boxes by nature: the mechanics behind their predictions are difficult to explain. Insight into the patterns these models identify can help reveal important properties of the subject under study. Such artificial intelligence based algorithms are used for prediction across many domains. This research focuses on patterns in network traffic that can be leveraged to identify videos streaming over the network. The proposed work applies the sufficient input subset (SIS) method to two separate video identification techniques to understand and explain the patterns they detect. The first technique creates fingerprints of videos with a period-based algorithm to handle variable bitrate inconsistencies; these fingerprints are passed to a convolutional neural network (CNN) for pattern recognition. The second technique is based on traffic pattern plot identification: it plots packet size against time for each stream and passes the plot to a CNN as an image. For model explainability, SIS identifies a subset of input features that is sufficient for the model to reach the same prediction at a given confidence threshold. The SIS generated for each input sample are clustered using DBSCAN, K-Means, and cosine-based hierarchical clustering. The clustered SIS highlight the common patterns for each class. The SIS patterns learnt by each model for three individual videos are discussed. Furthermore, these patterns are used to investigate misclassifications and to provide a rationale for them, which helps justify the classifier's decisions.
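
The idea of a sufficient input subset can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the commonly described two-step procedure: backward selection to rank features, then adding features back until the confidence threshold is met. The names `back_select`, `find_sis`, `toy_confidence`, the mask value of 0, and the toy weights are all hypothetical stand-ins; in the paper's setting, `f` would be the trained CNN's confidence for the predicted video class.

```python
import math


def back_select(f, x, mask_value=0.0):
    """Rank features by repeatedly masking the one whose removal hurts f least.

    Returns feature indices in masking order (least important first).
    """
    current = list(x)
    remaining = list(range(len(current)))
    order = []
    while remaining:
        def score(i):
            trial = list(current)
            trial[i] = mask_value
            return f(trial)

        # Mask the feature that leaves the confidence highest.
        best = max(remaining, key=score)
        current[best] = mask_value
        remaining.remove(best)
        order.append(best)
    return order


def find_sis(f, x, tau, mask_value=0.0):
    """Return a small feature subset whose values alone keep f >= tau.

    Returns None if even the full input does not reach the threshold.
    """
    order = back_select(f, x, mask_value)
    masked = [mask_value] * len(x)
    subset = []
    # Re-introduce features from most to least important.
    for i in reversed(order):
        subset.append(i)
        masked[i] = x[i]
        if f(masked) >= tau:
            return sorted(subset)
    return None


def toy_confidence(x):
    """Hypothetical stand-in for a classifier's confidence (logistic model)."""
    weights = [4.0, 0.1, 3.0, 0.0]  # assumed weights, for illustration only
    z = sum(w * v for w, v in zip(weights, x)) - 2.0
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    sample = [1.0, 1.0, 1.0, 1.0]
    print(find_sis(toy_confidence, sample, tau=0.9))
```

In this toy case the two heavily weighted features are enough to keep the confidence above the threshold, so they form the subset. Subsets obtained this way for many samples could then be clustered, as the abstract describes, to surface patterns shared within a class.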
