Manchester Metropolitan University's Research Repository

    Performance Analysis of Distributed Deep Learning Frameworks in a Multi-GPU Environment

    Kavarakuntla, Tulasi, Han, Liangxiu (ORCID: https://orcid.org/0000-0003-2491-7473), Lloyd, Huw (ORCID: https://orcid.org/0000-0001-6537-4036), Latham, Annabel (ORCID: https://orcid.org/0000-0002-8410-7950) and Akintoye, Samson B (2022) Performance Analysis of Distributed Deep Learning Frameworks in a Multi-GPU Environment. In: 2021 20th International Conference on Ubiquitous Computing and Communications, 20 December 2021 - 22 December 2021, London, UK.

    Accepted Version


    Deep learning frameworks such as TensorFlow, MXNet, and Chainer provide many basic building blocks for designing effective neural network models for various applications (e.g. computer vision, speech recognition, natural language processing). However, the run-time performance of these frameworks varies significantly, even when training identical deep network models on the same GPUs. This study presents an experimental analysis and a performance model for assessing deep learning models (Convolutional Neural Networks (CNNs), Multilayer Perceptrons (MLPs), and Autoencoders) on three frameworks, TensorFlow, MXNet, and Chainer, in a multi-GPU environment. We analyse the factors that influence these frameworks' performance by computing the running time of each framework in our proposed model, taking the load imbalance factor into account. The evaluation results highlight significant differences in the scalability of the frameworks and the importance of load balance in parallel distributed deep learning.
