e-space
Manchester Metropolitan University's Research Repository

    Performance Analysis of Distributed Deep Learning Frameworks in a Multi-GPU Environment

    Kavarakuntla, Tulasi, Han, Liangxiu (ORCID: https://orcid.org/0000-0003-2491-7473), Lloyd, Huw (ORCID: https://orcid.org/0000-0001-6537-4036), Latham, Annabel (ORCID: https://orcid.org/0000-0002-8410-7950) and Akintoye, Samson B (2022) Performance Analysis of Distributed Deep Learning Frameworks in a Multi-GPU Environment. In: 2021 20th International Conference on Ubiquitous Computing and Communications, 20 December 2021 - 22 December 2021, London, UK.

    Abstract

    Deep learning frameworks such as TensorFlow, MXNet, and Chainer provide many basic building blocks for designing effective neural network models for various applications (e.g. computer vision, speech recognition, natural language processing). However, the run-time performance of these frameworks varies significantly, even when training identical deep network models on the same GPUs. This study presents an experimental analysis and a performance model for assessing deep learning models (Convolutional Neural Networks (CNNs), Multilayer Perceptrons (MLPs), and Autoencoders) on three frameworks, TensorFlow, MXNet, and Chainer, in a multi-GPU environment. We analyse the factors that influence each framework's performance by computing its running time under our proposed model, taking the load imbalance factor into account. The evaluation results highlight significant differences in the scalability of the frameworks and the importance of load balance in parallel distributed deep learning.
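
    The running-time model mentioned in the abstract accounts for a load imbalance factor across GPUs. The paper gives the exact formulation; as a rough illustration only, the Python sketch below assumes a conventional definition, the ratio of the slowest worker's step time to the mean across workers. The function and variable names are hypothetical, not taken from the paper.

        from statistics import mean

        def load_imbalance_factor(gpu_step_times):
            """Ratio of the slowest GPU's step time to the mean across GPUs.

            1.0 indicates perfect balance; larger values mean the slowest
            worker increasingly dominates each synchronous training step.
            """
            return max(gpu_step_times) / mean(gpu_step_times)

        # Example: per-step times (seconds) measured on four GPUs.
        times = [0.92, 0.95, 0.97, 1.30]
        print(f"Imbalance factor: {load_imbalance_factor(times):.2f}")  # ~1.26

    Under this definition, a synchronous data-parallel step cannot finish before the slowest GPU does, so the factor directly scales the per-step running time relative to the perfectly balanced case.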
