Performance Characterization of Popular DNN Models on Out-of-Order CPUs

Bibliographic Details
Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 199-210
Main Authors: Prieto, Pablo; Abad, Pablo; Gregorio, Jose Angel; Puente, Valentin
Format: Conference Proceeding
Language: English
Published: IEEE 21.10.2023
Description
Summary: DNN popularity, which is driving advances in a growing number of fields, has increased the amount of computing resources running this kind of application at an unprecedented rate. Specialized hardware, such as GPUs or ASIC-based accelerators, has been the preferred platform to run these applications. However, the ubiquity of DNN models is rapidly extending the presence of this software to general-purpose CPUs. For this reason, there is a pressing need to understand the main features of state-of-the-art DNN models in order to adapt CPU microarchitecture accordingly. In this paper we investigated a representative set of DNN models and, based on data collected from real hardware, we evaluated how efficiently they utilize the underlying system. We analyzed overall system performance, as well as the amount of vectorization provided by CPU-optimized frameworks. We quantified the performance loss caused by the processor backend, and the contribution of the memory hierarchy and functional units to it. We compared the backend utilization of DNN applications to that of popular benchmarks such as SPEC CPU2017 and found a less balanced use of the elements that make up the processor microarchitecture. Although many workloads seem to be constrained by functional unit availability, in a significant group of applications we found a non-negligible impact of the memory hierarchy on performance.
DOI: 10.1109/PACT58117.2023.00025