FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space
| Published in: | IEEE Transactions on Parallel and Distributed Systems, Vol. 35, No. 7, pp. 1174-1188 |
|---|---|
| Main authors: | , , , , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2024 |
| ISSN: | 1045-9219, 1558-2183 |
| Abstract: | Hyper-parameter tuning (HPT) for deep learning (DL) models is prohibitively expensive. Sequential model-based optimization (SMBO) has emerged as the state-of-the-art (SOTA) approach to automatically optimizing HPT performance due to its heuristic advantages. Unfortunately, by focusing on algorithm optimization rather than a large-scale parallel HPT system, existing SMBO-based approaches still cannot effectively remove their strong sequential nature, posing two performance problems: (1) extremely low tuning speed and (2) sub-optimal model quality. In this paper, we propose FastTuning, a fast, scalable, and generic system aimed at accelerating SMBO-based HPT for large DL/ML models in parallel. The key is to partition the highly complex search space into multiple smaller sub-spaces, each of which is assigned to and optimized by a different tuning worker in parallel. However, determining the right level of resource allocation to strike a balance between quality and cost remains a challenge. To address this, we further propose NIMBLE, a dynamic scheduling strategy designed specifically for FastTuning, comprising (1) a Dynamic Elimination Algorithm, (2) Sub-space Re-division, and (3) Posterior Information Sharing. Finally, we incorporate 6 SOTAs (i.e., 3 tuning algorithms and 3 parallel tuning tools) into FastTuning. Experimental results on ResNet18, VGG19, ResNet50, and ResNet152 show that FastTuning consistently offers much faster tuning (up to $80\times$) with better accuracy (up to a 4.7% improvement), thereby enabling the application of automatic HPT to real-life DL models. |
| DOI: | 10.1109/TPDS.2024.3386939 |
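The abstract's core mechanism, partitioning a complex hyper-parameter search space into smaller sub-spaces and letting a separate tuning worker optimize each one in parallel before weaker candidates are eliminated, can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the single learning-rate dimension, the toy objective, and random search standing in for a real SMBO surrogate; the names `partition_space`, `smbo_worker`, and `toy_objective` are hypothetical and do not come from the paper.

```python
# Minimal sketch of search-space partitioning with parallel tuning workers.
# Not the paper's implementation: a real SMBO worker would fit a surrogate
# model and maximize an acquisition function instead of sampling uniformly.
import random
from concurrent.futures import ProcessPoolExecutor


def partition_space(lr_range, num_workers):
    """Split a 1-D learning-rate range into equal-width sub-spaces."""
    lo, hi = lr_range
    width = (hi - lo) / num_workers
    return [(lo + i * width, lo + (i + 1) * width) for i in range(num_workers)]


def toy_objective(lr):
    """Stand-in for validation accuracy; peaks at lr = 0.01 (assumed)."""
    return -abs(lr - 0.01)


def smbo_worker(sub_space, budget=20):
    """Optimize one sub-space independently; random search as SMBO stand-in."""
    lo, hi = sub_space
    best_lr, best_score = None, float("-inf")
    for _ in range(budget):
        lr = random.uniform(lo, hi)
        score = toy_objective(lr)
        if score > best_score:
            best_lr, best_score = lr, score
    return best_lr, best_score


if __name__ == "__main__":
    # Each sub-space is assigned to a different worker, run in parallel.
    sub_spaces = partition_space((1e-4, 1.0), num_workers=4)
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(smbo_worker, sub_spaces))
    # Crude stand-in for dynamic elimination: keep the best worker's result.
    best_lr, best_score = max(results, key=lambda r: r[1])
    print(f"best lr = {best_lr:.4f} (score {best_score:.4f})")
```

A system along the paper's lines would go well beyond this sketch: it searches multi-dimensional spaces, runs a surrogate-based tuner per worker, and applies NIMBLE's dynamic elimination, sub-space re-division, and posterior information sharing during the search rather than a single selection at the end.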