FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators
| Published in: | IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 39-46 |
|---|---|
| Main authors: | , , , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 6 May 2024 |
| ISSN: | 2993-2114 |
| Summary: | NVIDIA Tensor Cores and AMD Matrix Cores (together called Matrix Accelerators) are of growing interest in high-performance computing and machine learning owing to their high performance. Unfortunately, some of their crucial numerical attributes pertaining to departures from full IEEE floating-point compatibility are not documented. This makes it impossible to reliably port codes across these differing accelerators. This paper contributes a collection of Feature-Targeted Tests for Numerical Properties (FTTN) that help determine these features across five floating-point formats and four rounding modes, along with additional tests that highlight the rounding behaviors and the preservation of extra precision bits. To show the practical relevance of FTTN, we design a simple matrix-multiplication test informed by insights gathered from our feature tests. We executed this very simple test on five platforms, producing different answers: V100, A100, and MI250X produced 0, MI100 produced 255.875, and Hopper H100 produced 191.875. Our matrix-multiplication tests employ patterns found in iterative refinement-based algorithms, highlighting the need to check for significant result variability when porting code across GPUs. |
|---|---|
| DOI: | 10.1109/CCGrid59990.2024.00014 |
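
The summary attributes the differing results to rounding behavior and to how many extra precision bits an accelerator keeps while accumulating. The following is a minimal software sketch of that effect, not the FTTN tests themselves and not a model of any specific GPU: the helper names `quantize` and `dot`, the input values, and the choice of 11 and 24 significand bits are all assumptions made for illustration.

```python
import math

def quantize(x, bits, mode):
    """Round x to `bits` significand bits (hypothetical helper).

    mode is "nearest" (round half to even) or "toward_zero" (truncate).
    Exponent range is not limited, so overflow/underflow is not modeled.
    """
    if x == 0:
        return 0.0
    m, e = math.frexp(x)          # x == m * 2**e with 0.5 <= |m| < 1
    scaled = m * (1 << bits)      # exact: scaling by a power of two
    if mode == "nearest":
        q = round(scaled)         # Python rounds ties to even
    elif mode == "toward_zero":
        q = math.trunc(scaled)
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return math.ldexp(q, e - bits)

def dot(a, b, bits, mode):
    """Dot product whose accumulator is re-rounded after every addition."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = quantize(acc + x * y, bits, mode)
    return acc

# Illustrative inputs: large terms cancel, small terms carry the answer.
a = [1024.0, 0.75, 0.75, -1024.0]
b = [1.0, 1.0, 1.0, 1.0]

narrow_nearest = dot(a, b, bits=11, mode="nearest")       # narrow accumulator
narrow_truncate = dot(a, b, bits=11, mode="toward_zero")  # same width, truncating
wide_nearest = dot(a, b, bits=24, mode="nearest")         # extra bits preserved

print(narrow_nearest, narrow_truncate, wide_nearest)
```

With these inputs the exact sum is 1.5. The wide accumulator reproduces it, the narrow round-to-nearest accumulator returns 2.0, and the narrow truncating accumulator returns 0.0, so the same four products give three different answers depending only on accumulator width and rounding mode.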