AToM: Adaptive Token Merging for Efficient Acceleration of Vision Transformer
| Published in: | IEEE Transactions on Computers, Vol. 74, No. 5, pp. 1620-1633 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | IEEE, 01.05.2025 |
| Subjects: | |
| ISSN: | 0018-9340, 1557-9956 |
| Online access: | Get full text |
| Summary: | Recently, Vision Transformers (ViTs) have set a new standard in computer vision (CV), showing unparalleled image processing performance. However, their substantial computational requirements hinder practical deployment, especially on the resource-limited devices common in CV applications. Token merging has emerged as a solution, condensing tokens with similar features to cut computational and memory demands. Yet existing applications on ViTs often fall short in token compression, with rigid merging strategies and a lack of in-depth analysis of ViT merging characteristics. To overcome these issues, this paper introduces Adaptive Token Merging (AToM), a comprehensive algorithm-architecture co-design for accelerating ViTs. The AToM algorithm employs an image-adaptive, fine-grained merging strategy, significantly boosting computational efficiency. We also optimize the merging and unmerging processes to minimize overhead, employing techniques such as First-Come-First-Merge mapping and Linear Distance Calculation. On the hardware side, the AToM architecture is tailor-made to exploit the AToM algorithm's benefits, with specialized engines for efficient merge and unmerge operations. Our pipeline architecture ensures end-to-end ViT processing, minimizing latency and memory overhead from the AToM algorithm. Across various hardware platforms, including CPU, EdgeGPU, and GPU, AToM achieves average end-to-end speedups of 10.9×, 7.7×, and 5.4×, alongside energy savings of 24.9×, 1.8×, and 16.7×. Moreover, AToM offers 1.2×-1.9× higher effective throughput compared to existing transformer accelerators. |
|---|---|
| DOI: | 10.1109/TC.2025.3540638 |
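The paper's own merging algorithm and hardware engines are described in the full text; purely as an illustration of the general token-merging idea the abstract refers to (condensing tokens with similar features and later unmerging them), the minimal sketch below greedily folds each token embedding into the first existing group whose representative is sufficiently similar, in the spirit of a first-come-first-merge assignment. The function names, the cosine-similarity threshold, and the running-mean averaging are illustrative assumptions, not the AToM method itself.

```python
import numpy as np

def merge_similar_tokens(tokens, threshold=0.9):
    """Greedily merge token embeddings by cosine similarity.

    Illustrative sketch only (not the AToM algorithm). `tokens` is an
    (N, D) array; returns the reduced (M, D) array plus a mapping from
    each original token to its merged group, so the merge can be undone.
    """
    normed = tokens / (np.linalg.norm(tokens, axis=1, keepdims=True) + 1e-12)
    mapping = np.arange(len(tokens))      # original index -> merged group
    merged, counts = [], []

    for i, t in enumerate(tokens):
        assigned = False
        for g, rep in enumerate(merged):
            rep_dir = rep / (np.linalg.norm(rep) + 1e-12)
            # Fold the token into the first sufficiently similar group
            # (a first-come-first-merge style assignment, assumed here).
            if float(normed[i] @ rep_dir) >= threshold:
                merged[g] = (rep * counts[g] + t) / (counts[g] + 1)  # running mean
                counts[g] += 1
                mapping[i] = g
                assigned = True
                break
        if not assigned:
            mapping[i] = len(merged)
            merged.append(np.asarray(t, dtype=float).copy())
            counts.append(1)

    return np.stack(merged), mapping

def unmerge_tokens(merged_tokens, mapping):
    """Restore the original sequence length by replicating merged tokens."""
    return merged_tokens[mapping]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(197, 64))                  # a ViT-sized token sequence
    y, idx = merge_similar_tokens(x, threshold=0.95)
    print(x.shape, "->", y.shape, "->", unmerge_tokens(y, idx).shape)
```

With this kind of scheme, downstream attention layers operate on the shorter merged sequence, and the stored mapping lets the original token count be restored wherever per-token outputs are needed; the paper's contribution is in making the merge decisions image-adaptive and cheap enough to pay off in hardware.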