Linear Transformer Based U-Shaped Lightweight Segmentation Network
| Published in: | Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 29, no. 6, pp. 1319-1328 |
|---|---|
| Main Authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Tokyo, Fuji Technology Press Co. Ltd, 20.11.2025 |
| ISSN: | 1343-0130, 1883-8014 |
| Summary: | The widespread development and application of embedded medical devices necessitate corresponding research into lightweight, energy-efficient models. Although transformer-based segmentation models have shown promise in various visual tasks, inherent challenges, including a lack of inductive bias and an overreliance on extensive training data, emerge when striving for optimal model efficiency, so reliance solely on transformers does not meet the practical demands of lightweight models. By contrast, convolutional neural networks (CNNs), with their intrinsic inductive biases and parameter-sharing mechanisms, reduce the number of parameters and focus on capturing local features, thereby lowering computational costs. Hence, the integration of CNNs with transformers presents a promising research trajectory for constructing efficient and lightweight networks. This hybrid approach leverages the strengths of CNNs in local feature extraction and the ability of transformers to model global dependencies, achieving a balance between model performance and efficiency. In this paper, we propose MobileViTv2s, a novel lightweight segmentation network that integrates CNNs with a linear transformer. The proposed network extracts local features efficiently via CNNs, whereas the transformer component models complex feature relationships, thereby facilitating precise segmentation in intricate contexts such as medical imaging. The model demonstrates significant potential and applicability in the advancement of lightweight deep learning models. Experimental results show that the proposed model achieves up to a 14.34-fold improvement in efficiency, a 9.91-fold reduction in the number of parameters, and comparable or superior segmentation accuracy, together with a markedly lower Hausdorff distance. |
| DOI: | 10.20965/jaciii.2025.p1319 |