SABA: Scale-adaptive Attention and Boundary Aware Network for real-time semantic segmentation

Published in: Expert Systems with Applications, Vol. 282, p. 127680
Main Authors: Luo, Huilan; Liu, Chunyan; Shark, Lik-Kwan
Medium: Journal Article
Language: English
Published: Elsevier Ltd, 05.07.2025
ISSN: 0957-4174
Description
Summary: Balancing accuracy and speed is crucial for semantic segmentation in autonomous driving. While various mechanisms have been explored to enhance segmentation accuracy in lightweight deep learning networks, adding more mechanisms does not always lead to better performance and often significantly increases processing time. This paper investigates a more effective and efficient integration of three key mechanisms — context, attention, and boundary — to improve real-time semantic segmentation of road scene images. Based on an analysis of recent fully convolutional encoder–decoder networks, we propose a novel Scale-adaptive Attention and Boundary Aware (SABA) segmentation network. SABA enhances context through a new pyramid structure with multi-scale residual learning, refines attention via scale-adaptive spatial relationships, and improves boundary delineation using progressive refinement with a dedicated loss function and learnable weights. Evaluations on the Cityscapes benchmark show that SABA outperforms current real-time semantic segmentation networks, achieving a mean intersection over union (mIoU) of up to 76.7% and improving accuracy for 17 out of 19 object classes. Moreover, it achieves this accuracy at an inference speed of up to 83.4 frames per second, significantly exceeding real-time video frame rates. The code is available at https://github.com/liuchunyan66/SABA.
DOI: 10.1016/j.eswa.2025.127680
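
The summary above describes SABA's boundary mechanism only at a high level: progressive refinement trained with a dedicated loss function and learnable weights. As a rough illustration of how a two-term loss with learnable weights can be wired up, the following PyTorch sketch combines a cross-entropy segmentation term with a binary boundary term using uncertainty-style weighting. The class name, the weighting scheme, and the Cityscapes ignore_index=255 convention are illustrative assumptions, not the paper's actual formulation, which is available at the linked repository.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryAwareLoss(nn.Module):
    # Hypothetical two-term loss: segmentation cross-entropy plus a binary
    # boundary loss, each scaled by a learnable weight (uncertainty-style
    # weighting). SABA's actual loss is defined in the paper and repository.
    def __init__(self):
        super().__init__()
        # Learnable log-variances, one per loss term.
        self.log_var_seg = nn.Parameter(torch.zeros(()))
        self.log_var_bnd = nn.Parameter(torch.zeros(()))

    def forward(self, seg_logits, seg_target, bnd_logits, bnd_target):
        # Per-pixel cross-entropy over the 19 Cityscapes classes; 255 marks
        # void pixels that are excluded from the loss.
        seg_loss = F.cross_entropy(seg_logits, seg_target, ignore_index=255)
        # Binary cross-entropy on a predicted boundary map.
        bnd_loss = F.binary_cross_entropy_with_logits(bnd_logits, bnd_target)
        # Each term is scaled by a learned precision; the added log-variance
        # terms regularise the weights so neither loss is driven to zero.
        return (torch.exp(-self.log_var_seg) * seg_loss + self.log_var_seg
                + torch.exp(-self.log_var_bnd) * bnd_loss + self.log_var_bnd)

# Example usage with random tensors shaped like small Cityscapes crops:
criterion = BoundaryAwareLoss()
seg_logits = torch.randn(2, 19, 64, 128)                # class logits
seg_target = torch.randint(0, 19, (2, 64, 128))         # class labels
bnd_logits = torch.randn(2, 1, 64, 128)                 # boundary logits
bnd_target = (torch.rand(2, 1, 64, 128) > 0.9).float()  # boundary mask
loss = criterion(seg_logits, seg_target, bnd_logits, bnd_target)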