Scene Text Detection Using HRNet and Spatial Attention Mechanism
| Published in: | Programming and Computer Software, Vol. 49, No. 8, pp. 954-965 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Moscow: Pleiades Publishing, 1 December 2023 |
| Subjects: | |
| ISSN: | 0361-7688, 1608-3261 |
| Summary: | To better extract features from text instances of various shapes, this paper proposes a scene text detector based on the High-Resolution Network (HRNet) and a spatial attention mechanism. Specifically, we use HRNetv2-W18 as the backbone network to extract features from text instances with complex shapes. Because scene text instances are usually small, we optimize HRNet with deformable convolution and the Smooth Maximum Unit (SMU) activation function to avoid overly small feature maps, so that the network retains more detail and location information about each text instance. In addition, a Text Region Attention Module (TRAM) is added after the backbone so that the network pays more attention to text location information, and a loss function is applied to TRAM so that the network learns the features better. Experimental results show that the proposed method is competitive with state-of-the-art methods. Code is available at: https://github.com/zhangyan1005/HR-DBNet |
|---|---|
| DOI: | 10.1134/S0361768823080212 |
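
The summary mentions the Smooth Maximum Unit (SMU) activation without giving its form. As a rough illustration only, the sketch below uses the commonly cited definition of SMU as a smooth approximation of max(x, αx), in which |z| is replaced by z·erf(μz). The function names and the default values of `alpha` and `mu` are assumptions chosen for this example; they are not taken from the paper or from the authors' repository.

```python
import math


def smu(x: float, alpha: float = 0.25, mu: float = 1.0) -> float:
    """Smooth approximation of max(x, alpha * x) for alpha < 1.

    Uses max(a, b) = ((a + b) + |a - b|) / 2 and approximates |z| by
    z * erf(mu * z). The defaults for alpha and mu are illustrative only.
    """
    z = (1.0 - alpha) * x  # the difference a - b with a = x, b = alpha * x
    return ((1.0 + alpha) * x + z * math.erf(mu * z)) / 2.0


def leaky_relu(x: float, alpha: float = 0.25) -> float:
    """Non-smooth reference: the exact max(x, alpha * x)."""
    return max(x, alpha * x)


if __name__ == "__main__":
    # With a large mu the smooth version closely tracks the exact maximum;
    # with a small mu the transition around zero is more gradual.
    for value in (-2.0, -0.1, 0.0, 0.1, 2.0):
        print(value, leaky_relu(value), smu(value, mu=1.0), smu(value, mu=1e6))
```

In a network, `mu` (and possibly `alpha`) would typically be a learnable parameter applied element-wise to a tensor; the scalar version here only shows the shape of the function.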