ACR-Net: Learning High-Accuracy Optical Flow via Adaptive-Aware Correlation Recurrent Network

Detailed Bibliography
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 34, No. 10, pp. 9064–9077
Main Authors: Wang, Zixu; Zhang, Congxuan; Chen, Zhen; Hu, Weiming; Lu, Ke; Ge, Liyue; Wang, Zige
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 October 2024
ISSN: 1051-8215, 1558-2205
Description
Summary: Although recurrent network-based optical flow estimation methods have shown great success in recent years, most of these methods have difficulty handling large displacements and occlusions because the existing recurrent networks are usually restricted to coarse-resolution single-scale models while ignoring the multiscale features brought by hierarchical concepts in previous coarse-to-fine approaches. In this paper, we propose an adaptive-aware correlation recurrent network for optical flow estimation, named ACR-Net, which preserves fine motion features with a single-scale resolution recurrent framework and adaptively incorporates multiscale features at different stages to achieve high-accuracy optical flow estimation. First, our proposed self-adaptation scale-aware correlation module can incorporate the adaptive correlation of multiscale inter- and intra-motion features, which makes the features more discriminative for capturing long-range dependencies between pixels. Second, our presented adaptive-aware motion module can effectively extract the required features of different kinds of motion from multilevel correspondence. Third, our introduced cross-guide motion and fusion modules can accurately guide the propagation of reliable pixels towards unreliable pixels and dynamically determine the most suitable expression to address the occlusion challenges. Comprehensive experiments demonstrate that ACR-Net outperforms existing two-view models, striking a good balance between speed and accuracy and achieving the best performance on the MPI-Sintel final pass and KITTI-2015 test datasets. Source code is available at https://github.com/PCwenyue/ACR-Net-TCSVT.
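The abstract centers on building correlation from multiscale features rather than a single coarse resolution. As a purely illustrative sketch of that general idea (not the authors' implementation; see the linked repository for that), the following PyTorch code builds an all-pairs correlation pyramid of the kind used by recurrent flow estimators: the second frame's features are average-pooled to produce coarser levels, so each level matches over a larger displacement range. All function names and parameters here are assumptions made for illustration.

    import torch
    import torch.nn.functional as F

    def all_pairs_correlation(fmap1, fmap2):
        """Correlate every pixel of frame 1 with every pixel of frame 2.

        fmap1: (B, C, H1, W1) and fmap2: (B, C, H2, W2) feature maps.
        Returns a (B, H1*W1, H2, W2) correlation volume.
        """
        b, c, h1, w1 = fmap1.shape
        _, _, h2, w2 = fmap2.shape
        f1 = fmap1.reshape(b, c, h1 * w1)
        f2 = fmap2.reshape(b, c, h2 * w2)
        corr = torch.einsum("bci,bcj->bij", f1, f2)  # dot product per pixel pair
        corr = corr / c ** 0.5                       # scale for numerical stability
        return corr.reshape(b, h1 * w1, h2, w2)

    def correlation_pyramid(fmap1, fmap2, num_levels=4):
        """Pool frame-2 features to build coarser correlation levels.

        Since correlation is linear in fmap2, pooling the features is
        equivalent to pooling the correlation volume itself; each level
        trades match resolution for displacement range.
        """
        volumes = []
        for _ in range(num_levels):
            volumes.append(all_pairs_correlation(fmap1, fmap2))
            fmap2 = F.avg_pool2d(fmap2, kernel_size=2, stride=2)
        return volumes

    # Example usage with hypothetical 1/8-resolution feature maps:
    fmap1 = torch.randn(1, 256, 46, 62)
    fmap2 = torch.randn(1, 256, 46, 62)
    pyramid = correlation_pyramid(fmap1, fmap2)  # four volumes, finest first

A recurrent update operator would then look up local windows from each level of such a pyramid around its current flow estimate; the adaptive, scale-aware weighting of inter- and intra-motion features described in the abstract is what ACR-Net contributes on top of this generic construction.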
DOI: 10.1109/TCSVT.2024.3395636