MonOri: Orientation-Guided PnP for Monocular 3-D Object Detection
| Published in: | IEEE Transactions on Neural Networks and Learning Systems, Vol. 36, no. 10, pp. 19068-19080 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | United States: IEEE, 01.10.2025 |
| ISSN: | 2162-237X, 2162-2388 |
| Summary: | Monocular 3-D object detection is a challenging task in autonomous driving, and the field has made great progress. However, current monocular methods tend to incorporate additional information such as pseudolabels to improve performance while overlooking the geometric relationship between an object's keypoints, which results in poor detection of occluded objects. To address this issue, we find that introducing object orientation information into the 3-D detection pipeline helps improve the detection of occluded objects. This article presents MonOri, an orientation-guided perspective-n-point (PnP) method for monocular 3-D object detection that uses an object's orientation to guide keypoint optimization. To account for objects with different deformations in the scene, we design the feature aggregation detection module (FADM), which consists of the feature focus fusion module (FFFM) and the CondConv detection module (CCDM). First, the FFFM highlights signals from irregularly occluded objects and effectively models the features of elongated and small-sized objects, which enhances the model's ability to recognize such objects in complex scenes. Then, the CCDM is designed to improve the network's regression of object keypoint locations under occlusion while minimizing computational overhead. Finally, because the unoccluded portions of an occluded object are closely related to its orientation, an orientation-guided keypoint selection module (OGKSM) is proposed to improve the accuracy of keypoint position optimization and of the object's spatial location inference. Experimental results indicate that MonOri achieves competitive results; they also demonstrate that introducing orientation information into the PnP algorithm when estimating the object's spatial position mitigates the impact of occlusion, thus improving the recognition rate of occluded objects. Our code is available at https://github.com/DL-YHD/MonOri |
| DOI: | 10.1109/TNNLS.2025.3577618 |
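
The summary describes using a known object orientation to guide a PnP solve over a selected subset of keypoints. As a rough illustration only, and not the MonOri implementation, the sketch below assumes the yaw angle and box dimensions are already known, in which case recovering the translation from 2-D keypoints reduces to a linear least-squares problem. The function names (`box_corners`, `yaw_rotation`, `project`, `solve_translation`), the corner ordering, the zero-skew pinhole camera, and the boolean `mask` standing in for a keypoint selection step are all hypothetical choices made for this example; how MonOri actually selects keypoints is not stated in this record.

```python
import numpy as np


def box_corners(h, w, l):
    """Eight corners of a box centred at the origin (x: length, y: height, z: width).

    The corner ordering is an arbitrary choice for this sketch.
    """
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * l / 2.0
    y = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * h / 2.0
    z = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2.0
    return np.stack([x, y, z], axis=1)  # shape (8, 3)


def yaw_rotation(yaw):
    """Rotation about the camera y-axis by `yaw` radians."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def project(K, R, t, pts):
    """Pinhole projection of object-frame points to pixel coordinates."""
    cam = pts @ R.T + t          # object frame -> camera frame
    uv = cam @ K.T               # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]


def solve_translation(K, R, pts, uv, mask=None):
    """Least-squares translation given a known rotation (zero-skew K assumed).

    For each keypoint with rotated coordinates q = R @ p:
        fx * t_x - (u - cx) * t_z = (u - cx) * q_z - fx * q_x
        fy * t_y - (v - cy) * t_z = (v - cy) * q_z - fy * q_y
    `mask` is a hypothetical stand-in for a keypoint selection step.
    """
    if mask is None:
        mask = np.ones(len(pts), dtype=bool)
    q = pts[mask] @ R.T
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    du = uv[mask, 0] - cx
    dv = uv[mask, 1] - cy
    n = len(q)
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0] = fx
    A[0::2, 2] = -du
    b[0::2] = du * q[:, 2] - fx * q[:, 0]
    A[1::2, 1] = fy
    A[1::2, 2] = -dv
    b[1::2] = dv * q[:, 2] - fy * q[:, 1]
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t


if __name__ == "__main__":
    # Synthetic check with made-up camera intrinsics and object pose.
    K = np.array([[700.0, 0.0, 600.0], [0.0, 700.0, 180.0], [0.0, 0.0, 1.0]])
    R = yaw_rotation(0.3)
    pts = box_corners(h=1.5, w=1.6, l=3.9)
    uv = project(K, R, np.array([2.0, 1.5, 20.0]), pts)
    # Pretend only four of the eight keypoints are trusted.
    mask = np.array([True, True, False, False, True, True, False, False])
    print(solve_translation(K, R, pts, uv, mask))
```

Fixing the rotation in advance is what makes the problem linear in the translation: each selected keypoint contributes two equations, so two or more keypoints already over-determine the three unknowns, and discarding unreliable keypoints only removes rows from the system.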