Masked Autoencoder for Self-Supervised Pre-Training on Lidar Point Clouds
Saved in:
| Title: | Masked Autoencoder for Self-Supervised Pre-Training on Lidar Point Clouds |
|---|---|
| Authors: | Hess, Georg, 1996, Jaxing, Johan, Svensson, Elias, Hagerman Olzon, David, 1987, Petersson, Christoffer, 1979, Svensson, Lennart, 1976 |
| Source: | Proceedings - 2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW 2023), Waikoloa, USA, pp. 350-359 |
| Subject Terms: | 3D object detection, Self-supervised, Object detection, Voxel-MAE, Deep learning, Masked autoencoding |
| Description: | Masked autoencoding has become a successful pretraining paradigm for Transformer models for text, images, and, recently, point clouds. Raw automotive datasets are suitable candidates for self-supervised pre-training as they generally are cheap to collect compared to annotations for tasks like 3D object detection (OD). However, the development of masked autoencoders for point clouds has focused solely on synthetic and indoor data. Consequently, existing methods have tailored their representations and models toward small and dense point clouds with homogeneous point densities. In this work, we study masked autoencoding for point clouds in an automotive setting, which are sparse and for which the point density can vary drastically among objects in the same scene. To this end, we propose Voxel-MAE, a simple masked autoencoding pre-training scheme designed for voxel representations. We pre-train the backbone of a Transformer-based 3D object detector to reconstruct masked voxels and to distinguish between empty and non-empty voxels. Our method improves the 3D OD performance by 1.75 mAP points and 1.05 NDS on the challenging nuScenes dataset. Further, we show that by pre-training with Voxel-MAE, we require only 40% of the annotated data to outperform a randomly initialized equivalent. Code is available at https://github.com/georghess/voxel-mae. |
| File Description: | electronic |
| Access URL: | https://research.chalmers.se/publication/536005 https://research.chalmers.se/publication/535108 https://openaccess.thecvf.com/content/WACV2023W/Pretrain/papers/Hess_Masked_Autoencoder_for_Self-Supervised_Pre-Training_on_Lidar_Point_Clouds_WACVW_2023_paper.pdf |
| Database: | SwePub |
| DOI: | 10.1109/WACVW58289.2023.00039 |
| Language: | English |
| Published: | 2023 |
| Pages: | 350-359 (10 pages) |
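The abstract describes the core of the pre-training scheme: voxelize the sparse lidar point cloud, hide a large fraction of the non-empty voxels from the encoder, and train the model to reconstruct the masked voxels and to classify voxel occupancy. As a rough illustration of the masking step only, here is a minimal NumPy sketch; it assumes a simple uniform grid and random masking, and is not the authors' implementation (see the linked repository for that).

```python
import numpy as np

def voxelize(points, voxel_size=0.5):
    """Map each (x, y, z) point to an integer voxel coordinate on a uniform grid."""
    return np.floor(points / voxel_size).astype(np.int64)

def mask_nonempty_voxels(points, voxel_size=0.5, mask_ratio=0.7, seed=0):
    """Split the non-empty voxels of a point cloud into visible and masked sets,
    MAE-style: the encoder would see only the visible points, while the masked
    points serve as reconstruction targets (occupancy classification of empty
    voxels, also mentioned in the abstract, is omitted here for brevity)."""
    rng = np.random.default_rng(seed)
    coords = voxelize(points, voxel_size)
    # Unique non-empty voxels, plus the voxel index of every point.
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_masked = int(round(mask_ratio * len(uniq)))
    # Randomly choose which non-empty voxels to hide from the encoder.
    masked_voxel_ids = rng.permutation(len(uniq))[:n_masked]
    point_is_masked = np.isin(inverse, masked_voxel_ids)
    visible_points = points[~point_is_masked]
    masked_points = points[point_is_masked]  # reconstruction targets
    return visible_points, masked_points, uniq, masked_voxel_ids
```

The 0.7 mask ratio is a placeholder; MAE-style methods typically mask an aggressively large fraction so that reconstruction cannot succeed by local interpolation alone.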