HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos
We introduce HOT3D, a publicly available dataset for egocentric hand and object tracking in 3D. The dataset offers over 833 minutes (3.7M+ images) of recordings that feature 19 subjects interacting with 33 diverse rigid objects. In addition to simple pickup, observe, and put-down actions, the subjec...
| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 7061-7071 |
|---|---|
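| Main authors: | Banerjee, Prithviraj; Shkodrani, Sindi; Moulon, Pierre; Hampali, Shreyas; Han, Shangchen; Zhang, Fan; Zhang, Linguang; Fountain, Jade; Miller, Edward; Basol, Selen; Newcombe, Richard; Wang, Robert; Engel, Jakob Julian; Hodan, Tomas |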
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 10 June 2025 |
| Subjects: | |
| ISSN: | 1063-6919 |
| Online access: | Full text |
| Abstract | We introduce HOT3D, a publicly available dataset for egocentric hand and object tracking in 3D. The dataset offers over 833 minutes (3.7M+ images) of recordings that feature 19 subjects interacting with 33 diverse rigid objects. In addition to simple pickup, observe, and put-down actions, the subjects perform actions typical for a kitchen, office, and living room environment. The recordings include multiple synchronized data streams containing egocentric multi-view RGB/monochrome images, eye gaze signal, scene point clouds, and 3D poses of cameras, hands, and objects. The dataset is recorded with two headsets from Meta: Project Aria, which is a research prototype of AI glasses, and Quest 3, a virtual-reality headset that has shipped millions of units. Ground-truth poses were obtained by a motion-capture system using small optical markers attached to hands and objects. Hand annotations are provided in the UmeTrack and MANO formats, and objects are represented by 3D meshes with PBR materials obtained by an in-house scanner. In our experiments, we demonstrate the effectiveness of multi-view egocentric data for three popular tasks: 3D hand tracking, model-based 6DoF object pose estimation, and 3D lifting of unknown in-hand objects. The evaluated multi-view methods, whose benchmarking is uniquely enabled by HOT3D, significantly outperform their single-view counterparts. |
|---|---|
| Author | Shkodrani, Sindi; Moulon, Pierre; Fountain, Jade; Hodan, Tomas; Han, Shangchen; Wang, Robert; Zhang, Fan; Banerjee, Prithviraj; Engel, Jakob Julian; Basol, Selen; Hampali, Shreyas; Zhang, Linguang; Newcombe, Richard; Miller, Edward |
| Author_xml | 1. Banerjee, Prithviraj; 2. Shkodrani, Sindi; 3. Moulon, Pierre; 4. Hampali, Shreyas; 5. Han, Shangchen; 6. Zhang, Fan; 7. Zhang, Linguang; 8. Fountain, Jade; 9. Miller, Edward; 10. Basol, Selen; 11. Newcombe, Richard; 12. Wang, Robert; 13. Engel, Jakob Julian; 14. Hodan, Tomas (organization for all authors: Meta Reality Labs) |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CVPR52734.2025.00662 |
| Discipline | Applied Sciences |
| EISBN | 9798331543648 |
| EISSN | 1063-6919 |
| EndPage | 7071 |
| ExternalDocumentID | 11092663 |
| Genre | orig-research |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 11 |
| PublicationDate | 2025-June-10 |
| PublicationDateYYYYMMDD | 2025-06-10 |
| PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2025 |
| Publisher | IEEE |
| StartPage | 7061 |
| SubjectTerms | 3d lifting of in-hand objects; 6dof object pose; aria; Artificial intelligence; augmented reality; Benchmark testing; contextual ai; dataset; hand tracking; hand-object interaction; Hands; Headphones; in-hand object segmentation; Object tracking; Pose estimation; smart glasses; Synchronization; Three-dimensional displays; Training; Videos; virtual reality |
| Title | HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos |
| URI | https://ieeexplore.ieee.org/document/11092663 |