Fine-Grained Visual Recognition in Mobile Augmented Reality for Technical Support
Augmented Reality is increasingly explored as the new medium for two-way remote collaboration applications to guide the participants more effectively and efficiently via visual instructions. As users strive for more natural interaction and automation in augmented reality applications, new visual rec...
| Published in: | IEEE Transactions on Visualization and Computer Graphics, Vol. 26, No. 12, pp. 3514-3523 |
|---|---|
| Main authors: | Zhou, Bing; Guven, Sinem |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 December 2020 |
| Subjects: | |
| ISSN: | 1077-2626, 1941-0506 |
| Abstract | Augmented Reality is increasingly explored as the new medium for two-way remote collaboration applications to guide the participants more effectively and efficiently via visual instructions. As users strive for more natural interaction and automation in augmented reality applications, new visual recognition techniques are needed to enhance the user experience. Although simple object recognition is often used in augmented reality towards this goal, most collaboration tasks are too complex for such recognition algorithms to suffice. In this paper, we propose a fine-grained visual recognition approach for mobile augmented reality, which leverages RGB video frames and sparse depth feature points identified in real-time, as well as camera pose data to detect various visual states of an object. We demonstrate the value of our approach through a mobile application designed for hardware support, which automatically detects the state of an object to present the right set of information in the right context. |
|---|---|
| Author | Guven, Sinem; Zhou, Bing |
| Author_xml | 1. Zhou, Bing (bing.zhou@ibm.com), IBM T. J. Watson Research Center, Yorktown Heights, New York, United States; 2. Guven, Sinem (sguven@us.ibm.com), IBM T. J. Watson Research Center, Yorktown Heights, New York, United States |
| CODEN | ITVGEA |
| CitedBy_id | crossref_primary_10_1109_TVCG_2024_3456164 crossref_primary_10_1109_TVCG_2022_3203104 crossref_primary_10_1007_s11227_022_04490_8 crossref_primary_10_1109_LRA_2024_3468157 crossref_primary_10_1109_TVCG_2022_3203111 crossref_primary_10_1109_TNSE_2022_3155177 crossref_primary_10_1016_j_aei_2024_102857 crossref_primary_10_3390_computers11080124 |
| Cites_doi | 10.1109/ICCV.2017.557 10.1109/CVPR.2009.5206848 10.1109/CVPR.2016.128 10.1109/ISMAR-Adjunct.2019.00-42 10.1109/ISMAR.2010.5643554 10.1109/ICRA.2011.5980567 10.1109/ICCV.2017.74 10.1007/s11263-016-0911-8 10.1109/ICCV.2011.6126544 10.1109/CVPR.2015.7299194 10.1109/ICIAP.2003.1234043 10.1162/neco.1997.9.8.1735 10.1145/2501988.2502045 10.1109/CVPR.2016.132 10.1109/ICCV.2015.170 10.1007/978-3-319-10590-1_54 10.1109/CVPR.2016.90 10.1016/j.aei.2015.10.005 10.1109/CoASE.2014.6899343 10.1109/CVPR.2016.414 10.1007/s12008-013-0199-7 10.1109/TPAMI.2016.2577031 10.1109/ISMAR.2012.6402562 10.1109/ICCV.2015.114 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
| DOI | 10.1109/TVCG.2020.3023635 |
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present; IEEE All-Society Periodicals Package (ASPP) 1998–Present; IEEE/IET Electronic Library; CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional; MEDLINE - Academic |
| DatabaseTitle | CrossRef; Technology Research Database; Computer and Information Systems Abstracts – Academic; Electronics & Communications Abstracts; ProQuest Computer Science Collection; Computer and Information Systems Abstracts; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Professional; MEDLINE - Academic |
| DatabaseTitleList | Technology Research Database; MEDLINE - Academic |
| Database_xml | 1. IEEE/IET Electronic Library (dbid: RIE), https://ieeexplore.ieee.org/, source type: Publisher; 2. MEDLINE - Academic (dbid: 7X8), https://search.proquest.com/medline, source type: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 1941-0506 |
| EndPage | 3523 |
| ExternalDocumentID | 10_1109_TVCG_2020_3023635 9199568 |
| Genre | orig-research |
| ISICitedReferencesCount | 20 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000589217900014&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1077-2626 1941-0506 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 12 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| PMID | 32941152 |
| PQID | 2460159052 |
| PQPubID | 75741 |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2020-12-01 |
| PublicationDateYYYYMMDD | 2020-12-01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on visualization and computer graphics |
| PublicationTitleAbbrev | TVCG |
| PublicationYear | 2020 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database; Enrichment Source; Index Database; Publisher |
| StartPage | 3514 |
| SubjectTerms | Algorithms; Applications programs; Augmented reality; Cameras; Collaboration; Image recognition; Maintenance engineering; mobile; Mobile computing; Object recognition; Task complexity; Technical services; Three-dimensional displays; Visual recognition; Visualization |
| Title | Fine-Grained Visual Recognition in Mobile Augmented Reality for Technical Support |
| URI | https://ieeexplore.ieee.org/document/9199568 https://www.proquest.com/docview/2460159052 https://www.proquest.com/docview/2444384694 |
| Volume | 26 |