Fine-Grained Visual Recognition in Mobile Augmented Reality for Technical Support

Detailed Bibliography
Published in: IEEE Transactions on Visualization and Computer Graphics, Volume 26, Issue 12, pp. 3514-3523
Main authors: Zhou, Bing; Guven, Sinem
Format: Journal Article
Language: English
Published: New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.12.2020
ISSN: 1077-2626; EISSN: 1941-0506
Abstract Augmented Reality is increasingly explored as the new medium for two-way remote collaboration applications to guide the participants more effectively and efficiently via visual instructions. As users strive for more natural interaction and automation in augmented reality applications, new visual recognition techniques are needed to enhance the user experience. Although simple object recognition is often used in augmented reality towards this goal, most collaboration tasks are too complex for such recognition algorithms to suffice. In this paper, we propose a fine-grained visual recognition approach for mobile augmented reality, which leverages RGB video frames and sparse depth feature points identified in real-time, as well as camera pose data to detect various visual states of an object. We demonstrate the value of our approach through a mobile application designed for hardware support, which automatically detects the state of an object to present the right set of information in the right context.
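The abstract describes the approach only at a high level: RGB video frames, sparse depth feature points, and camera pose are combined to classify an object's visual state. The paper's actual network design is not reproduced in this record, so the following is a minimal, hypothetical late-fusion sketch in Keras; the input shapes, layer sizes, and the placeholder names NUM_STATES and MAX_POINTS are illustrative assumptions, not values from the paper.

```python
# Hypothetical late-fusion classifier over an RGB frame, camera pose, and sparse depth points.
# All shapes and sizes below are illustrative placeholders, not the paper's configuration.
from tensorflow.keras import layers, Model

NUM_STATES = 5     # placeholder: number of object visual states to distinguish
MAX_POINTS = 256   # placeholder: fixed cap on sparse depth feature points per frame

# RGB branch: a small CNN over the camera frame.
rgb_in = layers.Input(shape=(224, 224, 3), name="rgb_frame")
x = layers.Conv2D(32, 3, strides=2, activation="relu")(rgb_in)
x = layers.Conv2D(64, 3, strides=2, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Pose branch: 6-DoF camera pose (translation + rotation).
pose_in = layers.Input(shape=(6,), name="camera_pose")
p = layers.Dense(32, activation="relu")(pose_in)

# Sparse depth branch: per-point (x, y, z) coordinates, pooled to a fixed-size vector.
pts_in = layers.Input(shape=(MAX_POINTS, 3), name="sparse_depth_points")
d = layers.Dense(64, activation="relu")(pts_in)
d = layers.GlobalAveragePooling1D()(d)

# Late fusion and per-frame visual-state classification.
fused = layers.Concatenate()([x, p, d])
fused = layers.Dense(128, activation="relu")(fused)
state = layers.Dense(NUM_STATES, activation="softmax", name="visual_state")(fused)

model = Model(inputs=[rgb_in, pose_in, pts_in], outputs=state)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```

A late-fusion design like this simply pools the sparse depth points and concatenates them with pose and image features; the paper may combine these signals differently.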
Author Guven, Sinem
Zhou, Bing
Author_xml – sequence: 1
  givenname: Bing
  surname: Zhou
  fullname: Zhou, Bing
  email: bing.zhou@ibm.com
  organization: IBM T. J. Watson Research Center, Yorktown Heights, New York, United States
– sequence: 2
  givenname: Sinem
  surname: Guven
  fullname: Guven, Sinem
  email: sguven@us.ibm.com
  organization: IBM T. J. Watson Research Center, Yorktown Heights, New York, United States
CODEN ITVGEA
CitedBy_id 10.1109/TVCG.2024.3456164
10.1109/TVCG.2022.3203104
10.1007/s11227-022-04490-8
10.1109/LRA.2024.3468157
10.1109/TVCG.2022.3203111
10.1109/TNSE.2022.3155177
10.1016/j.aei.2024.102857
10.3390/computers11080124
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI 10.1109/TVCG.2020.3023635
Discipline Engineering
EISSN 1941-0506
EndPage 3523
ExternalDocumentID 10_1109_TVCG_2020_3023635
9199568
Genre orig-research
ISSN 1077-2626
1941-0506
IsPeerReviewed true
IsScholarly true
Issue 12
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PMID 32941152
PQID 2460159052
PQPubID 75741
PageCount 10
PublicationCentury 2000
PublicationDate 2020-12-01
PublicationDateYYYYMMDD 2020-12-01
PublicationDate_xml – month: 12
  year: 2020
  text: 2020-12-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on visualization and computer graphics
PublicationTitleAbbrev TVCG
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 3514
SubjectTerms Algorithms
Applications programs
Augmented reality
Cameras
Collaboration
Image recognition
Maintenance engineering
mobile
Mobile computing
Object recognition
Task complexity
Technical services
Three-dimensional displays
Visual recognition
Visualization
Title Fine-Grained Visual Recognition in Mobile Augmented Reality for Technical Support
URI https://ieeexplore.ieee.org/document/9199568
https://www.proquest.com/docview/2460159052
https://www.proquest.com/docview/2444384694
Volume 26