Modeling Variation in Human Feedback with User Inputs: An Exploratory Methodology

Bibliographic Details
Published in: 2024 19th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 303-312
Main Authors: Huang, Jindan; Aronson, Reuben M.; Short, Elaine Schaertl
Format: Conference Proceeding
Language: English
Published: ACM, 11 March 2024
Subjects:
Online Access: Full text
Abstract
To expedite the development process of interactive reinforcement learning (IntRL) algorithms, prior work often uses perfect oracles as simulated human teachers to furnish feedback signals. These oracles typically derive from ground-truth knowledge or optimal policies, providing dense and error-free feedback to a robot learner without delay. However, this machine-like feedback behavior fails to accurately represent the diverse patterns observed in human feedback, which may lead to unstable or unexpected algorithm performance in real-world human-robot interaction. To alleviate this limitation of oracles in oversimplifying user behavior, we propose a method for modeling variation in human feedback that can be applied to a standard oracle. We present a model with five dimensions of feedback variation identified in prior work. This model enables the modification of feedback outputs from perfect oracles to introduce more human-like features. We demonstrate how each model attribute can impact the learning performance of an IntRL algorithm through a simulation experiment. We also conduct a proof-of-concept study to illustrate how our model can be populated from people in two ways. The modeling results intuitively present the feedback variation among participants and help to explain the mismatch between oracles and human teachers. Overall, our method is a promising step towards refining simulated oracles by incorporating insights from real users.

CCS Concepts
* Human-centered computing → Collaborative and social computing
* Computing methodologies → Modeling and simulation
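The paper's five dimensions of feedback variation are not enumerated in this record, so the Python sketch below is purely illustrative of the general idea: wrapping a perfect oracle so that its output reaches the learner delayed, sparse, and occasionally wrong instead of dense and error-free. The class name NoisyFeedbackOracle, its parameters, and the three dimensions shown (error rate, feedback frequency, delay) are assumptions, not the model described in the paper.

import random
from collections import deque

# Illustrative sketch only: the variation dimensions below are assumptions
# for demonstration, not the five dimensions identified in the paper.
class NoisyFeedbackOracle:
    def __init__(self, oracle_fn, error_rate=0.1, feedback_prob=0.6,
                 delay_steps=1, seed=None):
        self.oracle_fn = oracle_fn          # perfect oracle: (state, action) -> +1 or -1
        self.error_rate = error_rate        # probability of flipping the feedback sign
        self.feedback_prob = feedback_prob  # probability of giving any feedback at all
        self.delay_steps = delay_steps      # number of steps feedback is withheld
        self.rng = random.Random(seed)
        self._pending = deque()             # buffer that realizes the delay

    def give_feedback(self, state, action):
        """Return a delayed, sparse, possibly erroneous signal, or None."""
        signal = self.oracle_fn(state, action)
        if self.rng.random() < self.error_rate:
            signal = -signal                # erroneous feedback
        if self.rng.random() > self.feedback_prob:
            signal = None                   # feedback withheld this step (sparsity)
        self._pending.append(signal)
        if len(self._pending) > self.delay_steps:
            return self._pending.popleft()  # released only after the delay
        return None

# Hypothetical usage with a toy oracle that rewards action 1 and punishes everything else.
perfect_oracle = lambda state, action: 1 if action == 1 else -1
teacher = NoisyFeedbackOracle(perfect_oracle, error_rate=0.2, feedback_prob=0.6,
                              delay_steps=2, seed=0)
for step in range(5):
    print(step, teacher.give_feedback(state=0, action=step % 2))

An IntRL learner would call give_feedback once per step and simply skip its update whenever the return value is None.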
Authors
– Huang, Jindan (jindan.huang@tufts.edu), Tufts University, Medford, Massachusetts, USA
– Aronson, Reuben M. (reuben.aronson@tufts.edu), Tufts University, Medford, Massachusetts, USA
– Short, Elaine Schaertl (elaine.short@tufts.edu), Tufts University, Medford, Massachusetts, USA
ContentType Conference Proceeding
DOI 10.1145/3610977.3634925
EISBN 9798400703225
EndPage 312
ExternalDocumentID 10660911
Genre orig-research
GrantInformation National Science Foundation (funder ID: 10.13039/100000001)
Language English
OpenAccessLink https://dl.acm.org/doi/pdf/10.1145/3610977.3634925
PageCount 10
PublicationDate 2024-March-11
PublicationTitle 2024 19th ACM/IEEE International Conference on Human-Robot Interaction (HRI)
PublicationTitleAbbrev HRI
PublicationYear 2024
Publisher ACM
StartPage 303
SubjectTerms Behavioral sciences
Computational modeling
human behavior modeling
Human computer interaction
human feedback
human-centered robotics
Human-robot interaction
interactive robot learning
Refining
Reinforcement learning
Social computing
Title Modeling Variation in Human Feedback with User Inputs: An Exploratory Methodology
URI https://ieeexplore.ieee.org/document/10660911