Modeling Variation in Human Feedback with User Inputs: An Exploratory Methodology
Saved in:
| Published in: | 2024 19th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 303 - 312 |
|---|---|
| Main authors: | Huang, Jindan; Aronson, Reuben M.; Short, Elaine Schaertl |
| Format: | Conference proceeding |
| Language: | English |
| Published: | ACM, 11.03.2024 |
| Subjects: | |
| Online access: | Full text |
| Abstract | To expedite the development process of interactive reinforcement learning (IntRL) algorithms, prior work often uses perfect oracles as simulated human teachers to furnish feedback signals. These oracles typically derive from ground-truth knowledge or optimal policies, providing dense and error-free feedback to a robot learner without delay. However, this machine-like feedback behavior fails to accurately represent the diverse patterns observed in human feedback, which may lead to unstable or unexpected algorithm performance in real-world human-robot interaction. To alleviate this limitation of oracles in oversimplifying user behavior, we propose a method for modeling variation in human feedback that can be applied to a standard oracle. We present a model with 5 dimensions of feedback variation identified in prior work. This model enables the modification of feedback outputs from perfect oracles to introduce more human-like features. We demonstrate how each model attribute can impact the learning performance of an IntRL algorithm through a simulation experiment. We also conduct a proof-of-concept study to illustrate how our model can be populated from people in two ways. The modeling results intuitively present the feedback variation among participants and help to explain the mismatch between oracles and human teachers. Overall, our method is a promising step towards refining simulated oracles by incorporating insights from real users. CCS Concepts: • Human-centered computing → Collaborative and social computing; • Computing methodologies → Modeling and simulation. |
|---|---|
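The abstract describes wrapping a perfect oracle so that its feedback acquires human-like variation before reaching the learner. The record does not name the paper's five dimensions, so the sketch below uses illustrative, hypothetical dimensions (error rate, silence/sparsity rate, and delivery delay) purely to show the wrapping idea; it is not the authors' actual model.

```python
import random


class VariedOracle:
    """Perturbs a perfect oracle's feedback with human-like variation.

    The dimension names here (error_rate, drop_rate, delay_steps) are
    illustrative assumptions, not the paper's actual five dimensions.
    """

    def __init__(self, oracle_fn, error_rate=0.1, drop_rate=0.3,
                 delay_steps=2, seed=0):
        self.oracle_fn = oracle_fn      # maps (state, action) -> +1 or -1
        self.error_rate = error_rate    # chance the sign is flipped
        self.drop_rate = drop_rate      # chance the teacher stays silent
        self.delay_steps = delay_steps  # feedback arrives this many calls late
        self.rng = random.Random(seed)
        self.buffer = []                # queue of pending (delayed) feedback

    def give_feedback(self, state, action):
        signal = self.oracle_fn(state, action)
        if self.rng.random() < self.drop_rate:
            signal = None               # sparse: no feedback for this step
        elif self.rng.random() < self.error_rate:
            signal = -signal            # erroneous: flipped feedback
        self.buffer.append(signal)
        if len(self.buffer) > self.delay_steps:
            return self.buffer.pop(0)   # deliver the oldest queued signal
        return None                     # nothing has arrived yet
```

With all perturbations switched off except delay, the wrapper simply echoes the oracle two steps late, which illustrates how a single dimension can be studied in isolation, as the simulation experiment in the abstract does per attribute.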
| Author | Huang, Jindan; Aronson, Reuben M.; Short, Elaine Schaertl |
| Author_xml | – sequence: 1 givenname: Jindan surname: Huang fullname: Huang, Jindan email: jindan.huang@tufts.edu organization: Tufts University,Medford,Massachusetts,USA – sequence: 2 givenname: Reuben M. surname: Aronson fullname: Aronson, Reuben M. email: reuben.aronson@tufts.edu organization: Tufts University,Medford,Massachusetts,USA – sequence: 3 givenname: Elaine Schaertl surname: Short fullname: Short, Elaine Schaertl email: elaine.short@tufts.edu organization: Tufts University,Medford,Massachusetts,USA |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3610977.3634925 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798400703225 |
| EndPage | 312 |
| ExternalDocumentID | 10660911 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Science Foundation funderid: 10.13039/100000001 |
| ISICitedReferencesCount | 4 |
| IngestDate | Tue May 06 03:31:43 EDT 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| OpenAccessLink | https://dl.acm.org/doi/pdf/10.1145/3610977.3634925 |
| PageCount | 10 |
| ParticipantIDs | ieee_primary_10660911 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-March-11 |
| PublicationDateYYYYMMDD | 2024-03-11 |
| PublicationDate_xml | – month: 03 year: 2024 text: 2024-March-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | 2024 19th ACM/IEEE International Conference on Human-Robot Interaction (HRI) |
| PublicationTitleAbbrev | HRI |
| PublicationYear | 2024 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 303 |
| SubjectTerms | Behavioral sciences; Computational modeling; human behavior modeling; Human computer interaction; human feedback; human-centered robotics; Human-robot interaction; interactive robot learning; Refining; Reinforcement learning; Social computing |
| Title | Modeling Variation in Human Feedback with User Inputs: An Exploratory Methodology |
| URI | https://ieeexplore.ieee.org/document/10660911 |