Modeling Variation in Human Feedback with User Inputs: An Exploratory Methodology
Saved in:
| Published in: | 2024 19th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 303 - 312 |
|---|---|
| Main authors: | Huang, Jindan; Aronson, Reuben M.; Short, Elaine Schaertl |
| Format: | Conference proceeding |
| Language: | English |
| Published: | ACM, 11.03.2024 |
| Subjects: | |
| Online access: | Full text |
| Abstract | To expedite the development process of interactive reinforcement learning (IntRL) algorithms, prior work often uses perfect oracles as simulated human teachers to furnish feedback signals. These oracles typically derive from ground-truth knowledge or optimal policies, providing dense and error-free feedback to a robot learner without delay. However, this machine-like feedback behavior fails to accurately represent the diverse patterns observed in human feedback, which may lead to unstable or unexpected algorithm performance in real-world human-robot interaction. To alleviate this limitation of oracles in oversimplifying user behavior, we propose a method for modeling variation in human feedback that can be applied to a standard oracle. We present a model with 5 dimensions of feedback variation identified in prior work. This model enables the modification of feedback outputs from perfect oracles to introduce more human-like features. We demonstrate how each model attribute can impact the learning performance of an IntRL algorithm through a simulation experiment. We also conduct a proof-of-concept study to illustrate how our model can be populated from people in two ways. The modeling results intuitively present the feedback variation among participants and help to explain the mismatch between oracles and human teachers. Overall, our method is a promising step towards refining simulated oracles by incorporating insights from real users. CCS Concepts: • Human-centered computing → Collaborative and social computing; • Computing methodologies → Modeling and simulation. |
|---|---|
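The abstract describes wrapping a perfect oracle so that its feedback acquires human-like variation before reaching the learner. The record does not name the paper's five dimensions, so the sketch below uses illustrative, hypothetical dimensions (error rate, silence/sparsity rate, and delivery delay) purely to show the wrapping idea; it is not the authors' actual model.

```python
import random


class VariedOracle:
    """Perturbs a perfect oracle's feedback with human-like variation.

    The dimension names here (error_rate, drop_rate, delay_steps) are
    illustrative assumptions, not the paper's actual five dimensions.
    """

    def __init__(self, oracle_fn, error_rate=0.1, drop_rate=0.3,
                 delay_steps=2, seed=0):
        self.oracle_fn = oracle_fn      # maps (state, action) -> +1 or -1
        self.error_rate = error_rate    # chance the sign is flipped
        self.drop_rate = drop_rate      # chance the teacher stays silent
        self.delay_steps = delay_steps  # feedback arrives this many calls late
        self.rng = random.Random(seed)
        self.buffer = []                # queue of pending (delayed) feedback

    def give_feedback(self, state, action):
        signal = self.oracle_fn(state, action)
        if self.rng.random() < self.drop_rate:
            signal = None               # sparse: no feedback for this step
        elif self.rng.random() < self.error_rate:
            signal = -signal            # erroneous: flipped feedback
        self.buffer.append(signal)
        if len(self.buffer) > self.delay_steps:
            return self.buffer.pop(0)   # deliver the oldest queued signal
        return None                     # nothing has arrived yet
```

With all perturbations switched off except delay, the wrapper simply echoes the oracle two steps late, which illustrates how a single dimension can be studied in isolation, as the simulation experiment in the abstract does per attribute.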
| Author | Huang, Jindan; Aronson, Reuben M.; Short, Elaine Schaertl |
| Author_xml | – sequence: 1 givenname: Jindan surname: Huang fullname: Huang, Jindan email: jindan.huang@tufts.edu organization: Tufts University,Medford,Massachusetts,USA – sequence: 2 givenname: Reuben M. surname: Aronson fullname: Aronson, Reuben M. email: reuben.aronson@tufts.edu organization: Tufts University,Medford,Massachusetts,USA – sequence: 3 givenname: Elaine Schaertl surname: Short fullname: Short, Elaine Schaertl email: elaine.short@tufts.edu organization: Tufts University,Medford,Massachusetts,USA |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3610977.3634925 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798400703225 |
| EndPage | 312 |
| ExternalDocumentID | 10660911 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Science Foundation funderid: 10.13039/100000001 |
| ISICitedReferencesCount | 4 |
| IngestDate | Tue May 06 03:31:43 EDT 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| OpenAccessLink | https://dl.acm.org/doi/pdf/10.1145/3610977.3634925 |
| PageCount | 10 |
| ParticipantIDs | ieee_primary_10660911 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-March-11 |
| PublicationDateYYYYMMDD | 2024-03-11 |
| PublicationDate_xml | – month: 03 year: 2024 text: 2024-March-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | 2024 19th ACM/IEEE International Conference on Human-Robot Interaction (HRI) |
| PublicationTitleAbbrev | HRI |
| PublicationYear | 2024 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 303 |
| SubjectTerms | Behavioral sciences; Computational modeling; human behavior modeling; Human computer interaction; human feedback; human-centered robotics; Human-robot interaction; interactive robot learning; Refining; Reinforcement learning; Social computing |
| Title | Modeling Variation in Human Feedback with User Inputs: An Exploratory Methodology |
| URI | https://ieeexplore.ieee.org/document/10660911 |