Deconstructing the human algorithms for exploration



Bibliographic Details
Published in: Cognition, Vol. 173, pp. 34-42
Author: Gershman, Samuel J.
Format: Journal Article
Language: English
Published: Netherlands: Elsevier B.V.; Elsevier Science Ltd, 01.04.2018
ISSN: 0010-0277, 1873-7838
Online Access: Full text
Description

Abstract:
• Exploration algorithms can be distinguished in terms of the bias and slope of choice functions.
• Two experiments show evidence for both directed and random exploration.
• A hybrid algorithm provides the best quantitative model of the choice data.

The dilemma between information gathering (exploration) and reward seeking (exploitation) is a fundamental problem for reinforcement learning agents. How humans resolve this dilemma is still an open question, because experiments have provided equivocal evidence about the underlying algorithms used by humans. We show that two families of algorithms can be distinguished in terms of how uncertainty affects exploration. Algorithms based on uncertainty bonuses predict a change in response bias as a function of uncertainty, whereas algorithms based on sampling predict a change in response slope. Two experiments provide evidence for both bias and slope changes, and computational modeling confirms that a hybrid model is the best quantitative account of the data.
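
To make the bias-versus-slope distinction in the abstract concrete, the following minimal Python sketch contrasts an uncertainty-bonus term, which shifts the intercept (bias) of the choice function, with a sampling-style term, which rescales value by total uncertainty and therefore changes the slope; a hybrid simply includes both terms. This is not the paper's fitted model: the probit form, the parameter names beta_v, beta_ru, beta_tu, and the numeric values are illustrative assumptions.

```python
from math import erf, sqrt

def probit(x):
    # Standard normal CDF.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def p_choose_1(value_diff, rel_unc, tot_unc,
               beta_v=1.0, beta_ru=0.0, beta_tu=0.0):
    """Illustrative probit choice rule for a two-armed bandit.

    value_diff : Q1 - Q2, difference in estimated mean rewards
    rel_unc    : sigma1 - sigma2, relative uncertainty
    tot_unc    : sqrt(sigma1**2 + sigma2**2), total uncertainty

    beta_ru acts like an uncertainty bonus (UCB-style): it shifts the
    intercept, i.e. the bias, of the choice function as uncertainty grows.
    beta_tu acts like sampling (Thompson-style): it divides value by total
    uncertainty, so uncertainty changes the slope rather than the bias.
    A hybrid model sets both beta_ru and beta_tu to nonzero values.
    """
    return probit(beta_v * value_diff
                  + beta_ru * rel_unc
                  + beta_tu * value_diff / tot_unc)

# Equal estimated values, arm 1 more uncertain: only the bonus (bias) term
# moves choice toward the uncertain arm; the sampling (slope) term does not.
print(p_choose_1(0.0, 1.0, 1.5, beta_ru=1.0))   # ~0.84, prefers uncertain arm
print(p_choose_1(0.0, 1.0, 1.5, beta_tu=1.0))   # 0.5, indifferent

# Fixed value advantage under low vs. high total uncertainty: the sampling
# term flattens the choice function (slope change) as uncertainty grows.
print(p_choose_1(1.0, 0.0, 0.5, beta_v=0.0, beta_tu=1.0))  # ~0.98, steep
print(p_choose_1(1.0, 0.0, 3.0, beta_v=0.0, beta_tu=1.0))  # ~0.63, shallow
```
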
DOI: 10.1016/j.cognition.2017.12.014