Reinforcement learning (RL), or learning from the outcomes of behavior, is a method for learning the expected values of behaviors for use in value-based decision making. Maintaining accurate expectations in a constantly changing world requires continual learning. The rate of that learning must optimally balance compensating for uncertainty about the true state of the world against irreducible uncertainty, such as sensory noise. Both humans and other mammals are known to adjust their learning rates in different environments. To quantify how well a learning rate suits an environment, or to compare learning rates between species or between observed behavior and a neurally measured learning rate, the learning rate and performance of an ideal decision maker (the optimal policy) must be known for that environment.
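This tradeoff between environmental change and sensory noise can be sketched with a one-dimensional Kalman filter, in which the learning rate (the Kalman gain) is set by the ratio of estimation uncertainty to observation noise. This is a minimal illustration of the general principle, not the model used in the dissertation; the parameter values `q` and `r` are assumptions chosen for demonstration.

```python
# Hedged sketch: a scalar Kalman filter tracking a drifting reward mean.
# The gain (learning rate) balances environmental volatility (process
# noise q) against irreducible sensory noise (observation noise r).
# Parameter values are illustrative, not taken from the dissertation.

def kalman_step(estimate, variance, observation, q, r):
    """One update of a scalar Kalman filter tracking a drifting mean."""
    variance += q                      # predict: uncertainty grows with drift
    gain = variance / (variance + r)   # learning rate set by uncertainty ratio
    estimate += gain * (observation - estimate)  # delta-rule update
    variance *= (1.0 - gain)           # posterior uncertainty shrinks
    return estimate, variance, gain

# A volatile world (large q) yields a higher steady-state learning rate
# than a stable one (small q), given the same sensory noise r.
est, var = 0.0, 1.0
for obs in [1.0, 0.8, 1.2, 0.9]:
    est, var, gain = kalman_step(est, var, obs, q=0.1, r=0.5)
```

Iterating this update to steady state makes the point quantitative: doubling the drift rate of the environment raises the ideal observer's learning rate, while increasing sensory noise lowers it.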
We develop a set of tasks with a known optimal policy and show that humans and non-human primates adjust their learning rates in the direction appropriate to the true underlying rate of change in the world. However, while non-human primates are nearly optimal, humans appear to employ cognitive strategies that degrade their ability to set an appropriate learning rate. Requiring humans to perform a simultaneous working-memory task moves their performance closer to optimal. We then develop a task, optimized for functional magnetic resonance imaging, that requires constant learning, and use it to look for evidence in humans that the behaviorally measured learning rate is reflected in the learning rate estimated from the BOLD response in dopamine target regions, a psychometric–neurometric match. We also ask whether pupil dilation, a physiological measure thought to correlate with both the tracking of uncertainty and norepinephrine activity, is correlated with estimates of behaviorally measured changes in learning rate and with estimates of the uncertainty the subject currently faces in the task. We find that while neurally measured learning rates are difficult to match to ones derived behaviorally, changes in pupil dilation reflect the tracking of uncertainty about the environment.
|Advisor:||Glimcher, Paul W.|
|Committee:||Daw, Nathaniel D., Delgado, Mauricio R., Heeger, David J., Landy, Michael S.|
|School:||New York University|
|School Location:||United States -- New York|
|Source:||DAI-B 74/04(E), Dissertation Abstracts International|
|Subjects:||Neurosciences, Economics, Cognitive psychology|
|Keywords:||Bayesian modeling, Computational neuroscience, Dopamine, fMRI, Optimal coding, Reinforcement learning|