Dissertation/Thesis Abstract

Choice and Reinforcement Learning in Dynamic Environments: Comparing Human and Animal Behavior to Neural Activity and Optimal Models
by DeWitt, Eric Edward James, Ph.D., New York University, 2012, 219; 3546392
Abstract (Summary)

Reinforcement learning (RL), or learning from the outcomes of behavior, is a method for learning the expected values of behaviors for use in value-based decision making. Maintaining accurate expectations in a constantly changing world requires continual learning. Optimally, the learning rate must balance uncertainty about the true state of the world against irreducible uncertainty, such as sensory noise. Both humans and other mammals are known to adjust their learning rates across environments. To quantify how well a learning rate suits an environment, or to compare learning rates between species or between observed behavior and a neurally measured learning rate, the learning rate and performance of an ideal decision maker (the optimal policy) must be known for that environment.
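The learning-rate trade-off described above can be sketched with a simple delta-rule learner. This is an illustrative sketch, not the tasks or models from the dissertation: the environments, noise levels, and learning rates below are assumed values chosen only to show that a high rate suits a volatile world while a low rate suits a stable, noisy one.

```python
import random

def track(true_values, alpha):
    """Delta-rule tracking, V <- V + alpha * (r - V); returns mean squared tracking error."""
    v, err = 0.0, 0.0
    for mu in true_values:
        r = mu + random.gauss(0, 1.0)   # observation corrupted by irreducible sensory noise
        err += (v - mu) ** 2            # how far the estimate lags the true value
        v += alpha * (r - v)            # update on the prediction error
    return err / len(true_values)

random.seed(0)

# Volatile world: the true value drifts on every trial.
volatile, mu = [], 0.0
for _ in range(2000):
    mu += random.gauss(0, 0.5)
    volatile.append(mu)

# Stable world: the true value never changes, so only noise must be averaged out.
stable = [1.0] * 2000

# A high learning rate should track the drifting value better;
# a low learning rate should average out noise around the fixed value.
print(track(volatile, 0.5), track(volatile, 0.05))
print(track(stable, 0.05), track(stable, 0.5))
```

Running the sketch shows the first pair ordered low-to-high and the second likewise: no single learning rate is best in both worlds, which is why an optimal learner must set its rate to the environment's true rate of change.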

I develop a set of tasks with a known optimal policy and show that both humans and non-human primates adjust their learning rates in the direction appropriate to the true underlying rate of change in the world. However, while non-human primates are nearly optimal, humans appear to employ cognitive strategies that degrade their ability to set an appropriate learning rate; requiring humans to perform a simultaneous working-memory task moves their performance closer to optimal. We then develop a task, optimized for functional magnetic resonance imaging, that requires constant learning, and look for evidence in humans that the behaviorally measured learning rate is reflected in the learning rate estimated from the BOLD response in dopamine target regions, a psychometric–neurometric match. We also ask whether pupil dilation, a physiological measure thought to correlate both with the tracking of uncertainty and with norepinephrine activity, correlates with behaviorally measured changes in learning rate and with estimates of the uncertainty the subject currently faces in the task. We find that while neurally measured learning rates are difficult to match to behaviorally derived ones, changes in pupil dilation reflect the tracking of uncertainty about the environment.

Indexing (document details)
Advisor: Glimcher, Paul W.
Committee: Daw, Nathaniel D., Delgado, Mauricio R., Heeger, David J., Landy, Michael S.
School: New York University
Department: Psychology
School Location: United States -- New York
Source: DAI-B 74/04(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Neurosciences, Economics, Cognitive psychology
Keywords: Bayesian modeling, Computational neuroscience, Dopamine, fMRI, Optimal coding, Reinforcement learning
Publication Number: 3546392
ISBN: 9781267799487