Daw, N. D., Niv, Y., & Dayan, P. D. (2006). Actions, Policies, Values, and the Basal Ganglia. In E. Bezard (Ed.), Recent Breakthroughs in Basal Ganglia Research (pp. 111–130). Nova Science Publishers Inc.
Niv, Y., Daw, N. D., & Dayan, P. D. (2006). Choice values. Nature Neuroscience (8), 987–988.
Dopaminergic neurons are thought to inform decisions by reporting errors in reward prediction. A new study reports dopaminergic responses as monkeys make choices, supporting one computational theory of appetitive learning.
Dayan, P. D., Niv, Y., Seymour, B., & Daw, N. D. (2006). The misbehavior of value and the discipline of the will. Neural Networks (8), 1153–1160.
Most reinforcement learning models of animal conditioning operate under the convenient, though fictive, assumption that Pavlovian conditioning concerns prediction learning whereas instrumental conditioning concerns action learning. However, it is only through Pavlovian responses that Pavlovian prediction learning is evident, and these responses can act against the instrumental interests of the subjects. This can be seen in both experimental and natural circumstances. In this paper we study the consequences of importing this competition into a reinforcement learning context, and demonstrate the resulting effects in an omission schedule and a maze navigation task. The misbehavior created by Pavlovian values can be quite debilitating; we discuss how it may be disciplined. ©2006 Elsevier Ltd. All rights reserved.
Niv, Y., Joel, D., & Dayan, P. D. (2006). A normative perspective on motivation. Trends in Cognitive Sciences (8), 375–381.
Understanding the effects of motivation on instrumental action selection, and specifically on its two main forms, goal-directed and habitual control, is fundamental to the study of decision making. Motivational states have been shown to 'direct' goal-directed behavior rather straightforwardly towards more valuable outcomes. However, how motivational states can influence outcome-insensitive habitual behavior is more mysterious. We adopt a normative perspective, assuming that animals seek to maximize the utilities they achieve, and viewing motivation as a mapping from outcomes to utilities. We suggest that habitual action selection can direct responding properly only in motivational states which pertained during behavioral training. However, in novel states, we propose that outcome-independent, global effects of the utilities can 'energize' habitual actions.