Publications by Year: 2018

Niv, Y. (2018). Deep down, you are a scientist. In Think tank: Forty neuroscientists explore the biological roots of human experience.
You may not know it, but deep down you are a scientist. To be precise, your brain is a scientist—and a good one, too: the kind of scientist that makes clear hypotheses, gathers data from several sources, and then reaches a well-founded conclusion. Although we are not aware of the scientific experimentation our brain carries out from moment to moment, the scientific process is fundamental to how our brain works. This process involves three key components. First: hypotheses. Our brain makes hypotheses, or predictions, all the time. The second component of good scientific work is gathering data—testing the hypothesis by comparing it to evidence. Neuroscientists gather data to test theories about how the brain works from several sources—for example, behavior, invasive recordings of the activity of single cells in the brain, and noninvasive imaging of overall activity in large areas of the brain. Finally, after making precise, well-founded predictions and gathering data from all available sources, a scientist must interpret the empirical observations. It is important to realize that perceived reality is subjective—it is an interpretation, rather than an objective image of the world out there. In some cases this interpretation can break down. In schizophrenia, for instance, meaningless events and distractors can take on outsized meaning in subjective interpretation, leading to hallucinations, delusions, and paranoia. Our memories are similarly a reflection of our own interpretations rather than a true record of events.
Rouhani, N., Norman, K. A., & Niv, Y. (2018). Dissociable effects of surprising rewards on learning and memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 44(9), 1430–1443.

Reward-prediction errors track the extent to which rewards deviate from expectations, and aid in learning. How do such errors in prediction interact with memory for the rewarding episode? Existing findings point to both cooperative and competitive interactions between learning and memory mechanisms. Here, we investigated whether learning about rewards in a high-risk context, with frequent, large prediction errors, would give rise to higher fidelity memory traces for rewarding events than learning in a low-risk context. Experiment 1 showed that recognition was better for items associated with larger absolute prediction errors during reward learning. Larger prediction errors also led to higher rates of learning about rewards. Interestingly, we did not find a relationship between learning rate for reward and recognition-memory accuracy for items, suggesting that these two effects of prediction errors were caused by separate underlying mechanisms. In Experiment 2, we replicated these results with a longer task that posed stronger memory demands and allowed for more learning. We also showed improved source and sequence memory for items within the high-risk context. In Experiment 3, we controlled for the difficulty of reward learning in the risk environments, again replicating the previous results. Moreover, this control revealed that the high-risk context enhanced item-recognition memory beyond the effect of prediction errors. In summary, our results show that prediction errors boost both episodic item memory and incremental reward learning, but the two effects are likely mediated by distinct underlying systems.
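As a concrete illustration of the incremental learning mechanism described above (a minimal sketch only, not the model actually fit in the paper), a simple delta-rule learner updates its reward expectation in proportion to the prediction error, and the unsigned error on each trial is the quantity hypothesized to relate to item memory; all names and parameter values here are illustrative assumptions:

```python
# Minimal sketch (not the authors' model): a delta-rule update showing how
# reward-prediction errors drive incremental value learning, and how the
# unsigned prediction error for each trial item could be logged for a later
# memory analysis.

def delta_rule_learning(rewards, alpha=0.3, v0=0.5):
    """Update a scalar value estimate from a sequence of rewards.

    Returns the value trajectory and the unsigned prediction error
    associated with each trial (e.g., each trial-unique item).
    """
    v = v0
    values, abs_errors = [], []
    for r in rewards:
        delta = r - v            # prediction error: actual minus expected reward
        v = v + alpha * delta    # incremental learning scaled by the learning rate
        values.append(v)
        abs_errors.append(abs(delta))  # |PE| hypothesized to relate to item memory
    return values, abs_errors

# Example: a "high-risk" context with large reward swings yields larger |PE|s.
values, abs_pe = delta_rule_learning([1.0, 0.0, 1.0, 1.0, 0.0])
```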

Sharpe, M. J., Chang, C. Y., Liu, M. A., Batchelor, H. M., Mueller, L. E., Jones, J. L., Niv, Y., et al. (2018). Dopamine transients are sufficient and necessary for acquisition of model-based associations. Nature Neuroscience, 21(10), 1493.
Learning to predict reward is thought to be driven by dopaminergic prediction errors, which reflect discrepancies between actual and expected value. Here the authors show that learning to predict neutral events is also driven by prediction errors and that such value-neutral associative learning is also likely mediated by dopaminergic error signals.
Sharpe, M. J., Stalnaker, T. A., Schuck, N. W., Killcross, S., Schoenbaum, G., & Niv, Y. (2018). An integrated model of action selection: Distinct modes of cortical control of striatal decision making. Annual Review of Psychology.
Making decisions in environments with few choice options is easy. We select the action that results in the most valued outcome. Making decisions in more complex environments, where the same action can produce different outcomes in different conditions, is much harder. In such circumstances, we propose that accurate action selection relies on top-down control from the prelimbic and orbitofrontal cortices over striatal activity through distinct thalamostriatal circuits. We suggest that the prelimbic cortex exerts direct influence over medium spiny neurons in the dorsomedial striatum to represent the state space relevant to the current environment. Conversely, the orbitofrontal cortex is argued to track a subject's position within that state space, likely through modulation of cholinergic interneurons.
Langdon, A. J., Sharpe, M. J., Schoenbaum, G., & Niv, Y. (2018). Model-based predictions for dopamine. Current Opinion in Neurobiology, 49, 1–7.
Phasic dopamine responses are thought to encode a prediction-error signal consistent with model-free reinforcement learning theories. However, a number of recent findings highlight the influence of model-based computations on dopamine responses, and suggest that dopamine prediction errors reflect more dimensions of an expected outcome than scalar reward value. Here, we review a selection of these recent results and discuss the implications and complications of model-based predictions for computational theories of dopamine and learning.
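To make the distinction above concrete, here is a minimal, hypothetical sketch (not taken from the review) contrasting a scalar model-free prediction error with an error defined over multiple outcome dimensions, so that an outcome can be surprising in identity even when its scalar value is fully predicted; all function and variable names are illustrative:

```python
# Illustrative sketch only: a scalar, model-free prediction error versus a
# vector-valued error over outcome features (e.g., value plus identity), the
# kind of richer signal discussed in the abstract above.
import numpy as np

def scalar_td_error(reward, value_estimate):
    # Classic model-free error: actual minus expected scalar value.
    return reward - value_estimate

def feature_prediction_error(outcome_features, expected_features):
    # Error defined over a vector of outcome attributes (value, identity, ...),
    # so "surprise" can occur even when scalar value is perfectly predicted.
    outcome_features = np.asarray(outcome_features, dtype=float)
    expected_features = np.asarray(expected_features, dtype=float)
    return outcome_features - expected_features

# A change in outcome identity with identical value: the scalar error is zero,
# but the feature-level error is not.
print(scalar_td_error(1.0, 1.0))                           # 0.0
print(feature_prediction_error([1.0, 0, 1], [1.0, 1, 0]))  # [ 0. -1.  1.]
```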
Hermsdorff, G. B., Pereira, T., & Niv, Y. (2018). Quantifying humans' priors over graphical representations of tasks. In Springer Proceedings in Complexity (pp. 281–290).
Some new tasks are trivial to learn while others are almost impossible; what determines how easy it is to learn an arbitrary task? Just as our prior beliefs about new visual scenes color our perception of new stimuli, our priors about the structure of new tasks shape our learning and generalization abilities [2]. While quantifying visual priors has led to major insights on how our visual system works [5,10,11], quantifying priors over tasks remains a formidable goal, as it is not even clear how to define a task [4]. Here, we focus on tasks that have a natural mapping to graphs. We develop a method to quantify humans' priors over these "task graphs", combining new modeling approaches with Markov chain Monte Carlo with people, MCMCP (a process whereby an agent learns from data generated by another agent, recursively [9]). We show that our method recovers priors more accurately than a standard MCMC sampling approach. Additionally, we propose a novel low-dimensional "smooth" parametrization of probability distributions over graphs (smooth in the sense that graphs differing by fewer edges are assigned similar probabilities) that allows for more accurate recovery of the prior and better generalization. We have also created an online experiment platform that gamifies our MCMCP algorithm and allows subjects to interactively draw the task graphs. We use this platform to collect human data on several navigation and social interaction tasks. We show that priors over these tasks have non-trivial structure, deviating significantly from null models that are insensitive to the graphical information. The priors also differ notably between the navigation and social domains, with fewer differences between cover stories within the same domain. Finally, we extend our framework to the more general case of quantifying priors over exchangeable random structures.
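For readers unfamiliar with MCMCP, the following toy simulation sketches the basic idea under simplifying assumptions (it is not the authors' implementation, and "task graphs" are reduced to labeled states): each step offers the participant a choice between the current state and a proposal, and choices made in proportion to the underlying prior yield a chain whose samples approximate that prior:

```python
# Minimal sketch of the MCMCP idea (assumptions only, not the paper's method):
# on each step a proposal is offered against the current state and a participant
# chooses; choices follow a Luce rule over an (unknown) prior, so the chain of
# visited states approximates samples from that prior. The "participant" here
# is simulated, and graphs are stand-in string labels.
import random

def simulate_mcmcp(prior, propose, start, n_steps=1000, rng=random.Random(0)):
    """Run a simulated MCMC-with-people chain.

    prior:   dict mapping a state (stand-in for a task graph) to its probability
    propose: function returning a proposal given the current state and an RNG
    """
    state, samples = start, []
    for _ in range(n_steps):
        proposal = propose(state, rng)
        p_current, p_proposal = prior[state], prior[proposal]
        # Simulated participant: picks the proposal with probability proportional
        # to its prior probability relative to the pair (Luce / Barker rule).
        if rng.random() < p_proposal / (p_current + p_proposal):
            state = proposal
        samples.append(state)
    return samples  # empirical state frequencies approximate the prior

# Toy prior over three "task graphs" and a uniform proposal distribution.
prior = {"chain": 0.5, "ring": 0.3, "star": 0.2}
samples = simulate_mcmcp(prior, lambda s, rng: rng.choice(list(prior)), "chain")
```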
Schuck, N. W., Wilson, R. C., & Niv, Y. (2018). A state representation for reinforcement learning and decision-making in the orbitofrontal cortex. In Goal-Directed Decision Making.
Despite decades of research, the exact ways in which the orbitofrontal cortex (OFC) influences cognitive function have remained mysterious. Anatomically, the OFC is characterized by remarkably broad connectivity to sensory, limbic and subcortical areas, and functional studies have implicated the OFC in a plethora of functions ranging from facial processing to value-guided choice. Notwithstanding such diversity of findings, much research suggests that one important function of the OFC is to support decision making and reinforcement learning. Here, we describe a novel theory that posits that the OFC's specific role in decision making is to provide an up-to-date representation of task-related information, called a state representation. This representation reflects a mapping between distinct task states and sensory as well as unobservable information. We summarize evidence supporting the existence of such state representations in rodent and human OFC and argue that forming these state representations provides a crucial scaffold that allows animals to efficiently perform decision making and reinforcement learning in high-dimensional and partially observable environments. Finally, we argue that our theory offers an integrating framework for linking the diversity of functions ascribed to the OFC and is in line with its wide-ranging connectivity.