Schuck, N. W., Wilson, R. C., & Niv, Y. (2018). A State Representation for Reinforcement Learning and Decision-Making in the Orbitofrontal Cortex. In Goal-Directed Decision Making.
Despite decades of research, the exact ways in which the orbitofrontal cortex (OFC) influences cognitive function have remained mysterious. Anatomically, the OFC is characterized by remarkably broad connectivity to sensory, limbic and subcortical areas, and functional studies have implicated the OFC in a plethora of functions ranging from facial processing to value-guided choice. Notwithstanding such diversity of findings, much research suggests that one important function of the OFC is to support decision making and reinforcement learning. Here, we describe a novel theory that posits that OFC's specific role in decision-making is to provide an up-to-date representation of task-related information, called a state representation. This representation reflects a mapping between distinct task states and sensory as well as unobservable information. We summarize evidence supporting the existence of such state representations in rodent and human OFC and argue that forming these state representations provides a crucial scaffold that allows animals to efficiently perform decision making and reinforcement learning in high-dimensional and partially observable environments. Finally, we argue that our theory offers an integrating framework for linking the diversity of functions ascribed to OFC and is in line with its wide ranging connectivity.
Cohen, J. D., Daw, N. D., Engelhardt, B., Hasson, U., Li, K., Niv, Y., Norman, K. A., et al. (2017). Computational approaches to fMRI analysis. Nature Neuroscience, 20(3), 304–313.
Multi-walled carbon nanotubes (MWCNT) and carbon nanofibers (CNF) were created using chemical vapor deposition at growth temperatures between 500 and 750 °C, which have increasing crystallinity with increasing growth temperature. We used Raman spectroscopy to analyze the samples. The intensity ratios compared to the G-band, and full-width at half-maximum, of all observable Raman bands in both the first and second-order region were investigated. Good match was observed for the defect related bands of the MWCNT samples and data found in the literature. Several second-order bands display a strong dependency to growth temperature. Similar growth temperature (and thus defect) dependencies were found between several first and second-order bands, which might aid in determining the physical causes of these bands. CNF show much weaker Raman features due to their low crystallinity, making them more difficult to analyse. The results of this work are used to give recommendations on how to investigate MWCNT and CNF crystallinity using Raman spectroscopy. Finally, we demonstrate that Raman spectroscopy can be used to distinguish between the MWCNT root and tip growth mechanism. © 2012 Elsevier Ltd. All rights reserved.
Gershman, S. J., Monfils, M.-H., Norman, K. A., & Niv, Y. (2017). The computational nature of memory modification. eLife, 6.
Retrieving a memory can modify its influence on subsequent behavior. We develop a computational theory of memory modification, according to which modification of a memory trace occurs through classical associative learning, but which memory trace is eligible for modification depends on a structure learning mechanism that discovers the units of association by segmenting the stream of experience into statistically distinct clusters (latent causes). New memories are formed when the structure learning mechanism infers that a new latent cause underlies current sensory observations. By the same token, old memories are modified when old and new sensory observations are inferred to have been generated by the same latent cause. We derive this framework from probabilistic principles, and present a computational implementation. Simulations demonstrate that our model can reproduce the major experimental findings from studies of memory modification in the Pavlovian conditioning literature.
DuBrow, S., Rouhani, N., Niv, Y., & Norman, K. A. (2017). Does mental context drift or shift? Current Opinion in Behavioral Sciences, 17, 141–146.
Theories of episodic memory have proposed that individual memory traces are linked together by a representation of context that drifts slowly over time. Recent data challenge the notion that contextual drift is always slow and passive. In particular, changes in one's external environment or internal model induce discontinuities in memory that are reflected in sudden changes in neural activity, suggesting that context can shift abruptly. Furthermore, context change effects are sensitive to top-down goals, suggesting that contextual drift may be an active process. These findings call for revising models of the role of context in memory, in order to account for abrupt contextual shifts and the controllable nature of context change.
Leong, Y. C., Radulescu, A., Daniel, R., DeWoskin, V., & Niv, Y. (2017). Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments. Neuron, 93(2), 451–463.
Little is known about the relationship between attention and learning during decision making. Using eye tracking and multivariate pattern analysis of fMRI data, we measured participants' dimensional attention as they performed a trial-and-error learning task in which only one of three stimulus dimensions was relevant for reward at any given time. Analysis of participants' choices revealed that attention biased both value computation during choice and value update during learning. Value signals in the ventromedial prefrontal cortex and prediction errors in the striatum were similarly biased by attention. In turn, participants' focus of attention was dynamically modulated by ongoing learning. Attentional switches across dimensions correlated with activity in a frontoparietal attention network, which showed enhanced connectivity with the ventromedial prefrontal cortex between switches. Our results suggest a bidirectional interaction between attention and learning: attention constrains learning to relevant dimensions of the environment, while we learn what to attend to via trial and error.
Sharpe, M. J., Marchant, N. J., Whitaker, L. R., Richie, C. T., Zhang, Y. J., Campbell, E. J., Koivula, P. P., et al. (2017). Lateral Hypothalamic GABAergic Neurons Encode Reward Predictions that Are Relayed to the Ventral Tegmental Area to Regulate Learning. Current Biology, 27(14), 2089–2100.e5.
Eating is a learned process. Our desires for specific foods arise through experience. Both electrical stimulation and optogenetic studies have shown that increased activity in the lateral hypothalamus (LH) promotes feeding. Current dogma is that these effects reflect a role for LH neurons in the control of the core motivation to feed, and their activity comes under control of forebrain regions to elicit learned food-motivated behaviors. However, these effects could also reflect the storage of associative information about the cues leading to food in LH itself. Here, we present data from several studies that are consistent with a role for LH in learning. In the first experiment, we use a novel GAD-Cre rat to show that optogenetic inhibition of LH γ-aminobutyric acid (GABA) neurons restricted to cue presentation disrupts the rats' ability to learn that a cue predicts food without affecting subsequent food consumption. In the second experiment, we show that this manipulation also disrupts the ability of a cue to promote food seeking after learning. Finally, we show that inhibition of the terminals of the LH GABA neurons in ventral tegmental area (VTA) facilitates learning about reward-paired cues. These results suggest that the LH GABA neurons are critical for storing and later disseminating information about reward-predictive cues.
Auchter, A., Cormack, L. K., Niv, Y., Gonzalez-Lima, F., & Monfils, M.-H. (2017). Reconsolidation-Extinction Interactions in Fear Memory Attenuation: The Role of Inter-Trial Interval Variability. Frontiers in Behavioral Neuroscience, 11.
Most of life is extinct, so incorporating some fossil evidence into analyses of macroevolution is typically seen as necessary to understand the diversification of life and patterns of morphological evolution. Here we test the effects of inclusion of fossils in a study of the body size evolution of afrotherian mammals, a clade that includes the elephants, sea cows and elephant shrews. We find that the inclusion of fossil tips has little impact on analyses of body mass evolution; from a small ancestral size (approx. 100 g), there is a shift in rate and an increase in mass leading to the larger-bodied Paenungulata and Tubulidentata, regardless of whether fossils are included or excluded from analyses. For Afrotheria, the inclusion of fossils and morphological character data affect phylogenetic topology, but these differences have little impact upon patterns of body mass evolution and these body mass evolutionary patterns are consistent with the fossil record. The largest differences between our analyses result from the evolutionary model, not the addition of fossils. For some clades, extant-only analyses may be reliable to reconstruct body mass evolution, but the addition of fossils and careful model selection is likely to increase confidence and accuracy of reconstructed macroevolutionary patterns.
Eldar, E., Cohen, J. D., & Niv, Y. (2016). Amplified selectivity in cognitive processing implements the neural gain model of norepinephrine function. Behavioral and Brain Sciences, 39, e206.
Previous work has suggested that an interaction between local selective (e.g., glutamatergic) excitation and global gain modulation (via norepinephrine) amplifies selectivity in information processing. Mather et al. extend this existing theory by suggesting that localized gain modulation may further mediate this effect – an interesting prospect that invites new theoretical and experimental work.
Cai, M. B., & Schuck, N. W. (2016). A Bayesian method for reducing bias in neural representational similarity analysis. In D. D. Lee, U. V. Luxburg, I. Guyon, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 29 (pp. 4952–4960). Curran Associates, Inc.
Kurth-Nelson, Z., O'Doherty, J. P., Barch, D. M., Denève, S., Durstewitz, D., Frank, M. J., Gordon, J. A., et al. (2016). Computational Approaches for Studying Mechanisms of Psychiatric Disorders. In Computational Psychiatry. The MIT Press.
Vast spectra of biological and psychological processes are potentially involved in the mechanisms of psychiatric illness. Computational neuroscience brings a diverse toolkit to bear on understanding these processes. This chapter begins by organizing the many ways in which computational neuroscience may provide insight to the mechanisms of psychiatric illness. It then contextualizes the quest for deep mechanistic understanding through the perspective that even partial or nonmechanistic understanding can be applied productively. Finally, it questions the standards by which these approaches...
Eldar, E., Niv, Y., & Cohen, J. D. (2016). Do You See the Forest or the Tree? Neural Gain and Breadth Versus Focus in Perceptual Processing. Psychological Science, 27(12), 1632–1643.
When perceiving rich sensory information, some people may integrate its various aspects, whereas other people may selectively focus on its most salient aspects. We propose that neural gain modulates the trade-off between breadth and selectivity, such that high gain focuses perception on those aspects of the information that have the strongest, most immediate influence, whereas low gain allows broader integration of different aspects. We illustrate our hypothesis using a neural-network model of ambiguous-letter perception. We then report an experiment demonstrating that, as predicted by the model, pupil-diameter indices of higher gain are associated with letter perception that is more selectively focused on the letter's shape or, if primed, its semantic content. Finally, we report a recognition-memory experiment showing that the relationship between gain and selective processing also applies when the influence of different stimulus features is voluntarily modulated by task demands.
Arkadir, D., Radulescu, A., Raymond, D., Lubarr, N., Bressman, S. B., Mazzoni, P., & Niv, Y. (2016). DYT1 dystonia increases risk taking in humans. eLife, 5.
It has been difficult to link synaptic modification to overt behavioral changes. Rodent models of DYT1 dystonia, a motor disorder caused by a single gene mutation, demonstrate increased long-term potentiation and decreased long-term depression in corticostriatal synapses. Computationally, such asymmetric learning predicts risk taking in probabilistic tasks. Here we demonstrate abnormal risk taking in DYT1 dystonia patients, which is correlated with disease severity, thereby supporting striatal plasticity in shaping choice behavior in humans.
Radulescu, A., Daniel, R., & Niv, Y. (2016). The effects of aging on the interaction between reinforcement learning and attention. Psychology and Aging, 31(7), 747–757.
Predicting the binding mode of flexible polypeptides to proteins is an important task that falls outside the domain of applicability of most small molecule and protein–protein docking tools. Here, we test the small molecule flexible ligand docking program Glide on a set of 19 non-α-helical peptides and systematically improve pose prediction accuracy by enhancing Glide sampling for flexible polypeptides. In addition, scoring of the poses was improved by post-processing with physics-based implicit solvent MM-GBSA calculations. Using the best RMSD among the top 10 scoring poses as a metric, the success rate (RMSD ≤ 2.0 Å for the interface backbone atoms) increased from 21% with default Glide SP settings to 58% with the enhanced peptide sampling and scoring protocol in the case of redocking to the native protein structure. This approaches the accuracy of the recently developed Rosetta FlexPepDock method (63% success for these 19 peptides) while being over 100 times faster. Cross-docking was performed for a subset of cases where an unbound receptor structure was available, and in that case, 40% of peptides were docked successfully. We analyze the results and find that the optimized polypeptide protocol is most accurate for extended peptides of limited size and number of formal charges, defining a domain of applicability for this approach.
Schuck, N. W., Cai, M. B., Wilson, R. C., & Niv, Y. (2016). Human Orbitofrontal Cortex Represents a Cognitive Map of State Space. Neuron, 91(6), 1402–1412.
Although the orbitofrontal cortex (OFC) has been studied intensely for decades, its precise functions have remained elusive. We recently hypothesized that the OFC contains a “cognitive map” of task space in which the current state of the task is represented, and this representation is especially critical for behavior when states are unobservable from sensory input. To test this idea, we apply pattern-classification techniques to neuroimaging data from humans performing a decision-making task with 16 states. We show that unobservable task states can be decoded from activity in OFC, and decoding accuracy is related to task performance and the occurrence of individual behavioral errors. Moreover, similarity between the neural representations of consecutive states correlates with behavioral accuracy in corresponding state transitions. These results support the idea that OFC represents a cognitive map of task space and establish the feasibility of decoding state representations in humans using non-invasive neuroimaging.
Eldar, E., Rutledge, R. B., Dolan, R. J., & Niv, Y. (2016). Mood as Representation of Momentum. Trends in Cognitive Sciences, 20(1), 15–24.
Experiences affect mood, which in turn affects subsequent experiences. Recent studies suggest two specific principles. First, mood depends on how recent reward outcomes differ from expectations. Second, mood biases the way we perceive outcomes (e.g., rewards), and this bias affects learning about those outcomes. We propose that this two-way interaction serves to mitigate inefficiencies in the application of reinforcement learning to real-world problems. Specifically, we propose that mood represents the overall momentum of recent outcomes, and its biasing influence on the perception of outcomes 'corrects' learning to account for environmental dependencies. We describe potential dysfunctions of this adaptive mechanism that might contribute to the symptoms of mood disorders. ©2015 The Authors.
Chan, S. C. Y., Niv, Y., & Norman, K. A. (2016). A probability distribution over latent causes, in the orbitofrontal cortex. Journal of Neuroscience, 36(30), 7817–7828.
The orbitofrontal cortex (OFC) has been implicated in both the representation of "state," in studies of reinforcement learning and decision making, and also in the representation of "schemas," in studies of episodic memory. Both of these cognitive constructs require a similar inference about the underlying situation or "latent cause" that generates our observations at any given time. The statistically optimal solution to this inference problem is to use Bayes' rule to compute a posterior probability distribution over latent causes. To test whether such a posterior probability distribution is represented in the OFC, we tasked human participants with inferring a probability distribution over four possible latent causes, based on their observations. Using fMRI pattern similarity analyses, we found that BOLD activity in the OFC is best explained as representing the (log-transformed) posterior distribution over latent causes. Furthermore, this pattern explained OFC activity better than other task-relevant alternatives, such as the most probable latent cause, the most recent observation, or the uncertainty over latent causes. ©2016 the authors.
Niv, Y., & Langdon, A. J. (2016). Reinforcement learning with Marr. Current Opinion in Behavioral Sciences, 11, 67–73.
To many, the poster child for David Marr's famous three levels of scientific inquiry is reinforcement learning – a computational theory of reward optimization, which readily prescribes algorithmic solutions that evidence striking resemblance to signals found in the brain, suggesting a straightforward neural implementation. Here we review questions that remain open at each level of analysis, concluding that the path forward to their resolution calls for inspiration across levels, rather than a focus on mutual constraints.
Takahashi, Y. K., Langdon, A. J., Niv, Y., & Schoenbaum, G. (2016). Temporal Specificity of Reward Prediction Errors Signaled by Putative Dopamine Neurons in Rat VTA Depends on Ventral Striatum. Neuron, 91(1), 182–193.
Dopamine neurons signal reward prediction errors. This requires accurate reward predictions. It has been suggested that the ventral striatum provides these predictions. Here we tested this hypothesis by recording from putative dopamine neurons in the VTA of rats performing a task in which prediction errors were induced by shifting reward timing or number. In controls, the neurons exhibited error signals in response to both manipulations. However, dopamine neurons in rats with ipsilateral ventral striatal lesions exhibited errors only to changes in number and failed to respond to changes in timing of reward. These results, supported by computational modeling, indicate that predictions about the temporal specificity and the number of expected reward are dissociable and that dopaminergic prediction-error signals rely on the ventral striatum for the former but not the latter.
Gershman, S. J., Norman, K. A., & Niv, Y. (2015). Discovering latent causes in reinforcement learning. Current Opinion in Behavioral Sciences, 5, 43–50.
Effective reinforcement learning hinges on having an appropriate state representation. But where does this representation come from? We argue that the brain discovers state representations by trying to infer the latent causal structure of the task at hand, and assigning each latent cause to a separate state. In this paper, we review several implications of this latent cause framework, with a focus on Pavlovian conditioning. The framework suggests that conditioning is not the acquisition of associations between cues and outcomes, but rather the acquisition of associations between latent causes and observable stimuli. A latent cause interpretation of conditioning enables us to begin answering questions that have frustrated classical theories: Why do extinguished responses sometimes return? Why do stimuli presented in compound sometimes summate and sometimes do not? Beyond conditioning, the principles of latent causal inference may provide a general theory of structure learning across cognitive domains.
Niv, Y., Langdon, A. J., & Radulescu, A. (2015). A free-choice premium in the basal ganglia. Trends in Cognitive Sciences, 19(1), 4–5.
Apparently, the act of free choice confers value: when selecting between an item that you had previously chosen and an identical item that you had been forced to take, the former is often preferred. What could be the neural underpinnings of this free-choice bias in decision making? An elegant study recently published in Neuron suggests that enhanced reward learning in the basal ganglia may be the culprit.