## Introduction

Understanding the main function of an organ is important both for understanding the syndromes it engenders and for treating them. For instance, a theoretically grounded understanding of the heart’s pumping function allows us to relate shortness of breath to altered pressure gradients. The distinctive function of the brain is computation. The brain processes information, and alters how it processes information as a function of the information it has processed in the past (‘learning’). That is, the brain uses information as the currency to make models of the world in order to maximize short- and long-term adaptation to the environment. As such, using computation to turn information into models and to extract information from models is the quintessential function that needs to be understood. The basic premise of computational psychiatry is that alterations in the computations the brain performs can lead to its malfunction, i.e., to mental illness. In fact, this view suggests that computational ‘errors’ can lead to illness in the absence of any other ‘neural’ problems, and can even lead to illness as a function of past computations or processed information.

Computational psychiatry views illnesses and symptoms through a computational lens. As an example, consider perceptual disturbances. Perception depends strongly on the disambiguation of ambiguous and noisy sensory information through the integration with other information previously acquired. This integration process can be formalized as a probabilistic inference process. Doing so allows the perceptual disturbances to be characterized and linked to specific underlying processes, and thereby also to the underlying biology.

Prior to proceeding, we emphasize that illnesses are complex phenomena defying simplistic etiological or mechanistic accounts [1]. They are likely pluralistic and multi-causal involving multiple levels [2]. Indeed, research has identified contributions to the syndromes we identify as disorders arising at different levels from genetics to neural circuits, psychological processes, and social or societal factors. From a broad computational view, illness arises when a mismatch occurs between the brain’s computational ability and the environmental or situational demands placed upon it. For instance, alterations in learning from positive and negative decision outcomes, due to an imbalance in corticostriatal dopaminergic function, can cause either impulsivity such as pathological gambling [3], or tenacity in the face of frequent setbacks. Whether the imbalance results in a feature or a problem depends on background factors such as which goals one has in the first place (i.e., what counts as positive or negative: the reward function), and the statistics of rewards and losses.

Computational investigations are often subdivided into three levels [4]. At the most conceptual level, a computational understanding answers questions about what problem the system solves. For instance, what is the problem in finding actions which are good in the longer term? And precisely why should this be done? At a more concrete level, computational models are of algorithmic nature and describe what computations can be used to achieve a particular goal. Finally, computational models can concern the implementation of algorithms. These three levels are in principle independent. However, the strength of computational modeling is that it allows the connections across these levels to be made – and may even be necessary for doing so. More generally, explanatory accounts of psychiatric disorders need to integrate across biological, psychological and social-environmental domains with inherent many to many relationships [5]. The argument proposed here is that a broad computational approach is useful – and maybe even necessary – to do this in a quantitative manner.

The explanatory models we focus on here are often ‘generative' [6,7,8,9], meaning that the models can be run on the experiment that individuals were subjected to and generate comparable data. This has the advantage that the explanatory scope of models can be rigorously tested by comparing data generated by the model with observed experimental data. Furthermore, they do so through the latent manipulation of variables that capture computational processes. As such, these modeling approaches are a way to quantitatively test complex but detailed hypotheses about mental processes (c.f. [10,11,12]). This is in contrast to descriptive models, which may describe the statistical properties of data correctly, but whose internal machinations are less directly interpretable or informative about the underlying mechanisms.

Both theory- and data-driven computational approaches to psychiatric illness are developing rapidly. While previous snapshot summaries of the area exist [8, 13,14,15,16,17,18,19,20,21,22], the rate of progress and the number of publications in the field (Fig. 1) mean both that the state of the art rapidly moves past these, and that even a substantive review such as this one cannot cover all the work.

The contributions in this volume highlight many facets of data-driven evidence-based and empirical computational work. These are rapidly advancing the field and allowing researchers and clinicians to deal with – and put to good use – the deluge of data facing them. In the present contribution, we will summarize what we view as the most important recent advances in theory-driven computational work relevant to psychiatry [23]. Our aim is two-fold: first, to illustrate, through examples, the usefulness of theory-driven computational approaches for understanding mechanisms in psychiatric illnesses. Second, to provide an updated snapshot of the field. We structure the paper in terms of the class of computational technique used. Briefly, the brain is a dynamical system, and has to solve two fundamental problems: it has to deal with (irreducible) uncertainty, and it has to exert control to survive. As such, we start with dynamical systems, and then turn to Bayesian inference and reinforcement learning. We begin each section with a (very!) brief intuitive summary of that technique and then highlight the important work completed using it over the last few years. We end the review with a summary and synthesis of the progress made, the challenges the field faces and the key next steps.

We note that both inference and learning can be seen as special instances of dynamical systems [24, 25], while in certain situations learning and inference are two faces of the same coin [26, 27]. As such, there are deep mathematical connections between dynamical systems, learning and inference.

## Dynamical models

Mental illness can be conceptualized as a dynamic process, i.e., a state of being that changes over time. Dynamical models focus on the rules that govern the changes of states over time, the response to environmental input and the emerging consequences from these rules. The dynamics involve an interaction of multiple factors, and these interactions can unfold in surprising and complex ways over time. For instance, the textbook view of panic attacks involves a positive feedback cycle, where interoceptive signals such as palpitations augment anxiety, which in turn increases arousal and ventilatory rate resulting in increased interoceptive signals. A positive feedback cycle is one particular dynamical phenomenon which can be displayed by biological, physical, societal and other dynamical systems. The field of dynamical systems theory is concerned with the mathematical characterization of dynamical systems, which are sets of differential equations (or difference equations if in discrete time) describing how variables interact with and influence each other over time (Fig. 2). Such equations can give rise to numerous dynamical phenomena such as attractors, oscillations, phase transitions and even chaos ([28]; see Box 1).
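To make the notion of an attractor concrete, the following minimal sketch integrates a textbook one-variable system, dx/dt = x - x^3, with a simple Euler scheme. This system is an illustrative assumption chosen for its simplicity and is not a model taken from the work reviewed here; the function name `simulate` and the step size are likewise hypothetical choices.

```python
def simulate(x0, dt=0.01, steps=2000):
    """Euler-integrate the toy system dx/dt = x - x**3 from initial state x0."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x**3)
    return x


# Small differences in the starting state decide which attractor is reached.
final_states = {x0: simulate(x0) for x0 in (-2.0, -0.1, 0.1, 2.0)}
for x0, x_end in final_states.items():
    print(f"start {x0:+.1f} -> end {x_end:+.3f}")
```

In this toy system, trajectories starting on either side of zero settle into one of two stable fixed points, while zero itself is an unstable fixed point: a state that persists only in the absence of any perturbation.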

An important insight arising from the study of dynamical systems is that the observed behavior of a system is often independent of the nature of the components involved (Boxes 2 and 3). That is, the same dynamical phenomena can be observed at many different levels and can describe populations of neurons [29], symptoms within an individual [30, 31], and interactions between individuals or groups of individuals [32]. Independently of the nature of the unit, the relative behavior of the units will be determined by the dynamical parameters, and they will display phenomena such as attractor states, oscillations and other more complex dynamical phenomena. Furthermore, and maybe most importantly, the behavior of a dynamical system will often determine the overall trajectory of the system while being completely at odds with the behavior of the individual components which make up the system.

Hence, the assertion that mental illnesses are dynamic [33,34,35] has profound and potentially far-reaching consequences: it may be impossible to understand the evolution of psychiatric symptoms without understanding the complex interactions joining them together. A consequence of this complexity is that the effect of interventions which alter the state of some components of the system, such as pharmacotherapy or psychotherapy, will depend on the current global state of that system. In other words, as frequently observed in clinical practice, the same treatment administered to the same patient at different times will produce different, and often counterintuitive, effects if the state of the patient has changed. For example, focusing attention on one’s breathing may be helpful when calm, but may exacerbate a panic attack when attempted during the attack. Reflecting their general nature, dynamical systems approaches have provided numerous insights into phenomena ranging from intracellular to societal scales.

### Linking cellular to cognitive and circuit processes via dynamical systems

Applications of the dynamical systems approach at the circuit level have combined algorithmic and biophysical components, and thereby allowed an understanding of how alterations at the cellular or subcellular level impact the ability of circuits to perform certain functions. Particular attention has been paid to so-called attractor dynamics. One type of attractor is a stable point, where the activity of each unit remains approximately constant over time and returns to this activation level if slightly perturbed. Such stable attractors have been extensively studied as network models for persistent neural activities in working memory [36,37,38], including for their ability to maintain a continuous quantity, e.g., a spatial location [39]. Because the stability of the overall network activity pattern depends on the dynamic properties of the units and how they interact, such models can be used to examine how properties at the cellular level such as dopamine [40], serotonin [41,42,43] or NMDA receptor function [44, 45] affect the dynamics, and in turn how they affect the ability of the network to retain information.

Indeed, detailed predictions from such a model were shown to capture working memory sensitivity to distractors in schizophrenia [46], while also accounting for the effects of ketamine [45]. Briefly, in this model NMDA receptors on interneurons affect the extent to which neurons inhibit their neighbors. This in turn affects the profile of the stable attractor ‘bump’ in the network. A reduction in the efficacy of the NMDA receptor leads to a broadening and reduced stability of the bump, and thereby to an increase in sensitivity to distractors (Fig. 3a). Critically, both the broadening and the increase in distractibility could be demonstrated empirically and shown to be correlated (Fig. 3b; [46]). Indeed, direct recordings of frontal neural assemblies in two animal models of schizophrenia - chronic ketamine administration and a 22q11.2 deletion model - show direct evidence of impaired attractor stability [47], and a related model has recently been shown to explain disruptions in serial dependencies in working memory in schizophrenia and NMDA receptor encephalitis [48]. The notion of a change in attractor properties in schizophrenia has also been proposed in algorithmic models of decision-making [49], which have also suggested specific relationships to positive and negative symptoms [50]. Circuit-level attractor dynamics can be put to various uses for computational purposes, most classically for pattern completion and memory recollection [51, 52], but also for Bayesian inference [53, 54], multisensory integration [55] and decision-making more generally [56]. As such, these computational models allow a number of higher-level cognitive functions to be related to mechanistic details at the cellular level, and thus afford an understanding of various psychiatric disturbances [21].

Taking a step back, attractor dynamics are one type of computation dependent on the broader principle of balanced excitation and inhibition (E/I). Alterations to E/I balance have been suggested in a number of other illnesses, in particular also in Autism Spectrum Disorders (ASD) [57]. Computational models of E/I imbalance in ASD have focused on a feature of local circuitry called divisive normalization. This is a very widespread computation important for gain adaptation in visual and auditory primary cortices [58, 59], and alterations in models of divisive normalization can account for alterations in visual (c.f. Figure 3) and auditory perception and possibly also higher cognitive functions in autism [60,61,62,63,64]. Divisive normalization has also been suggested as one way of implementing marginalization in neural circuits [65]. Marginalization is a key step in probabilistic inference algorithms (e.g., belief propagation, c.f. Box 1), and this provides one link for how alterations in local E/I balance could have pervasive impacts on many different cognitive functions, and particularly so on functions requiring appropriately dealing with unknown latent variables such as the intentions of others.

At a larger scale, one notable application of dynamical models to interactions between brain areas capitalized on the fact that the dynamical properties of a system depend on the interaction between its components, and therefore alterations in one component can be counteracted by alterations in another [35, 43]. To understand how serotonergic medication might alleviate glutamatergic deficits, [66] built a spiking network model in which cognitive dorsolateral prefrontal cortical (dlPFC) and affective ventral anterior cingulate (vACC) areas had reciprocal inhibitory interactions. Glutamatergic deficits in depression were modeled as a less efficient glutamate clearance, leading to a situation where the affective vACC was hyperactive and impaired the cognitive dlPFC performance through its excitatory projections to dlPFC inhibitory interneurones. Serotonergic medication in the model was effective at treating this by hyperpolarizing the excitatory vACC cells via 5HT-1A receptors.

The dynamical systems applications to psychiatry reviewed so far are relatively detailed, and have mostly been used to give qualitative accounts of experimentally observed phenomena in the sense that they have not (with very few exceptions [67, 68]) directly been fitted to experimental data. As such their strength lies in their ability to qualitatively link biophysical and cellular details to higher-level phenomena.

### Dynamical systems for causality, prediction and control

A different approach has been to fit dynamical systems to time-series data relevant to mental health. There is a very rich literature concerned with efficient identification of dynamical features of systems from multivariate time-series data. Applications include using such dynamical systems models to understand how variables interact; to characterize the overall dynamical characteristics of the system; and to investigate potential interventions, i.e., to identify how to control the system under study.

Probably the most common approach is in fitting dynamical systems to neural data using autoregressive models to estimate Granger causality, or dynamic causal models (DCM; [69]). These have been used extensively over the past two decades to examine interactions between neural substrates and their breakdown in mental illness. They are distinct from functional connectivity approaches which focus on the correlational structure, in that they imply an underlying model of how the data come about, and allow such models to be explicitly tested [6, 69, 70]. DCM connectivity estimates, for instance, have suggested that the absence of illusions such as the hollow-mask illusion in patients with schizophrenia is due to a reduction in the influence of top-down frontal projections [71, 72].

A very promising direction is the combination of the parameters resulting from such fits with machine-learning techniques for classification or predictive purposes [9, 73, 74]. [75] recently applied this to the relationship between fMRI BOLD responses to emotional faces and the long-term course of depressive disorders. Not only did they find that the connectivity estimates differentiated patients with a good and a poor longitudinal course; because DCM involves the fitting of an interpretable dynamical system, they were also able to investigate parameters of the fits and point to aberrant (reduced) modulation by emotions of the connections within and between the amygdala and face perception areas (fusiform and occipital face areas). One major limitation of DCM models has traditionally been the need to fit the model to a few selected areas. As such, statements about the causality of interactions were subject to confounds due to other areas not included in the analyses. A whole-brain approach has recently been developed [76, 77] which will help to address this. Although it involves important approximations, this promises to bring the same whole-brain causal connectivity approach to fMRI that autoregressive models and Granger causality have previously brought to EEG and MEG [70]. Interesting newer approaches include the examination of controllability in brain networks [78,79,80], and the use of nonlinear dynamical systems to directly characterize more complex dynamical modes of brain activity [81, 82].

Dynamical systems have also been applied to the analysis of self-report time-series data [83]. Quantitative approaches to within-subject longitudinal data are important for several reasons. First, the temporal course of psychiatric symptoms is an important window for examining the dynamics of mental health disorders. Second, there are fundamental limitations on the ability to generalize from cross-sectional findings to within-subject causes [84, 85]. Third, such an approach implies that symptoms can directly influence each other, or at least that symptoms are indicators of processes that can interact directly, and that they do so independently from any underlying disease process [86, 87]. For instance, sleep disturbances and fatigue are both symptoms of depression in both DSM and ICD [88, 89]. However, it is eminently obvious that sleep disturbances can directly cause daytime fatigue independently of the presence of any depressive disorder. Furthermore, empirical estimates of comorbidity patterns between categorical diagnoses closely track the overlap in symptoms ([86] c.f. [90]).

These observations suggest that patterns of symptoms may inherently stabilize each other, and that hence interactions between symptoms may contribute both to the emergence and stabilization of mental illnesses (e.g., [30, 31]). Indeed, in depression the transition between episodes of wellness and illness shows features of so-called critical slowing-down [91], which is seen when a fixed point becomes unstable and the system transitions into a different stable fixed point. Furthermore, a strong coherence between different symptoms is predictive of a more chronic long-term course of depression [92], possibly because the symptoms maintain each other and stabilize the overall syndrome.
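Critical slowing-down can be illustrated with a deliberately simple sketch, assuming a one-dimensional linear system x_{t+1} = a·x_t whose fixed point at zero loses stability as the parameter a approaches 1. The system and the helper name `recovery_steps` are illustrative assumptions, not the analysis used in the cited work.

```python
def recovery_steps(a, x0=1.0, threshold=0.1):
    """Count steps until x_{t+1} = a * x_t has decayed below threshold * |x0|.

    The parameter a (0 < a < 1) sets how strongly the system is pulled
    back to its fixed point at zero after a perturbation of size x0.
    """
    x, steps = x0, 0
    while abs(x) > threshold * abs(x0):
        x *= a
        steps += 1
    return steps


# Recovery from the same perturbation takes longer as a approaches 1,
# i.e., as the fixed point gets closer to losing its stability.
for a in (0.5, 0.9, 0.99):
    print(f"a = {a}: recovered after {recovery_steps(a)} steps")
```

In the noisy version of this system (a first-order autoregressive process), the same parameter a equals the lag-1 autocorrelation, which is one reason why rising autocorrelation in a time series is commonly used as an indicator of slowing-down.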

Different types of models have since been applied to self-report time series data, ranging from autoregressive models [93, 94] to linear dynamical systems [95] and highly nonlinear dynamical systems such as Ising models [96, 97] and most recently recurrent neural networks [82]. The relative merits of these approaches are beyond the scope of this review, and there are important questions surrounding the reliability and value of complex measures of such self-report time-series [98]. Nevertheless, a very interesting application of such idiographic research is naturally psychotherapy [99]. For instance, the controllability of a linear dynamical system can be quantified and captures how costly it is to move the system into any target state. Furthermore, such systems can be studied to ask questions such as which symptom is most important in that its alteration would have the strongest desirable impact [100].
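The idea of controllability can be sketched for a hypothetical two-symptom linear system x_{t+1} = A x_t + B u_t, in which sleep problems drive fatigue but not vice versa. The matrices, the horizon and the helper names below are illustrative assumptions rather than estimates from any of the cited studies. The sketch uses the standard finite-horizon controllability Gramian W, for which the minimum input energy needed to reach a target state x* from rest is x*ᵀ W⁻¹ x*.

```python
import math


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def gramian(A, B, horizon):
    """Finite-horizon controllability Gramian: sum_k A^k B B^T (A^T)^k."""
    n = len(A)
    W = [[0.0] * n for _ in range(n)]
    Ak = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0
    for _ in range(horizon):
        AkB = matmul(Ak, B)
        term = matmul(AkB, transpose(AkB))
        W = [[W[i][j] + term[i][j] for j in range(n)] for i in range(n)]
        Ak = matmul(A, Ak)
    return W


def min_energy(W, target):
    """Return target^T W^{-1} target for a 2x2 Gramian (inf if singular)."""
    (a, b), (c, d) = W
    det = a * d - b * c
    if abs(det) < 1e-12:
        return math.inf  # some target states cannot be reached at all
    x, y = target
    return (x * (d * x - b * y) + y * (-c * x + a * y)) / det


# Hypothetical dynamics: state = [sleep problems, fatigue]; sleep drives fatigue.
A = [[0.5, 0.0],
     [0.4, 0.5]]
B_sleep = [[1.0], [0.0]]    # intervention acting on sleep only
B_fatigue = [[0.0], [1.0]]  # intervention acting on fatigue only

target = [1.0, 1.0]
print("via sleep:  ", min_energy(gramian(A, B_sleep, 10), target))
print("via fatigue:", min_energy(gramian(A, B_fatigue, 10), target))
```

Under these assumptions, an intervention on sleep can move both symptoms at finite cost, whereas an intervention on fatigue alone cannot change sleep at all, so the cost is infinite. This is the sense in which such an analysis can rank symptoms as intervention targets.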

## Inference: dealing with uncertainty

Uncertainty is baked into our lives, and two decades of work have shown that the brain pays detailed attention to this [101,102,103]. Uncertainty plays a transdiagnostic role in most if not all mental illnesses, be it because it is underestimated (e.g., in delusions), overestimated and aversive (e.g., in anxieties) or appetitive (e.g., in certain types of impulsivity).

Mathematically, the most consistent and correct approach to dealing with uncertainty is through Bayes’ theorem. This suggests that beliefs should correspond to a distribution over potential explanations h as p(h). This distribution over explanations can be updated with evidence e by multiplying it with the likelihood function p(e | h) such that the new belief state incorporating the evidence is

$$p\left( {h|e} \right) \propto p\left( {e|h} \right)p\left( h \right)$$
(1)

Here, the likelihood term p(e|h) effectively measures how compatible each hypothesis h is with the evidence e, and allows the hypotheses to be weighed by their compatibility with all the evidence or experience. The resulting distribution p(h|e) is the posterior distribution. Further evidence can be included by repeating this step:

$$p\left( {h|e,e_{new}} \right) \propto p\left( {e_{new}|h} \right)p\left( {h|e} \right)$$
(2)

The belief state before including the new evidence, p(h|e), was the posterior, but is now the prior. That is, when inference is repeated to include new information, the previous conclusion (the ‘posterior’ belief) becomes the new prior belief. Hence, priors are one of the vehicles through which past experience shapes the interpretation of the present.
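The sequential use of Eqs. (1) and (2) can be sketched for a discrete set of hypotheses. The coin example, the hypothesis names and the helper functions are illustrative assumptions; the only point is that updating one observation at a time, with each posterior serving as the next prior, gives the same belief as using all the evidence at once.

```python
def normalize(weights):
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def update(prior, likelihood):
    """Eq. (1): the posterior is proportional to likelihood times prior."""
    return normalize({h: likelihood[h] * prior[h] for h in prior})


# Two hypothetical explanations h for a sequence of coin flips,
# each defined by its probability of producing heads.
p_heads = {"fair": 0.5, "biased": 0.8}
prior = {"fair": 0.5, "biased": 0.5}


def likelihood(outcome):
    """p(e | h) for a single flip e, which is either 'H' or 'T'."""
    return {h: (p if outcome == "H" else 1.0 - p) for h, p in p_heads.items()}


evidence = "HHT"

# Sequential inference (Eq. 2): yesterday's posterior is today's prior.
belief = dict(prior)
for outcome in evidence:
    belief = update(belief, likelihood(outcome))

# Batch inference: multiply all likelihood terms into the original prior.
joint = dict(prior)
for outcome in evidence:
    joint = {h: joint[h] * likelihood(outcome)[h] for h in joint}
batch = normalize(joint)

print(belief)
print(batch)
```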

### Disentangling prior from likelihood biases

A number of mental illnesses are thought to be characterized by particular biases in aspects of prior beliefs affecting inference about certain experiences or hypotheses. However, estimating priors is often difficult [104]. This is because in controlled experimental situations priors are measured through responses to ‘evidence’ presented in the form of stimuli. This, however, usually means that the experimentally observed responses reflect the posterior p(h|e) rather than the prior, and hence differences between individuals could be due either to differences in the prior p(h) or to differences in the likelihood p(e|h) [105].
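This identifiability problem can be made concrete for a binary hypothesis, where Bayes’ theorem reduces to multiplying prior odds by a likelihood ratio. In the sketch below, the function name and the numerical values are illustrative assumptions: two hypothetical observers arrive at the same posterior for a single stimulus, one through a biased prior and the other through the weight given to the evidence.

```python
def posterior(prior, likelihood_ratio):
    """p(h | e) for a binary hypothesis.

    prior is p(h); likelihood_ratio is p(e | h) / p(e | not h).
    """
    odds = prior / (1.0 - prior) * likelihood_ratio
    return odds / (1.0 + odds)


# Observer A: neutral prior, evidence treated as strongly favoring h.
observer_a = posterior(prior=0.5, likelihood_ratio=4.0)

# Observer B: prior biased towards h, evidence treated as uninformative.
observer_b = posterior(prior=0.8, likelihood_ratio=1.0)

print(observer_a, observer_b)  # the two responses are indistinguishable
```

A single observed response therefore cannot separate the two accounts; only the product of prior odds and likelihood ratio is constrained by the data.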

One way of measuring priors is by examining responses to multiple different stimuli where the stimulus provides ambiguous information and the influence of the prior can hence be estimated as the consistent bias across these stimuli. In depression, prior beliefs are thought to be biased towards hypotheses that make losses or punishments more likely. For instance, when presented with ambiguous information about the probability of obtaining a reward, self-reported optimism covaries with a prior belief in a computational model of choices [106], while depressed patients are less able to update their initial belief when presented with disconfirming positive information [107], possibly due to alterations to anterior cingulate regions [108]. Conversely, anxiety seems to be characterized by a pessimistic bias in the process of accumulating evidence, rather than in the prior [109, 110].

Prior beliefs have been extensively examined in research on psychosis, where alterations in the integration of prior beliefs with evidence have long been postulated to underlie hallucinations and delusions [111, 112]. In this research, the term ’prior’ takes on a meaning that is specific to the particular study. While, to date, there is no clear consensus on what the precise pattern of impairments is in psychosis [113], research is progressing rapidly and providing an increasingly nuanced view.

One approach is to examine whether participants automatically infer the statistical regularities in an experiment, and use this to disambiguate information on a given trial. [114] explicitly manipulated this, and found that neither schizotypal nor autistic traits affected it, but that autistic traits were characterized by more accurate perception. The majority of studies have examined more explicitly ‘trained’ priors. In these, information in a task can be disambiguated through the use of information provided in another form or another part of the task. For instance, [115] binarized images into black and white such that it was very difficult to recognize any shapes in the images. However, when participants had been pre-exposed to the original image in color, they could leverage this prior information and improve their ability to discriminate the images. Strikingly, participants with early-stage psychosis showed a stronger improvement with this prior information. That is, these data suggest an improvement in the ability to integrate these two types of information, and this integration can, but need not, be viewed as the impact of a prior. Similar findings emerge when using visual stimuli as priors on ambiguous auditory stimuli [116], but not when using auditory stimuli to bias ambiguous visual percepts [117].

A related approach is the study of perceptual stability with prolonged bistable stimuli. Here, healthy participants who are delusion-prone [118], and patients with schizophrenia [119], show less stable percepts, suggesting that the pure integration of information over time into stable percepts is impaired (c.f. [120]). This weakened low-level information meant that the percept was more easily shaped by providing cues, which could be viewed as stronger sensitivity to trained prior information [118] in the sense of [115]. Finally, [116] used a computational model based on Eq. (2) to examine the trial-by-trial influence from prior trials onto the current trial, and found that it was stronger in participants with psychosis.

This latter view was recently refined in a new incentivized version of the classic beads task [121], on which participants with psychosis sampled more information rather than less. A detailed computational (RL; see below) model, carefully adjusted for socioeconomic confounds, suggested that patients with delusions formed stronger ‘prior’ beliefs quickly, and then found it difficult to shift away from these [122]. This appears in principle in keeping with a previous study which suggested that the jumping to conclusion bias was driven by noise in the decision-making process rather than by sampling costs [123, 124].

### Sequential inference models

Uncertainty is also generated by changes in the world: as the world changes, evidence gathered in the past loses some of its relevance. Some of these changes may be expected, others not. The simplest sequential models simply maintain a running average of experiences, for instance $$h_{t + 1} = h_t + \alpha (e_t - h_t)$$ (e.g., [125]). Here, the estimate $$h_t$$ is updated by a fraction of the difference between the evidence $$e_t$$ and $$h_t$$, with the size of this fraction being controlled by the learning rate, α, which is a number between 0 and 1.
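The running-average update can be written in a few lines. This is a minimal sketch in which the function name, the starting estimate `h0` and the example evidence are assumptions made for illustration.

```python
def running_average(evidence, alpha, h0=0.0):
    """Apply h <- h + alpha * (e - h) to each piece of evidence in turn.

    alpha is the learning rate (between 0 and 1); h0 is the initial estimate.
    Returns the sequence of estimates after each update.
    """
    h = h0
    trace = []
    for e in evidence:
        h = h + alpha * (e - h)
        trace.append(h)
    return trace


evidence = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
for alpha in (0.1, 0.5, 1.0):
    print(alpha, [round(h, 3) for h in running_average(evidence, alpha)])
```

With a learning rate of 1 the estimate simply copies the most recent evidence, with a learning rate of 0 it never moves from its starting value, and intermediate values trade off responsiveness against stability.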

An approach that is becoming increasingly popular applies Bayesian inference using latent variable models, such as the Kalman filter or Hidden Markov Models [26, 126], to explicitly estimate sources of uncertainty and effectively adapt the learning rate α over time [127,128,129,130,131,132]. In these models (Fig. 4a), the true state of the world (the true hypothesis h) changes over time. Such changes can be captured in a very general manner by a distribution $$p\left( {h_{t + 1}|h_t} \right)$$ where the latent state at the next time t + 1 depends on the latent state at the current time t. Evidence at time t is now of course directly informative about $$h_t$$, but the extent to which it is informative about another time depends on how the world evolves, i.e., on $$p\left( {h_{t + 1}|h_t} \right)$$. As such, the maintenance and discarding of past information implicitly represents an assumption about the stability of the world.

For example, variability caused by changes in the underlying association (sometimes referred to as unexpected uncertainty [133]) means that existing beliefs are less likely to accurately reflect current associations, so learners should be more influenced by recent outcomes and use a higher learning rate (see Fig. 4). In contrast, variability caused by random chance (sometimes called expected uncertainty [133]), as occurs in less deterministic associations, reduces how informative each outcome is, prompting a reduced learning rate. As such, the learning rate can be viewed as an assumption about how rapidly things change or how random outcomes are.
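The dependence of the learning rate on these two sources of variability can be sketched with a scalar Kalman filter, assuming the latent state follows a Gaussian random walk with variance `process_var` and is observed with Gaussian noise of variance `obs_var`. Mapping these two variances onto changes in the world and outcome randomness respectively is a simplifying assumption, and the function name is hypothetical.

```python
def steady_state_gain(process_var, obs_var, steps=200):
    """Iterate the scalar Kalman variance recursion and return the final gain.

    The gain plays the role of the learning rate alpha in the update
    h <- h + gain * (e - h).
    """
    post_var = 1.0  # arbitrary initial uncertainty about the latent state
    gain = 0.0
    for _ in range(steps):
        prior_var = post_var + process_var        # the world may have changed
        gain = prior_var / (prior_var + obs_var)  # weight given to new evidence
        post_var = (1.0 - gain) * prior_var       # uncertainty after the update
    return gain


print("stable world, noisy outcomes:   ", round(steady_state_gain(0.01, 1.0), 3))
print("changing world, noisy outcomes: ", round(steady_state_gain(1.0, 1.0), 3))
print("changing world, noisier outcomes:", round(steady_state_gain(1.0, 10.0), 3))
```

In this sketch the effective learning rate rises when the latent state changes more from step to step and falls when individual outcomes are noisier, which is the qualitative pattern described above.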

This development has allowed research which asks whether psychiatric symptoms are associated with the ability to flexibly update beliefs. Evidence suggests impaired updating in anxiety [134, 135], particularly with respect to punishments [136], uncertainty in social interactions [137], and autism [138], with studies in patients with schizophrenia reporting both impaired [116, 139] and excessive updating [49]. A related line of work indicates that humans assume positive and negative outcomes differ in their stability and can adjust these separately [140]. This process allows individuals to treat positive and negative outcomes as if they were differentially informative, providing a potential mechanism for the affective biases believed to be causally related to depression [141,142,143].

Finally, uncertainty has a kind of value in itself [144]. It is useful to sample uncertain options, as this will improve our understanding of them and allow us to make better future choices [145]. However, sampling the unknown can also be hazardous, particularly in aversive environments where novel options are more likely to be dangerous (perhaps leading to negative prior beliefs about the environment, as discussed in the section on disentangling prior from likelihood biases). An active literature has proposed a range of algorithms by which the value of uncertainty may be estimated and used to bias reinforcement learning [144, 146]. In terms of clinical presentations, in anxiety disorders uncertainty appears to be aversive and avoided [147], while in opioid addiction a tolerance of ambiguity is predictive of relapses [148].

## Reinforcement learning

Symptoms of psychiatric illness very commonly involve alteration of hedonic experience or of behaviors which lead to rewarding or punishing outcomes. This observation has driven interest in how humans learn about rewarding and punishing outcomes and how they use what they have learnt to make decisions. It has also been argued that failure modes in decision-making allow for a principled exploration of dysfunctions on a normative platform [105].

The field of reinforcement learning (RL) is concerned with deriving behavior which maximizes rewards or minimizes losses in the longer term, i.e., not just immediately, but in principle until the end of time. This is hard, because many things can happen in the future. One of the core insights is that these long-term expectations of future rewards V are governed by a deceptively simple rule:

$${\mathcal{{V}}}\left( s \right) = {\mathbb{E}}\left[ {{\mathcal{{R}}}\left( {s,s^\prime } \right) + {\mathcal{{V}}}\left( {s^\prime } \right)} \right]$$
(3)

This means that the total expected future rewards $${\mathcal{{V}}}\left( s \right)$$ in a state s differ from the total expected future rewards in the next state s’ exactly by the amount of reward received on average when going from s to s’. Taking the difference between these two sides provides the temporal reward prediction error signal which can be used to learn the true $${\mathcal{{V}}}$$ ([149]; see [7] for a very brief introduction).

$${\mathcal{{V}}}_{t + 1}\left( s \right) = {\mathcal{{V}}}_t\left( s \right) + \alpha \left( {r_t + {\mathcal{{V}}}_t\left( {s^\prime } \right) - {\mathcal{{V}}}_t\left( s \right)} \right)$$
(4)

The equation describes how to update the reward expectation of an agent at time t, $${{\mathcal{V}}_{t}} \left( s \right)$$, in response to an experienced outcome $$r_t$$ and a transition to a new state s’. We note that it is similar to the simple running average equation in the previous section, only that here what is averaged is the immediate reward plus the value of the following state. This bootstrapping is at the core of why reinforcement learning can estimate long-term rewards and support optimal decision-making. Strikingly, dopamine neurons appear to report this prediction error with surprising precision [150], and this has fueled an immense research effort, of which we here review only the most recent advances.
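As a minimal sketch of Eq. (4), the following applies the update repeatedly to a hypothetical three-state chain. The chain, the reward of 1 on the final step, the value of zero after the episode ends, and the function name `td_update` are assumptions made purely for illustration; they are not taken from any study reviewed here.

```python
def td_update(values, state, next_state, reward, alpha=0.1):
    """Apply one update of Eq. (4) in place and return the prediction error."""
    # Assumption: the value after the end of an episode is zero.
    next_value = values[next_state] if next_state is not None else 0.0
    delta = reward + next_value - values[state]  # temporal prediction error
    values[state] += alpha * delta
    return delta


# Hypothetical chain 0 -> 1 -> 2 -> end, with a reward of 1 on the last step.
values = [0.0, 0.0, 0.0]
episode = [(0, 1, 0.0), (1, 2, 0.0), (2, None, 1.0)]
for _ in range(500):
    for state, next_state, reward in episode:
        td_update(values, state, next_state, reward)
```

Because each state's value is updated towards the reward plus the value of its successor, the reward delivered at the end of the chain propagates backwards, and all three states come to predict the same total future reward.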

In applying Eq. (4) to mental illnesses, a number of questions immediately arise. First, the term $$r_t$$ is supposed to capture both rewards and punishments. Clearly, this is an oversimplification. A further question is how effort should be treated. Second, Eq. (4) effectively describes a type of learning, i.e., of information maintenance. How does this interact with other systems that maintain information, such as working and episodic memory? Third, Eq. (4) does not make reference to knowledge of the world. Clearly, beliefs about how the world works potently influence behavior and learning, and this will be discussed in the section on model-based decision-making. Fourth, Eq. (4) makes reference to states s. What are they? A final question is the one discussed in the previous section on what the learning rate α should be.

### Reward sensitivity

One of the core symptoms of depression is anhedonia, a reduction in the subjective experience of rewards and motivation. Several studies have shown associations of anhedonia with reduced learning from rewarding outcomes [151,152,153]. However, dysfunctional reward learning may arise from a number of sources: aberrant updating of values, reduced ability to maintain those values (see section on working memory and RL below), distorted estimates of the reward value of outcomes, or an inability to utilize learned values when selecting actions. Furthermore, learning rate and reward sensitivity trade off against each other: in many tasks, it is possible to compensate for any reduction in reward sensitivity by increasing the learning rate. Although there is variability in the literature [152,153,154,155,156], the most consistent effect is that increased anhedonia is associated with a reduced effective value for rewarding outcomes [157,158,159,160]. In other words, individuals with higher anhedonia treat rewarding outcomes as if they were less rewarding than those with lower anhedonia (note there is some evidence that bipolarity is associated with the opposite effect; [161]).
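The trade-off between learning rate and reward sensitivity can be made concrete with a small sketch. It assumes one common parameterization, in which a reward sensitivity parameter (here called `rho`, a name chosen for illustration) scales the outcome before it enters a simple delta-rule update; other parameterizations exist.

```python
def learned_value(alpha, rho, rewards):
    """Delta-rule value after a sequence of rewards, each scaled by rho."""
    value = 0.0
    for reward in rewards:
        value += alpha * (rho * reward - value)
    return value


# Halving the sensitivity and doubling the learning rate is indistinguishable
# after a single rewarded trial ...
short_low_rho = learned_value(alpha=0.2, rho=0.5, rewards=[1.0])
short_high_rho = learned_value(alpha=0.1, rho=1.0, rewards=[1.0])

# ... and only separates with longer learning, as values approach rho * reward.
long_low_rho = learned_value(alpha=0.2, rho=0.5, rewards=[1.0] * 200)
long_high_rho = learned_value(alpha=0.1, rho=1.0, rewards=[1.0] * 200)
```

In this sketch the two parameters are confounded early in learning and are only separable through the asymptote, which is one reason why short tasks may struggle to tell them apart.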

The origin of this reduction is not clear. Primary reward sensitivity, e.g., to sucrose or smells, does not appear reduced, and the reduction appears clearest in more complex ‘secondary’ rewards such as pleasant visual stimuli [162], suggesting a locus of dysfunction in the construction of derived values [163], but still in the absence of impairments in the learning process itself ([164], though see [155]), suggesting a more model-based etiology ([163]; see below). Alternatively, it has been suggested that an individual’s mood may interact with their reward sensitivity, biasing estimates of the value of outcomes [165]. Craving, the desire for drugs of abuse in addiction, indeed does have a multiplicative effect on reward values [166], suggesting that a similar process may also be at work in other illnesses. Indeed, there is some evidence of such sequential interactive processes [167], and it has been suggested that it may improve learning in certain situations [165, 168]. However, a miscalibration of the interplay between mood and estimated value may exacerbate the impact of low mood and lead to fluctuations of mood reminiscent of bipolar disorder [165, 169].

### Effort sensitivity

Motivated behavior involves an adaptive integration of costs vs benefits of engaging in physical or cognitive effort. Several decades of research in rodents and now humans have implicated the dopamine system in the energization of motivated behavior for the sake of maximizing rewards [170], an effect that can be captured by computational models of striatal dopamine in balancing this tradeoff [171, 172].

This framework has been further extended to account for decisions to engage in mental as opposed to physical effort – that is, the choice to perform a cognitively difficult task depending on the incentive [173]. Indeed, recent studies showed that baseline striatal dopamine synthesis capacity, as measured by PET, is predictive of individuals’ willingness to engage in cognitive effort [174]. Critically, this effect was not simply a shift in overall preference. Rather, a behavioral economic analysis showed that dopamine effects on preference were due to an amplification of the (monetary) benefits, together with a diminution of the subjective costs, of engaging in mental effort, and thus mirrored the impact of striatal dopamine on cost/benefit decisions more broadly [171]. Moreover, stimulant medications that elevate striatal dopamine increased cognitive motivation specifically by altering this cost/benefit ratio, most strongly in those subjects with low baseline levels [174]. This study suggests that the use of stimulant medications in ADHD and in the general population might be better understood as enhancing not the ability, but rather the subjective motivation to engage in cognitive processes, and raises the possibility that such assessments could be useful for predicting treatment outcomes.

Effort cost/benefit calculation is thought to be biased in patients with MDD and schizophrenia patients with negative symptoms, who exert reduced physical effort with increasing incentives [175,176,177]. But while dopamine might be involved in emphasizing the benefits over costs, other studies suggest that serotonin is specifically related to cost calculations. Indeed, in a randomized trial in healthy participants, those treated over 8 weeks with escitalopram exhibited increased willingness to exert physical effort for monetary incentives, where the impact of 5HT manipulation could be specifically attributed to a reduction in effort cost [178]. Accordingly, it is notable that remitted MDD patients have also been documented to have a larger sensitivity to anticipated effort cost which could be linked to alterations in computational model parameters, and was predictive of relapse after antidepressant treatment discontinuation [175]. Together, these studies suggest that careful assessment of effort-based decision making – by employing paradigms and models designed to disentangle the component valuation processes – may be a promising candidate for predicting treatment outcomes resulting from SSRIs or DA manipulations.

If reductions in cognitive effort associated with apathy or lack of motivation are related to cost/benefit computations, can they be ameliorated by simply increasing incentives? Recent findings suggest that this might be feasible. In a model-based learning task, where cognitive strategies are more effortful but can pay off, reductions in cognitive effort linked to a range of subclinical traits were ameliorated by larger monetary incentives, with these incentive effects especially large in participants with depression, anxiety, or sensation seeking [179]. On the other hand, a transdiagnostic factor of compulsivity was related (in a separate study) to a reduction in mental effort avoidance [180].

Finally, we note that exerting effort is only valuable if the actions (are believed to) directly influence outcomes. Depression, for instance, is characterized by helplessness and hopelessness. Computationally, this can be viewed as a belief that actions have a small probability of leading to the desired outcome [181], which in turn strongly affects the value of those actions [182] and hence the extent to which rewards would be able to motivate efforts associated with the actions.
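The cost/benefit considerations in this section can be sketched in a few lines. The linear form below, the parameter names (`benefit_weight`, `cost_weight`, `p_control`) and the numbers are illustrative assumptions and do not correspond to any specific fitted model in the studies cited; published models use a variety of cost functions.

```python
def action_value(reward, effort, p_control, benefit_weight=1.0, cost_weight=1.0):
    """Net value of an effortful action under an assumed linear cost/benefit model.

    p_control is the believed probability that the action leads to the reward.
    """
    return p_control * benefit_weight * reward - cost_weight * effort


# The same reward and effort are worth pursuing when control is believed to be
# high, but not when the belief in control is low (helplessness).
value_high_control = action_value(reward=10.0, effort=3.0, p_control=0.8)
value_low_control = action_value(reward=10.0, effort=3.0, p_control=0.2)

# A larger cost weight can have the same behavioral effect as a smaller
# benefit weight, which is why tasks need to vary both reward and effort.
value_high_cost = action_value(10.0, 3.0, p_control=0.8, cost_weight=3.0)
```

In such a scheme, reduced effort expenditure could arise from a lower benefit weight, a higher cost weight, or a lower belief in control, and disentangling these requires designs that manipulate each factor separately.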

### Episodic and working memory interactions with RL

The simple RL update rules in Eq. (4) are a type of memory because the output of the update at one time point is maintained and used as the input at the next time point. These processes are increasingly being studied in the context of cognitive and neural processes associated with episodic and working memory.

#### Episodic/RL interactions

Episodic memories are affectively biased in several disorders, particularly so in depression, where affectively negative memories are easier to recall. A hallmark of episodic memory is its automatic and incidental encoding of individual sensory events such that they are bound into an episode [183]. In fact, contemporaneous reward prediction error signals can further enhance memories for related events [184, 185], perhaps via neuromodulation of hippocampus [184].

In healthy controls positive RPEs have a larger benefit for memory encoding than negative RPEs. However, in depression, this bias is reversed [141, 143], providing a computational formalism to explain negative memory biases in depression. We note that the standard Eq. (4) does not include different learning rates for rewards and losses. A recent extension of RL has shown that biases in the learning rate from rewards and losses lead to biased estimates; and that a distribution of learning rates can be used to maintain distributions over values, which can allow for more efficient learning [186].
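To illustrate how separate learning rates for positive and negative prediction errors bias value estimates, the sketch below extends a simple delta rule with two learning rates. The outcome values of +1 and -1, the parameter values and the function names are assumptions for illustration only; the sketch does not implement the distributional extension of [186].

```python
def asymmetric_update(value, outcome, alpha_pos, alpha_neg):
    """Delta rule with separate learning rates for positive and negative errors."""
    delta = outcome - value
    alpha = alpha_pos if delta > 0 else alpha_neg
    return value + alpha * delta


def fixed_point(p_win, alpha_pos, alpha_neg, win=1.0, loss=-1.0):
    """Value at which the expected update is zero (assumes loss < value < win)."""
    up = p_win * alpha_pos
    down = (1.0 - p_win) * alpha_neg
    return (up * win + down * loss) / (up + down)


# Wins and losses are equally likely, so the true expected outcome is 0.
unbiased = fixed_point(p_win=0.5, alpha_pos=0.2, alpha_neg=0.2)
optimistic = fixed_point(p_win=0.5, alpha_pos=0.3, alpha_neg=0.1)
pessimistic = fixed_point(p_win=0.5, alpha_pos=0.1, alpha_neg=0.3)
```

With equal learning rates the estimate settles on the true mean, whereas a larger learning rate for negative prediction errors pulls the estimate below it, one possible route to a negatively biased evaluation of the same experiences.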

Episodic disturbances also exist in PTSD, where aversive memories are heavily biased towards traumatic events and are thought not to be integrated with other memories [187]. Such views raise complex questions about what it means to ‘integrate' an event memory. One view from RL is that integrating a memory means generalizing the value information from the particular state where it was experienced to others. Thus, in complement to the above description of RL effects on memory, episodic memory contributions can reciprocally influence and augment reinforcement learning and decision making [188]. Indeed, when episodes are sampled from memory, reward-based decisions are biased such that choices are influenced by outcomes linked to that episodic context [189]. Other normative theories suggest that episodic sampling can be useful during offline replay (e.g., during sleep) for prioritizing which events and affective values should be integrated into memory [190]. Such links are promising algorithmic avenues for exploring potential aberrations of memory integration in PTSD.

Finally, RL signals can also regulate decisions about whether a memory should be counted as familiar or not during declarative memory retrieval. Signal detection theory provides a framework to characterize both memory strength (the difference in familiarity between encoded memories and novel ones), and the criterion on the level of memory strength needed to reach a decision as to whether an event is familiar or not. It is now appreciated that this decision criterion is adaptable according to the reinforcement value of previous memory decisions. Indeed, by manipulating reward prediction errors during such a declarative memory task, it was shown that striatal RPEs serve to adapt participants’ criterion for judging events as familiar or not, and that this criterion even transferred to other memory tasks such as free recall [191, 192]. These studies and models could provide a basis for studying pathological adaptation of this criterion leading to biases in memory reporting, for example in prodromal schizophrenia states.
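The two signal detection quantities mentioned above can be computed from hit and false-alarm rates. The sketch assumes the standard equal-variance Gaussian model; the function name and the example rates are hypothetical and not taken from the cited studies.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(false_alarm_rate)             # memory strength
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))  # response bias
    return d_prime, criterion


# A neutral observer, and one who calls more items 'familiar' overall.
neutral_d, neutral_c = sdt_measures(hit_rate=0.8, false_alarm_rate=0.2)
liberal_d, liberal_c = sdt_measures(hit_rate=0.9, false_alarm_rate=0.4)
```

A negative criterion indicates a liberal tendency to report items as familiar, and it is this quantity, rather than memory strength, that would be expected to shift if reinforcement adapts the decision threshold.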

#### Working memory/RL interactions

Working memory can also greatly influence RL processes to augment learning. Much of the RL literature assumes that progressive learning in instrumental tasks relates to striatal dopaminergic function. However, rapid learning in these tasks is strongly supported by prefrontal working memory processes: participants can simply hold in mind recent stimulus-action-outcome associations and use these to improve performance the next time the same stimulus appears. Indeed, one of the most celebrated functions of the prefrontal cortex is the active maintenance of information in working memory (WM) in the service of adaptive behavior [193]. Information that is relevant to a current goal is rapidly updated and maintained over time, and can serve to guide upcoming decisions. Interestingly, the degree to which participants engage such top-down processes can affect the net learning rate of an RL system, providing a systems-level mechanism by which distinct brain systems could contribute to the adjustment of learning rate to surprising or uncertain events, as described above.

Of course, these working memory processes are not perfect: they are subject to capacity limitations and sensitive to forgetting, whereas striatal reinforcement learning processes are more incremental but less subject to capacity constraints. This tradeoff provides a normative motivation for the existence of these complementary systems [194] analogous to that described for model-based vs model-free reinforcement learning (see below) [195]. Moreover, it motivates the critical need to consider PFC and WM contributions when evaluating the nature of RL deficits (or the impact of pharmacological treatments) in patient populations. Indeed, many studies report RL deficits in schizophrenia [196, 197], which are sometimes interpreted solely in the context of striatal DA-mediated alterations. However, using experimental paradigms and computational models designed to disentangle these processes, reward learning deficits in medicated schizophrenia can be attributed to pronounced reductions in prefrontal WM contributions, with surprisingly intact learning from (and striatal signaling of) reward prediction errors (Fig. 5; [198,199,200]).

Beyond the independent contributions of these systems, recent studies have further shown that WM and RL processes interact during learning. Top-down WM processes accelerate the acquisition of instrumental contingencies, but because WM can also maintain the expectation that a reward is going to occur, this expectation also reduces the subsequent RPE, as evidenced by both fMRI and EEG [201, 202]. This top-down influence of WM onto RL RPEs is consistent with other observations that cognitive and model-based expectations can modulate model-free RPEs [203, 204] and leads to a counter-intuitive behavioral prediction. Specifically, although instrumental contingencies are learned more rapidly when WM is engaged (i.e., in low load conditions), the reduced RPEs translate to reduced learning in the RL system, leading to more forgetting in the long-term, when WM can no longer be accessed. Conversely, contingencies learned under high WM load are associated with larger RPEs that facilitate plasticity, and are accordingly retained more robustly, as has been shown empirically [198, 202]. This phenomenon may partially explain the surprisingly spared retention of RL contingencies in patients with schizophrenia [198], in spite of profoundly impaired WM during learning, and is concordant with other computational and empirical findings that SZ patients have reduced expected value computations coupled with over-reliance on stimulus-response learning [205]. More speculatively, these findings could imply that other factors that degrade PFC WM processes, such as stress [206], could actually be paradoxically beneficial in terms of long term retention (provided sufficient accumulation of RPEs during initial learning).
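The counter-intuitive prediction described above can be reproduced in a toy sketch. It assumes that the expectation entering the RPE is a weighted mixture of a one-shot WM store and the incrementally learned RL value, with the WM weight standing in for load; this is a deliberate simplification for illustration, not the model fitted in the cited studies.

```python
def train_rl_with_wm(wm_weight, n_trials=10, alpha=0.1, reward=1.0):
    """Return the RL value learned when WM contributes to the reward expectation.

    wm_weight close to 1 mimics low load (WM dominates); close to 0, high load.
    """
    q = 0.0   # incrementally learned RL value
    wm = 0.0  # WM content: the most recent outcome, stored in one shot
    for _ in range(n_trials):
        expectation = wm_weight * wm + (1.0 - wm_weight) * q
        rpe = reward - expectation  # WM expectation shrinks the RPE
        q += alpha * rpe
        wm = reward
    return q


q_low_load = train_rl_with_wm(wm_weight=0.9)
q_high_load = train_rl_with_wm(wm_weight=0.1)
```

In this sketch the RL value at the end of training is larger under high load, so that once WM is no longer available, behavior relying on the RL system alone would be better preserved for items learned under high load.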

Finally, as in episodic memory, RL signals can also reciprocally influence decisions about WM. That is, how does the brain decide which of several potential pieces of information to store in mind, what to ignore, and when to discard previously maintained information? In models and data, RPEs are used for learning such gating policies in the service of optimizing memory representations that are most useful for task performance [207, 208]. Moreover, in addition to guiding which representations to store, RL strategies are also used to adaptively modulate how much information can be compressed or chunked in working memory [209], an instantiation of the more general notion that many cognitive heuristics or biases can be understood as rational in the face of limited resources [210]. Moreover, this RL-WM interaction provides a coherent set of mechanisms relevant for understanding suboptimal resource allocation, distractibility, and attentional focus, all of which are features of disorders in frontostriatal circuitry including schizophrenia, Parkinson’s disease, ADHD, and OCD [211].

### Model-based inference: from urges to meaning

Beauty lies in the eye of the beholder, and the meaning or value of events can be profoundly altered in mental illnesses - witness the interpretation of mundane events as profoundly meaningful in delusional mood, or the cognitive distortions characteristic of depression. Formally, how an event influences us in the future depends on what aspects of the event are stored in memory, and how.

The models described so far in Eqs. (4) and (1) are retrospective: Beliefs are purely a function of the past, and the future is expected to behave just as the past did. In both those equations, experiences are evaluated with respect to current reward expectations, used to adapt these, and then discarded. Because α has to be small (to avoid switching after each individual experience), this means that a change in how rewarding an event is will only lead to a change in expected reward after a (large) number of experiences. An alternative approach is to use experiences to build an explicit model of the world, and then use this model to prospectively derive expectations about likely rewards by simulating what might happen in the future [212]. The strength of this model-based approach is that it affords more flexibility to react to a change in how rewarding future events are, but that comes at the computational cost of having to simulate potentially exponentially many future possibilities [195]. While model-based RL is thought to relate to goal-directed decision-making, model-free RL as in Eq. (4) is thought to relate to habitual decisions and incentive salience [195, 204, 213,214,215].
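The difference in flexibility can be sketched with a hypothetical two-action task in which one outcome is devalued. The task, the state and action names, and the use of a one-step lookahead in place of a full tree search are assumptions made for brevity.

```python
def model_based_values(transitions, rewards):
    """One-step lookahead through an explicit model: action -> state -> reward."""
    return {action: rewards[state] for action, state in transitions.items()}


def model_free_update(q, action, reward, alpha=0.1):
    """Cached value update, as in Eq. (4) with no successor state."""
    q[action] += alpha * (reward - q[action])


transitions = {"left": "X", "right": "Y"}  # world model: action -> outcome state
rewards = {"X": 1.0, "Y": 0.5}
q = {"left": 1.0, "right": 0.5}            # cached values after long training

rewards["X"] = 0.0                         # outcome X is devalued

# The model-based evaluation reflects the change immediately.
mb_values = model_based_values(transitions, rewards)
mb_choice = max(mb_values, key=mb_values.get)

# The model-free values change only through repeated experience.
experiences_needed = 0
while q["left"] >= q["right"]:
    model_free_update(q, "left", rewards[transitions["left"]])
    experiences_needed += 1
```

With a learning rate of 0.1, the cached preference reverses only after several further experiences of the devalued outcome, whereas the lookahead switches at once. The price, not visible in a one-step example, is that the number of simulated futures grows rapidly with planning depth.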

Several aspects of this model-based/model-free RL distinction are relevant to mental illness. First, and most prominently, a shift away from goal-directed and towards habitual decision-making ‘urges' [216] has been demonstrated for OCD [217,218,219,220] and associated with prefrontal and myelination impairments [217, 220, 221]. However, this shift is not diagnostically specific, but also extends to a number of other mental illnesses including binge eating, methamphetamine dependence [220] and schizophrenia [222] but not alcohol use disorder [220, 223]. Strikingly, the association is strongest with a ‘compulsive' factor extracted across several different questionnaire measures [179, 180, 224,225,226]. Finally, it appears to have trait features as it does not change with improvement in OCD symptoms [227], even though it is highly sensitive to stress and cognitive load [228,229,230].

Second, model-based decision-making formalises how beliefs can fundamentally alter how experience influences behavior, i.e., what they ‘mean'. This is demonstrated in the now classic experiment by [204]. Depending on the model, a reward can lead to repeating the action that led to it, or it can lead to avoiding it. The avoidance in this case is driven by the interpretation that a different course of action than the one taken can enhance the chances of another reward. Furthermore, full model-based evaluation is so computationally demanding that it is only feasible in scenarios that are so simple as to be irrelevant. Hence, goal-directed decision-making is itself subject to a number of approximations, or internal ‘decisions’ about which aspects of the future to sample [209, 231,232,233]. Again, these internal decisions must be led by approximate heuristics which can easily result in profound interpretational biases [234]. For instance, discounting temporally distant events relative to proximal ones is a prominent transdiagnostic feature of many illnesses [235, 236], and temporal discounting can be altered by instructing participants to imagine (i.e., to internally simulate) the temporally distant events [237, 238]. As such, a component of temporal discounting may be driven by internal decisions not to simulate or ‘think of’ certain future events. Similarly, aspects of anxiety [239] and paranoia [124] are thought to relate to model-based assumptions about the future ability to make good choices. Indeed, there are goal-directed components in threat aversion [240], suggesting that imaginary exposure in the psychotherapy of anxiety disorders may act by addressing internal sampling biases [234]. Finally, we note here the relationship between internal simulation decisions and metacognition [226, 241].

Third, however, the experimental ability to distinguish between model-free and model-based decisions depends on the definition of ‘states' and ‘actions' [242,243,244,245,246], or more generally on the nature of the representation. We will turn to this next.

### Structure and abstraction learning

Arguably, for mental illness, we are most interested in the nature of more abstract cognitive representations that are used to constrain the state space used for learning and which facilitate transfer and generalization to novel environments. For example, patients with autism exhibit changes in their ability to extract such abstract structure [247].

Model-based processing, in which a person represents the full transition structure of the consequences of their actions on future states, provides one means to be flexible, but is very computationally expensive. While a person could attempt to re-use models from previous contexts, a more efficient strategy is to learn task representations that facilitate re-use of critical components of previous task settings while collapsing over irrelevant aspects, and flexibly recombining bits of learned knowledge to novel situations [248, 249]. Doing so requires aspects of the ‘model' to be ‘factorized' (that is, kept separate from other aspects) while also learning whether such factorization is useful for the given environment. Humans show such flexibility in “generalizing to generalize”, which is well captured by such computational considerations [250], but we have yet to understand the mechanisms by which this process occurs, or whether it can be used to understand poor abstraction or aberrant generalization in patients with mental illness.

Generalizing inherently requires learning representations that are compressed: those that retain critical elements of a task or environment structure while discarding details that may not transfer to other situations. While the hippocampus is thought to provide highly pattern-separated conjunctive representations storing specifics, the cortex is thought to provide more elemental and abstract representations [183, 251, 252]. In reinforcement learning, the “successor representation” provides one algorithmic strategy lying in between model-based and model-free learning, which retains aspects of a world model used for planning without all of the specifics [253,254,255,256,257,258]. Here, a model is represented by considering the impact of the person’s actions on the predicted visitation frequency, and reward-predictive values, of future states, without requiring explicit enumeration of each future action and state transition. Mathematically, this is equivalent to learning the predicted sequence of reward prediction errors given the person’s actions, while discarding the specific state transitions [254]. Indeed, if one learns abstract structures using only reward-predictive representations, this reduces the dimensionality of the state space in such a way that permits transfer to novel environments that have similar abstract features, even if the specifics of both transitions and rewards change [255]. Critically, such abstract transfer is not afforded by reduced representations that merely maximize reward in the original environment. These computational considerations motivate the study of which brain systems and mechanisms can support learning and re-use of such abstractions and whether they can be fruitfully interrogated to understand the nature of developmental learning disabilities such as ASD.
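A minimal sketch of the successor representation may help make this concrete. It uses the standard discounted formulation, in which expected future state occupancies are stored in a matrix and combined with state rewards to give values. The discount factor `gamma` (which does not appear in Eq. (3)), the three-state chain and the function names are assumptions for illustration.

```python
def successor_matrix(transition, gamma, n_iter=200):
    """Discounted expected future state occupancies under a fixed policy.

    Iterates M <- I + gamma * T * M until (approximate) convergence.
    """
    n = len(transition)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    m = [row[:] for row in identity]
    for _ in range(n_iter):
        m = [[identity[i][j]
              + gamma * sum(transition[i][k] * m[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    return m


def values_from_sr(m, rewards):
    """State values as occupancies weighted by the reward in each state."""
    return [sum(occ * r for occ, r in zip(row, rewards)) for row in m]


# Hypothetical chain 0 -> 1 -> 2, where state 2 leads back to itself.
transition = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
sr = successor_matrix(transition, gamma=0.5)

values_old = values_from_sr(sr, rewards=[0.0, 0.0, 1.0])
values_new = values_from_sr(sr, rewards=[1.0, 0.0, 0.0])  # rewards change
```

When rewards change, new values follow from a single weighted sum without relearning the occupancies, which gives some of the flexibility of model-based evaluation. A change in the transitions, by contrast, would require the matrix itself to be relearned.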

### Pavlovian influences

An important aspect of structure learning relates to a well-established distinction in the animal learning literature, namely the distinction between instrumental conditioning, where reinforcements depend on the animals’ behavior, and Pavlovian conditioning, where reinforcements are delivered irrespective of what the animal does. In the latter case, animals (and humans) nevertheless still show behavior - even though it is not necessary. In fact, these behavioral tendencies are often immutable: animals cannot learn not to salivate when they hear the buzzer. Similar strong Pavlovian tendencies are observable in humans and can profoundly impact decision-making and learning [259, 260]. One way to formally describe these is as a state value which mandates a particular action (e.g., appetitive → approach) [259, 261, 262] and thereby can interfere with instrumental behavior. Alternative possibilities have been considered formally and empirically [263,264,265].
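One way to sketch a state value that mandates an action is to add a Pavlovian term to the instrumental weight of the approach ('go') response. The two-action setting, the logistic choice rule and the parameter names below are illustrative assumptions in the spirit of such models, not a reproduction of any specific published model.

```python
import math


def p_go(q_go, q_nogo, state_value, pavlovian_weight, temperature=1.0):
    """Probability of an active 'go' response.

    The Pavlovian term promotes 'go' in appetitive states (state_value > 0)
    and suppresses it in aversive states (state_value < 0).
    """
    weight_go = q_go + pavlovian_weight * state_value
    weight_nogo = q_nogo
    return 1.0 / (1.0 + math.exp(-(weight_go - weight_nogo) / temperature))


# With identical instrumental values, the state value alone biases the response.
p_appetitive = p_go(q_go=0.0, q_nogo=0.0, state_value=1.0, pavlovian_weight=0.5)
p_aversive = p_go(q_go=0.0, q_nogo=0.0, state_value=-1.0, pavlovian_weight=0.5)
p_no_bias = p_go(q_go=0.0, q_nogo=0.0, state_value=1.0, pavlovian_weight=0.0)
```

Interference with instrumental behavior arises whenever the Pavlovian term opposes the instrumentally better action, for instance when withholding a response is required to obtain a reward.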

Pavlovian influences are increased in patients with alcohol use disorder and are predictive of relapse [266] unlike model-based decision-making [223]. Pavlovian escape influences are increased in suicidal patients [267, 268]. In anxiety, there is a subtly different bias towards avoidance behaviors that is independent of Pavlovian values [269].

## Future research directions

There are several challenges for computational psychiatry to generate a satisfactory explanatory disease model. First, there is evidence from genetic [270, 271] and circuit level assessments [272, 273] of psychiatric constructs and disorders that there is significant biological and psychological heterogeneity within and across disorders of a similar class. This means that diagnostic labels likely comprise individuals with differing underlying biological architectures. In fact, it has recently been estimated that as much as 80% of polygenic constructs such as anxiety or neuroticism may be due to rare genetic variants that are distributed across the entire genome [274]. The converse, however, is also true: not only does the brain have many ways of producing the same symptoms; very similar brain dysfunctions can also produce a number of different clinical symptoms. Consider for instance the phenotypic heterogeneity of Huntington's disease. Although as an autosomal dominant disorder it has a simple genetic basis, the clinical variability this results in via the modulation of multiple biochemical pathways is enormous [275]. These clearly are tall orders, and no easy solutions should be expected anytime soon. However, the arguments laid out here suggest that it will be difficult to cut this double Gordian knot without building a computational framework that is able to relate implementational to algorithmic and functional levels.

Second, explanatory variables capture only a small fraction of the observed variance - they do not yet explain enough [276,277,278]. More specifically, many symptoms of mental illnesses are self-reports expressed in words, and the ability to detect subtle hints in the language of patients is an important facet of clinicians’ skill, but also one that is hard to quantify and hence may contribute to idiosyncrasies and poor agreement between raters. While some of the work reviewed employs causal manipulations that directly alter self-report (e.g., [164]; see also [279]), most of the work reviewed here attempts to gain an understanding of these symptoms through cross-sectional correlations, and these tend to be low even when they replicate robustly [225, 277]. Even if these correlations were high, the guarantees necessary for cross-sectional patterns to be meaningful for individual subjects longitudinally are unlikely to be given [84, 85]. Furthermore, while different, putatively more objective, task-based measures show comparatively better coherence amongst each other, and so do different self-report measures, the coherence between task-derived and self-report measures is relatively poor (Fig. 6a).

Third, part of this is due to an aspect of mechanistic research that was underappreciated until recently, namely the tendency to squash between-subject variability [280]. Although a number of putatively mechanistically informative task-derived measurements are highly robust at the group level, they often show poor test-retest reliability, meaning that individual differences are not robust, and less robust than self-report measures (Fig. 6b; [280,281,282,283]). One reason in particular is that group-level effects are maximized when individual differences are minimized. As most mechanistic research employs group-level approaches to discover shared mechanisms, individual variation has often intentionally been suppressed (Fig. 6c). The fitting of generative computational models to data may have an important role to play. Such models can capture multiple aspects of data, such as choices and reaction time, and ensure consistency across all aspects of the data [7]. As such, they can improve the measurement properties by reducing noise and improving test-retest validity (e.g., [284, 285]).
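The tension between group-level effects and individual differences follows from a simple variance decomposition, sketched below. It assumes that each subject's measured effect is the sum of a stable true effect and independent measurement noise, and it works with expected quantities rather than sample statistics; the function names and numbers are illustrative.

```python
import math


def group_effect_t(mean_effect, between_sd, noise_sd, n_subjects):
    """Expected one-sample t-like statistic for the group-level effect."""
    total_sd = math.sqrt(between_sd ** 2 + noise_sd ** 2)
    return mean_effect / (total_sd / math.sqrt(n_subjects))


def retest_reliability(between_sd, noise_sd):
    """Share of observed variance due to stable between-subject differences."""
    return between_sd ** 2 / (between_sd ** 2 + noise_sd ** 2)


# Same mean effect and noise; only the spread of true individual effects differs.
t_homogeneous = group_effect_t(1.0, between_sd=0.2, noise_sd=1.0, n_subjects=25)
t_heterogeneous = group_effect_t(1.0, between_sd=1.0, noise_sd=1.0, n_subjects=25)
r_homogeneous = retest_reliability(between_sd=0.2, noise_sd=1.0)
r_heterogeneous = retest_reliability(between_sd=1.0, noise_sd=1.0)
```

A task tuned so that all subjects show a similar effect yields a larger group statistic but leaves little stable between-subject variance to be measured reliably.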

## Conclusion

Computational psychiatry is a rapidly growing field that combines both data-driven and theory-driven approaches. This review of theory-driven work has shown that investigations into dynamical, inference and learning aspects of mental illnesses are progressing apace and becoming mature. They are allowing increasingly tight relationships between detailed cellular and cognitive processes to be forged and some of these have shown predictive power in longitudinal studies.

As outlined previously [286], a core goal for computational psychiatry is to accelerate the translation of (computational) neuroscience into improved patient outcomes. The paths through which computational methods can support this goal are manifold. First, the focus in this review was on mechanisms. We have illustrated how computational approaches allow mechanistic hypotheses and processes to be tested. In addition, because the brain has a computational function at its heart, they are unavoidable when attempting to grapple with the malfunctions observed in mental illnesses. Second, computational approaches may provide tools for the measurement of these processes, and thereby facilitate precision-psychiatric approaches. For instance, tasks can be used to measure different aspects of learning and inference, and these may be helpful for treatment stratification. Third, the identification of computational processes can motivate novel approaches and interventions. For instance, the work reviewed on the importance of working memory for reinforcement learning in schizophrenia, or on the separate malleability of learning rates for appetitive and aversive events opens up novel potential therapeutic interventions.

Nevertheless, to take this forward, we believe that the field requires a dedicated focus on clinical applications. The field may benefit from a move away from cross-sectional research and towards longitudinal causal or quasi-causal study designs to understand how individuals change over time and respond to interventions.

The cost of acquiring data, and the importance of devising procedures that are robust across labs and indeed across international clinical settings, render relatively large-scale collaborations and consortia critically important [287]. Such collaborations could also be instrumental in setting standards and agreeing on the kinds of details which will make modeling a robust technique for clinical applications.

## Funding and disclosure

MB is supported by the Oxford Health NIHR Biomedical Research Centre. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. MB has received grants from the MRC, Wellcome Trust and NIHR. MB has acted as a consultant for J&J and CHDR and has received travel funds from Lundbeck. He owns shares in P1vital Products Ltd. MJF is supported by NIMH, and is a consultant for F Hoffman LaRoche pharmaceuticals. MP acknowledges support by The William K. Warren Foundation, the National Institute on Drug Abuse (U01 DA041089), and the National Institute of General Medical Sciences Center Grant Award Number (1P20GM121312). MP is an advisor to Spring Care, Inc., a behavioral health startup, and has received royalties for an article about methamphetamine in UpToDate. QJMH acknowledges support by the UCL NIHR Biomedical Research Centre and the Max Planck Society. QJMH has received grants from the Swiss National Science Foundation, the EMDO foundation and the German Research Foundation. QJMH declares no conflicts of interest.