
Figure 1.
Schematic of the learning and choice models discussed in the article. a) Illustrative example of reinforcement learning models. b) Illustrative example of reaction time models (e.g., drift diffusion models). c) Illustrative example of models of economic choice under uncertainty.
Table 1.
Reinforcement learning model parameters that could be altered in anhedonia
| Construct | Description | Computational instantiation | Evidence implicating | Evidence exonerating | Missing evidence |
|---|---|---|---|---|---|
| Value-guided behavior | Capacity of value representations to guide choice | Value (Equation 1) | Most studies report broadly intact acquisition | ||
| Feedback insensitivity | “Blunted” response to feedback, both positive and negative | Reduced learning rate (Equation 2) | Chase, Frank et al. (2010), Steele et al. (2007) | Rothkirch, Tonn, Kohler, & Sterzer (2017) | |
| Enhanced punishment sensitivity | Relatively enhanced response to negative feedback | Enhanced learning rate if outcome is aversive | Beevers et al. (2013), Herzallah et al. (2013), Maddox et al. (2012), Murphy, Michael, Robbins, & Sahakian (2003), Taylor Tavares et al. (2008) | Cavanagh, Bismark, Frank, & Allen (2011), Chase, Frank et al. (2010), Whitmer, Frank, & Gotlib (2012) | |
| Reduced reward sensitivity | Relatively reduced response to positive feedback | Reduced learning rate if outcome is appetitive | Beevers et al. (2013), DelDonno et al. (2015), Herzallah et al. (2013), Kunisato et al. (2012), Maddox et al. (2012), O. J. Robinson et al. (2012), Treadway, Bossaller, Shelton, & Zald (2012) | Cavanagh et al. (2011), Chase, Frank et al. (2010), Chase, Michael, Bullmore, Sahakian, & Robbins (2010), Whitmer et al. (2012) | |
| Pavlovian bias | Influence of reward- or punishment-predictive stimuli on behavior | See Equation 3 | Bylsma, Morris, & Rottenberg (2008), Huys, Golzer et al. (2016), Radke, Guths, Andre, Muller, & de Bruijn (2014); see Mkrtchian, Aylward, Dayan, Roiser, & Robinson (2017) for anxiety | ||
| Temperature | Stochastic choice | Temperature (Equation 4) | Huys et al. (2012), Huys et al. (2013), Kunisato et al. (2012); for indirect evidence, see Blanco, Otto, Maddox, Beevers, & Love (2013), Clery-Melin et al. (2011); for trend level, see Chase et al. (2017) | Chung et al. (2017), Rothkirch et al. (2017) | |
| Reduced outcome magnitude sensitivity | Linear or nonlinear scaling of utility across increasing expected value | [Outcome*sensitivity] or [Outcome^sensitivity] | Indirect evidence: Herzallah et al. (2013), Treadway et al. (2012) | ||
| Effort costs | Suppression of responding by effort | [Outcome value − effort cost] | Hershenberg et al. (2016), Treadway et al. (2012), Yang et al. (2016), Yang et al. (2014) | No simple increase in effort costs: Clery-Melin et al. (2011), Sherdell, Waugh, & Gotlib (2012) | |
| Working memory/“model-based” learning | Rapid adaptation of behavior in response to feedback | Various approaches, e.g., control choice in terms of previous outcome (Myers et al., 2016) | N/A | N/A | Little direct examination in MDD |
| Uncertainty-modulated learning | Increases or decreases in learning rate in response to uncertainty | Modulation of learning rate (e.g., Equation 2) by stimulus/outcome uncertainty | N/A | N/A | Little direct examination in MDD (but see Browning et al., 2015, on anxiety) |
Note. Here we define indirect evidence as evidence suggesting that the construct might be significant, without the construct having been assessed directly via a modeling or other analytic strategy. To complete this table, combinations of the following terms were used in systematic searches: reward, model-based learning, Pavlovian, exploration, decision, choice, punishment learning, with anhedonia or major depression. The goal of the table is to provide an overview of salient exemplars of existing data from studies incorporating depressed, dysphoric, or euthymic individuals, which may be particularly relevant for the constructs listed.
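
Several of the computational instantiations in Table 1 can be illustrated together in a few lines of code. The sketch below is not the article's Equations 1 to 4, which are not reproduced in this section; it assumes a standard delta-rule value update and a softmax choice rule, and all function and parameter names (`update_value`, `lr_reward`, `lr_punish`, `sensitivity`, `effort_cost`, `temperature`) are hypothetical labels chosen for illustration.

```python
import math


def softmax(values, temperature):
    """Return choice probabilities for a list of option values.

    Assumed form: p_i is proportional to exp(value_i / temperature),
    so a higher temperature gives more stochastic choice.
    """
    highest = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - highest) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def update_value(value, outcome, lr_reward, lr_punish,
                 sensitivity=1.0, effort_cost=0.0):
    """Apply one delta-rule update with valence-specific learning rates.

    Assumptions (illustrative only):
    - outcome magnitude is scaled linearly (outcome * sensitivity);
    - effort is subtracted from the scaled outcome;
    - outcomes above zero use lr_reward, all others use lr_punish.
    """
    utility = outcome * sensitivity - effort_cost
    prediction_error = utility - value
    learning_rate = lr_reward if outcome > 0 else lr_punish
    return value + learning_rate * prediction_error


if __name__ == "__main__":
    # Reduced reward sensitivity combined with enhanced punishment sensitivity.
    after_win = update_value(0.0, 1.0, lr_reward=0.1, lr_punish=0.3)
    after_loss = update_value(0.0, -1.0, lr_reward=0.1, lr_punish=0.3)
    print(after_win, after_loss)
    # The same value difference drives choice less strongly at high temperature.
    print(softmax([1.0, 0.0], temperature=0.1))
    print(softmax([1.0, 0.0], temperature=10.0))
```

Under these assumptions, a lower `lr_reward` than `lr_punish` produces smaller value changes after wins than after losses, whereas `sensitivity` and `effort_cost` change the value toward which learning converges rather than its speed.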

Figure 2.
Moderation of relationship between the risk aversion parameter (see Equation 5) and risk preference by temperature. High temperatures are red; low temperatures are blue. A low score on the risk aversion parameter amplifies the utility of small wins, leading to risk aversion, but this is only clearly manifest in behavior if the temperature is low. Likewise, a high score reduces the utility of small wins, leading to risk seeking, but again, only if the temperature is low.
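
The moderation described in this caption can be reproduced numerically. The sketch below assumes a power utility function (amount raised to the risk aversion parameter) and a two-option softmax; Equation 5 is not reproduced in this section, so this functional form, along with the names `utility`, `p_risky`, and `rho`, is an assumption made for illustration rather than the article's exact specification.

```python
import math


def utility(amount, rho):
    """Assumed power utility: rho < 1 is concave, rho > 1 is convex."""
    return amount ** rho


def p_risky(sure_amount, risky_amount, p_win, rho, temperature):
    """Probability of choosing a gamble over a sure amount.

    The gamble pays risky_amount with probability p_win and nothing otherwise.
    A two-option softmax reduces to a logistic function of the utility difference.
    """
    eu_sure = utility(sure_amount, rho)
    eu_risky = p_win * utility(risky_amount, rho)
    return 1.0 / (1.0 + math.exp(-(eu_risky - eu_sure) / temperature))


if __name__ == "__main__":
    # A sure 5 versus a 50% chance of 10: equal expected value.
    for rho in (0.5, 1.0, 1.5):
        for temperature in (0.1, 10.0):
            p = p_risky(5.0, 10.0, 0.5, rho, temperature)
            print(f"rho={rho}, temperature={temperature}: P(risky)={p:.3f}")
```

With these illustrative numbers, a low `rho` pushes the probability of choosing the gamble well below 0.5 and a high `rho` pushes it well above 0.5, but only at the low temperature; at the high temperature both cases drift toward 0.5, which is the pattern the figure depicts.
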
Table 2.
Exploring reward processing in the striatum
| Study | Groups | Outcome magnitude | Probability (%) | Response contingent | Task length | Striatum differences | Reported null findings |
|---|---|---|---|---|---|---|---|
| Hagele et al. (2015) | AUD, SZ, MDD, BD (manic), ADHD, HC | ±€0.1, €0.6, €3 | 67 | Yes | 2 × 72 trials | Right VS: Increasing depression severity reduces reward anticipation vs. neutral | |
| Stoy et al. (2012) | MDD (before and during treatment), HC | €0.1, €0.6, €3 | 67 | Yes | 2 × 72 trials | VS: HC > MDD, reward and loss anticipation vs. neutral; partially recovers after treatment | |
| Knutson, Bhanji, Cooney, Atlas, & Gotlib (2008) | Unmedicated MDD, HC | ±$0.1, $0.2, $1, $5 | 67 | Yes (individually calibrated RT threshold) | 2 × 90 trials | Putamen: HC > MDD, reward outcome vs. neutral | VS: Reward anticipation |
| Admon et al. (2015) | MDD, HC | Variable: mean +$2.15, –$2 | 50 | No; instructed | 5 × 24 trials | Caudate: HC > MDD, reward and loss outcomes vs. neutral | |
| Wacker, Dillon, & Pizzagalli (2009) | Healthy individuals varying in anhedonic symptoms | Variable: mean +$2.15, –$2 | 50 | No; instructed | 5 × 24 trials | VS: Increasing anhedonia reduces reward outcome vs. neutral | VS: Reward anticipation |
| Pizzagalli et al. (2009) | MDD, HC | Variable: mean +$2.15, –$2 | 50 | No; instructed | 5 × 24 trials | Putamen: HC > MDD, reward anticipation vs. neutral; Caudate/VS: HC > MDD, reward outcome vs. neutral | VS: Reward anticipation |
| Smoski, Rittenberg, & Dichter (2011) | MDD, HC | Money (+$1), IAPS pictures | 67 | Yes | 2 × 2 × 40 trials | Putamen: Anticipation Group × Reward Type interaction | Widespread anticipation-related activation; little outcome-related activation |
| Arrondo et al. (2015) | MDD, SZ, HC | High (£1), low (£0.01) | 70 high win, 30 low win | No; instructed | 30 win, 30 neutral trials | VS: HC > MDD/SZ, reward anticipation; relationship of VS anticipation activation with anhedonia in SZ, not MDD | |
| Dichter, Kozink, McClernon, & Smoski (2012) | Remitted MDD, HC | +$1 for wins | 67 | Yes | 20 potential win, 20 neutral | Caudate: remitted MDD > HC, reward anticipation | |
| Mori et al. (2016) | Students with/without subthreshold depression | ±¥0, ¥20, ¥100, ¥500 | N.S. | N.S. | 40 gain, 40 loss, 10 neutral | Differences not within striatum | VS: Reward anticipation |
| Misaki, Suzuki, Savitz, Drevets, & Bodurka (2016) | MDD, HC | ±$0.2, $1 | 66 | Yes (individually calibrated RT threshold) | 15 high win, 15 low win, 15 neutral, 15 high loss, 15 low loss | Left VS: HC > MDD during high win anticipation | No differences seen at low reward anticipation in left VS or at low/high anticipation in right VS; no outcome-locked differences, but overall activations not strong |
| Ubl et al. (2015) | Remitted MDD, HC | High (±€2), low (±€0.2) wins and losses | 50 (approx.) | Yes (individually calibrated RT threshold) | N.S. | Differences not within striatum | |
| Stringaris et al. (2015) | Clinical, subthreshold depression, HC (adolescent) | 10, 2, 0 points | 66 (approx.) | Yes (individually calibrated RT threshold) | 66 trials | VS: HC > clinical/subthreshold depression, reward anticipation; reduced VS activation to reward anticipation also predicted transition to depression at 2-year follow-up and was related to symptoms of anhedonia. VS: Subthreshold depression > HC, positive outcomes; subthreshold depression and anhedonia > HC, negative outcomes | |
Note. Table summarizing the design and findings of studies of MDD or other depression-related cohorts that employed a reward-based version of the monetary incentive delay (MID) task. ADHD = attention-deficit hyperactivity disorder. AUD = alcohol use disorder. BD = bipolar disorder. N.S. = not stated. RT = reaction time. The contents of the table represent all the studies we were able to find using systematic searches for MID fMRI studies. A recent study by Admon et al. (2017) was not included, as it focused on a dopaminergic drug manipulation, but it also found significant group (control > MDD) differences in the VS coupled to outcomes in the placebo condition.
