
Figure 1.
Probabilistic learning task. During training, each stimulus pair is presented separately. Participants select one of the two stimuli and gradually integrate “correct” and “incorrect” feedback (each stimulus has its own probability of being correct) to maximize their accuracy. The EEG activity reported here was taken following this feedback. During the testing phase, each stimulus is paired with every other stimulus, and participants must choose the better one without the aid of feedback. Measures of reward and punishment learning are taken from the test phase and are hypothesized to reflect the operations of a slow, probabilistic integrative system during training. Note that the letters and percentages are not presented to the participant, nor are the green boxes surrounding the choice.
Table 1.
Control (CTL) and depression (DEP) participant demographics, symptom scores, task performance, and best-fitting model parameters
| | CTL | DEP | Statistic and p value |
|---|---|---|---|
| Sex | | | χ² = 5.08, p = 0.02 |
| Male | 35 | 12 | |
| Female | 40 | 34 | |
| Age (years) | 18.97 (1.22) | 18.74 (1.14) | t = 1.05, p = 0.30 |
| BDI | 1.73 (1.65) | 22.22 (4.90) | t = 33.32, p < 0.01 |
| TAI | 31.05 (5.49) | 55.76 (7.08) | t = 21.49, p < 0.01 |
| No. trials | 221 (115) | 256 (110) | t = 1.61, p = 0.11 |
| Train accuracy (%) | 66 (9) | 65 (9) | t = 0.52, p = 0.60 |
| Test accuracy (%) | 65 (11) | 66 (12) | t = 0.34, p = 0.74 |
| A > B accuracy (%) | 78 (28) | 78 (26) | t = 0.01, p = 0.99 |
| Go accuracy (%) | 64 (23) | 63 (23) | t = 0.06, p = 0.95 |
| NoGo accuracy (%) | 63 (23) | 64 (24) | t = 0.37, p = 0.71 |
| Gain learning rate | 0.26 (0.42) | 0.11 (0.32) | z = 2.67, p = 0.008 |
| Loss learning rate | 0.06 (0.26) | 0.01 (0.11) | z = 1.74, p = 0.08 |
| Softmax gain | 3.72 (4.38) | 5.12 (8.76) | z = −1.58, p = 0.11 |
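
The gain learning rate, loss learning rate, and softmax gain in Table 1 are parameters of a reinforcement learning model whose equations are not given in this section. The sketch below shows one common form such a model can take, and it is an assumption rather than the authors' fitted model: a Q-learning update with separate learning rates for positive and negative prediction errors, and a two-option softmax choice rule. The function names, the initial value of 0.5, the 0/1 feedback coding, and the 80%/20% reward probabilities are all illustrative choices.

```python
import math
import random


def softmax_choice_prob(q_a, q_b, gain):
    """Probability of choosing A over B under a two-option softmax (assumed form)."""
    return 1.0 / (1.0 + math.exp(-gain * (q_a - q_b)))


def update_q(q, reward, alpha_gain, alpha_loss):
    """One Q-learning step; the learning rate depends on the sign of the prediction error."""
    pe = reward - q  # prediction error: outcome minus expectation
    alpha = alpha_gain if pe >= 0 else alpha_loss
    return q + alpha * pe, pe


def simulate_pair(p_a, p_b, alpha_gain, alpha_loss, gain, n_trials, seed=0):
    """Simulate training on one stimulus pair with probabilistic feedback.

    p_a and p_b are the (hypothetical) probabilities that choosing A or B
    yields "correct" feedback, coded here as reward 1 versus 0.
    """
    rng = random.Random(seed)
    q = {"A": 0.5, "B": 0.5}  # assumed neutral starting values
    p = {"A": p_a, "B": p_b}
    n_chose_a = 0
    for _ in range(n_trials):
        prob_a = softmax_choice_prob(q["A"], q["B"], gain)
        choice = "A" if rng.random() < prob_a else "B"
        reward = 1.0 if rng.random() < p[choice] else 0.0
        q[choice], _ = update_q(q[choice], reward, alpha_gain, alpha_loss)
        if choice == "A":
            n_chose_a += 1
    return q, n_chose_a / n_trials


if __name__ == "__main__":
    # CTL group mean parameters from Table 1; reward probabilities are illustrative.
    q_values, frac_a = simulate_pair(0.8, 0.2, 0.26, 0.06, 3.72, n_trials=200)
    print(q_values, frac_a)
```

With a larger gain learning rate than loss learning rate, as in both group means in Table 1, values in this sketch rise after "correct" feedback faster than they fall after "incorrect" feedback.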

Figure 2.
Within the high depressive symptomatology group, anxiety predicted the bias to learn from punishment. A) Anxiety correlated with the bias to learn more from NoGo than Go conditions; this effect was specific to the NoGo condition. B) Anxiety was related to nominally faster reaction times (RTs) for NoGo versus Go conditions, again entirely due to faster RTs on NoGo but not Go conditions. Together, these findings suggest that greater levels of anxiety were specifically related to a punishment learning bias.

Figure 3.
Event-related potentials (ERPs) time locked to incorrect (red) and correct (green) feedback at electrode FCz. A) Punishers (incorrect feedback) led to an N2–P3 complex, identified with vertical dashed lines. The P3–N2 difference did not differ between groups, and contrary to predictions, it was not strongly correlated with anxiety (all scatterplots show relationships within the DEP group). B) Rewards (correct feedback) led to a reward positivity (Rew-P), identified with vertical dashed lines. The Rew-P did not differ between groups, but depression predicted a strong diminishment in amplitude; this relationship was differentiated from all other symptom–ERP correlations (magenta arrows). C) Correlation between single-trial absolute negative prediction error (|−PE|) and EEG activity, yielding a strong relationship in the same temporal domain as the P3–N2 complex. The average correlation between the P3 and N2 time points was significantly related to anxiety; this relationship was differentiated from all other correlations between symptomatology and PE–EEG coupling (magenta arrows). D) Correlation between single-trial positive prediction error (+PE) and EEG activity, yielding a strong relationship in the same temporal domain as the Rew-P. There were no relationships between depression and PE–EEG coupling. Horizontal cyan bars in C and D mark time points where coupling differed significantly from zero (t tests, uncorrected), demonstrating that PE–EEG coupling is robust in these previously identified component-specific time windows. All topoplots are raw values from the identified time windows. *p < 0.05. **p < 0.01.
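
The PE–EEG coupling in panels C and D can be illustrated as a per-time-point correlation between single-trial prediction errors and single-trial EEG amplitude. The following is a minimal sketch under stated assumptions: it uses a Pearson correlation, which this caption does not specify, and the function names and data layout (one list of amplitudes per trial) are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pe_eeg_coupling(pe, eeg):
    """Correlate single-trial prediction error with EEG amplitude at each time point.

    pe:  one prediction error per trial, e.g. |-PE| on incorrect-feedback trials.
    eeg: one list per trial, each holding the amplitude at every time point
         (assumed layout; a single electrode such as FCz).
    Returns one correlation coefficient per time point for one participant.
    """
    n_times = len(eeg[0])
    return [pearson_r(pe, [trial[t] for trial in eeg]) for t in range(n_times)]


if __name__ == "__main__":
    # Toy data: amplitude at time point 0 scales with PE, time point 1 inversely.
    pe = [0.1, 0.4, 0.2, 0.8]
    eeg = [[2 * v + 1, -v] for v in pe]
    print(pe_eeg_coupling(pe, eeg))
```

Computing this time course for each participant and then testing the coefficients against zero across participants at each time point would give the kind of uncorrected t test outcome marked by the cyan bars.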

Figure 4.
Time–frequency plots for punishment and reward at electrode FCz. A–B) Control (CTL) group: punishment led to theta-band activity and reward led to delta-band activity. Time–frequency regions of interest (tf-ROIs) are outlined in cyan boxes. C) DEP group: correlation between punishment power and anxiety. Contours show significant correlations in the same region as the tf-ROI. D) DEP group: correlation between reward power and depression. Contours show significant correlations in a slightly earlier region that partially overlaps with the tf-ROI. E) DEP group: correlation between punishment theta power in the tf-ROI and learning bias (Go–NoGo). This pattern was due to a link between punishment theta and NoGo learning (red), not Go learning (green).

Figure 5.
Source estimation of prediction errors and the influence of symptomatology. A) CTL group: the N2–P3 complex was specifically highlighted by the pairwise contrast of high versus low prediction errors; differences in this time window were localized to dorsal cingulate and pre-SMA. In the DEP group, anxiety was correlated with activity in these same areas. B) CTL group: the Rew-P was specifically highlighted by the pairwise contrast of high versus low prediction errors; differences in this time window were localized to orbitofrontal areas. In the DEP group, depression was correlated with activity in this same area. C–D) DEP group: ERPs across the midline from individuals high versus low in depressive symptomatology; the FPz lead (red) displays the largest morphological difference. E) DEP group: scalp maps and source estimations from consecutive 50 ms time windows, showing significant correlations with depressive symptomatology over the time course of the Rew-P. Although the topographic relationships move posteriorly, the major source-estimated influence remains a diminishment in orbitofrontal activity following reward.
