Anhedonic Traits Do Not Impair Performance in a 3-Arm Bandit Task

Open Access | Apr 2026

Abstract

Anhedonia, a transdiagnostic symptom marked by diminished reward sensitivity, is often linked to impairments in reinforcement learning (RL). Standard tasks (e.g., the 4-arm bandit) can place substantial demands on participants and may blur valuation with other processes. We therefore adapted a three-arm bandit (3AB) task from Seymour et al. (2012), incorporating design features intended to lessen task demands (fewer options; denser feedback) while enabling separate estimation of reward and punishment learning rates and sensitivities. In an online sample pre-screened for anhedonia (N = 206; 111 anhedonic, 95 non-anhedonic), hierarchical Bayesian modelling using a four-parameter specification showed no credible group differences in reward learning rate, punishment learning rate, reward sensitivity, or punishment sensitivity; Bayes factors favoured the null (BF01 = 3.36–5.96). Model-agnostic win-stay/lose-shift strategies likewise showed no group differences (Welch’s tests, all p > .05). Posterior predictive checks indicated above-chance choice prediction: the model’s highest-probability action matched participants’ actual choices on 59.6% of trials (chance = 33%). Parameter recovery was excellent for valuation parameters (r = 0.96–0.97) and acceptable for learning rates (r = 0.67–0.85). Simulations generated from fitted parameters preserved individual-difference structure, with high correlations between observed and simulated win-stay (r = 0.89 anhedonic; 0.86 non-anhedonic) and moderate correlations for lose-shift (r = 0.62; 0.67), alongside small systematic mean-level biases (simulated win-stay lower by 3.5–4.9 percentage points; simulated lose-shift higher by 12.8–13.2 points). 
Model comparison showed that lapse-augmented variants achieved marginally better predictive fit, but group comparisons under both lapse models yielded overlapping posteriors with 95% HDIs including zero for all learning, sensitivity, and lapse parameters, indicating that the null findings were robust to inclusion of lapse terms. Non-anhedonic participants also responded more slowly on average than anhedonic participants, which we treat as exploratory. Together, these results suggest that in this 3AB task, anhedonia is not reliably associated with differences in core RL parameters or simple choice strategies, while providing a detailed characterisation of model performance and limitations in an online setting.
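
To make the modelling terms in the abstract concrete, the following is a minimal Python sketch of a four-parameter learner with separate reward and punishment learning rates and sensitivities, plus a model-agnostic win-stay/lose-shift summary. It is an illustration under stated assumptions, not the authors' implementation: the update rule, the softmax over net value, the binary outcomes, the definition of a "win" as a rewarded trial and a "loss" as a punished trial, and all function and parameter names (`simulate_agent`, `win_stay_lose_shift`, `alpha_rew`, `sens_pun`, and so on) are hypothetical choices made here.

```python
import numpy as np


def simulate_agent(n_trials, p_reward, p_punish,
                   alpha_rew, alpha_pun, sens_rew, sens_pun, seed=0):
    """Simulate choices from a four-parameter learner (illustrative form only).

    Assumption: each arm can deliver a binary reward and a binary punishment
    independently, with probabilities p_reward[arm] and p_punish[arm].
    """
    rng = np.random.default_rng(seed)
    n_arms = len(p_reward)
    q_rew = np.zeros(n_arms)  # learned reward value per arm
    q_pun = np.zeros(n_arms)  # learned punishment value per arm
    choices = np.empty(n_trials, dtype=int)
    rewards = np.empty(n_trials, dtype=int)
    punishments = np.empty(n_trials, dtype=int)

    for t in range(n_trials):
        # Softmax over net value; subtracting the max keeps exp() stable.
        net = q_rew - q_pun
        expv = np.exp(net - net.max())
        probs = expv / expv.sum()
        c = rng.choice(n_arms, p=probs)

        r = int(rng.random() < p_reward[c])
        p = int(rng.random() < p_punish[c])

        # Separate delta-rule updates; sensitivities scale the outcomes.
        q_rew[c] += alpha_rew * (sens_rew * r - q_rew[c])
        q_pun[c] += alpha_pun * (sens_pun * p - q_pun[c])

        choices[t], rewards[t], punishments[t] = c, r, p

    return choices, rewards, punishments


def win_stay_lose_shift(choices, wins, losses):
    """Return (P(stay | previous win), P(shift | previous loss)).

    Assumption: wins and losses are per-trial 0/1 indicators.
    """
    choices = np.asarray(choices)
    stay = choices[1:] == choices[:-1]           # repeated the previous choice?
    win = np.asarray(wins, dtype=bool)[:-1]      # outcome on the previous trial
    loss = np.asarray(losses, dtype=bool)[:-1]
    win_stay = stay[win].mean() if win.any() else np.nan
    lose_shift = (~stay[loss]).mean() if loss.any() else np.nan
    return win_stay, lose_shift


# Usage with arbitrary, made-up task and agent settings.
choices, rewards, punishments = simulate_agent(
    n_trials=200,
    p_reward=[0.7, 0.5, 0.3],
    p_punish=[0.3, 0.5, 0.7],
    alpha_rew=0.3, alpha_pun=0.3, sens_rew=3.0, sens_pun=3.0,
)
ws, ls = win_stay_lose_shift(choices, rewards, punishments)
```

Under this sketch, comparing observed win-stay and lose-shift rates with rates computed from data simulated at each participant's fitted parameters corresponds to the kind of observed-versus-simulated check the abstract reports.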

DOI: https://doi.org/10.5334/cpsy.135 | Journal eISSN: 2379-6227
Language: English
Submitted on: Feb 4, 2025
Accepted on: Mar 10, 2026
Published on: Apr 13, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Arjun Ramaswamy, Yumeya Yamamori, Umesh Vivekananda, Vladimir Litvak, Jonathan P. Roiser, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.