Improving the Reliability of the Pavlovian Go/No-Go Task for Computational Psychiatry Research

Open Access | Dec 2025

Figures & Tables

Figure 1

(A) Schematic of the Pavlovian go/no-go task. On each trial, a robot entered the ‘scanner’ from the left of the screen, prompting a response (go or no-go) from the participant during a response window (Experiment 1: 1.5 seconds; Experiment 2: 1.3 seconds). The outcome (number of points won or lost) was then presented on the scanner display (Experiment 1: 1.0 seconds; Experiment 2: 1.2 seconds), followed by an inter-trial interval animation (1 second) in which the conveyor belt carried the old robot out of view and a new robot into the scanner. The color of the scanner light denoted the outcome domain (e.g., blue denoting reward and red denoting punishment). (B) The four trial types, produced by a factorial combination of outcome domain (rewarding, punishing) and correct action (go, no-go). (C) Outcome probabilities for each outcome domain following a correct or incorrect response. Correct responses yielded the better of the two possible outcomes with 80% probability. (D) Trial composition. In Experiment 1, participants saw 8 robots in total (2 of each trial type), each presented for 30 trials (240 trials in total). In Experiment 2, participants saw 24 robots in total (6 of each trial type), each presented for 8, 10, or 12 trials (240 trials in total).
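
The trial composition in panel (D) and the outcome rule in panel (C) can be checked with a short sketch. It is illustrative only and rests on assumptions not stated in the caption: the function names are hypothetical, the 8-, 10-, and 12-trial robots in Experiment 2 are assumed to be split evenly within each trial type, incorrect responses are assumed to yield the better outcome with the complementary probability (20%), and presentation order is ignored.

```python
import random

# Trial-type labels as used in the figure captions:
# GW = go to win, NGW = no-go to win, GAL = go to avoid losing, NGAL = no-go to avoid losing
TRIAL_TYPES = ["GW", "NGW", "GAL", "NGAL"]


def build_schedule(experiment):
    """Return one (robot_id, trial_type) pair per trial.

    Trials are listed robot by robot; the actual presentation order
    is not described in the caption and is not modeled here.
    """
    if experiment == 1:
        robots_per_type, lengths = 2, [30]
    else:
        # Assumption: the three robot lengths are used equally often per type.
        robots_per_type, lengths = 6, [8, 10, 12]

    schedule = []
    robot_id = 0
    for trial_type in TRIAL_TYPES:
        for k in range(robots_per_type):
            n_trials = lengths[k % len(lengths)]
            schedule.extend([(robot_id, trial_type)] * n_trials)
            robot_id += 1
    return schedule


def better_outcome_received(correct, rng, p_good=0.8):
    """Return True if the trial yields the better of the two possible outcomes."""
    # Assumption: errors yield the better outcome with probability 1 - p_good.
    p = p_good if correct else 1.0 - p_good
    return rng.random() < p


if __name__ == "__main__":
    for experiment in (1, 2):
        schedule = build_schedule(experiment)
        n_robots = len({robot for robot, _ in schedule})
        print(f"Experiment {experiment}: {n_robots} robots, {len(schedule)} trials")
```

Under these assumptions both designs come to 240 trials, with 60 trials per trial type.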

Figure 2

Large practice effects on the standard Pavlovian go/no-go task in Experiment 1. (A) Group-averaged learning curves for each trial type and session. Shaded regions indicate 95% bootstrapped confidence intervals. (B) Group-averaged performance for each session. Performance measures from left to right: Correct responses, or overall accuracy; Go bias, or difference in accuracy between Go and No-Go trials; Congruence effect, or difference in accuracy between congruent (GW, NGAL) and incongruent (NGW, GAL) trials; and Feedback sensitivity, or the difference in accuracy on trials following veridical and sham feedback. Behavior on the first session was significantly different from all other sessions on all measures. ** Denotes significant pairwise difference (p<0.05, corrected for multiple comparisons). (C) Distribution of correct responses across sessions by trial type. Percentage of participants, for each session and trial type, exhibiting at- or below-chance performance (<60% response accuracy; grey), intermediate performance (≥60% and <90% response accuracy; light blue), or near-perfect performance (≥90% response accuracy; dark blue). Across sessions, performance improved on all trial types that were not already close to ceiling on the first session.
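
The four performance measures in panel (B) are simple differences of mean accuracy, which a minimal sketch can make concrete. The function name, argument layout, and the "veridical"/"sham"/"none" coding of preceding feedback are assumptions made for illustration; the caption does not say how trials without preceding feedback were handled, so they are simply left out of the feedback-sensitivity term here.

```python
import numpy as np


def performance_measures(trial_type, correct, prev_feedback):
    """Compute the four summary measures for one participant and session.

    trial_type    : sequence of "GW", "NGW", "GAL", "NGAL"
    correct       : sequence of 0/1 flags, 1 = correct response
    prev_feedback : sequence of "veridical", "sham", or "none"
                    (hypothetical coding of the feedback that preceded each trial)
    """
    trial_type = np.asarray(trial_type)
    correct = np.asarray(correct, dtype=float)
    prev_feedback = np.asarray(prev_feedback)

    is_go = np.isin(trial_type, ["GW", "GAL"])          # Go is the correct action
    is_congruent = np.isin(trial_type, ["GW", "NGAL"])  # Pavlovian-congruent trials
    after_veridical = prev_feedback == "veridical"
    after_sham = prev_feedback == "sham"

    return {
        "accuracy": correct.mean(),
        "go_bias": correct[is_go].mean() - correct[~is_go].mean(),
        "congruence_effect": correct[is_congruent].mean() - correct[~is_congruent].mean(),
        "feedback_sensitivity": correct[after_veridical].mean() - correct[after_sham].mean(),
    }


if __name__ == "__main__":
    measures = performance_measures(
        ["GW", "GW", "NGW", "NGW", "GAL", "GAL", "NGAL", "NGAL"],
        [1, 1, 0, 1, 1, 0, 1, 1],
        ["none", "veridical", "none", "sham", "none", "veridical", "none", "veridical"],
    )
    for name, value in measures.items():
        print(f"{name}: {value:.3f}")
```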

Table 1

Model comparison collapsing across sessions. Accuracy = trial-level choice prediction accuracy between observed and model-predicted Go responses. PSIS-LOO = approximate leave-one-out cross-validation scores presented in deviance scale (smaller numbers indicate better fit). ΔPSIS-LOO = difference in PSIS-LOO values between each model and the best-fitting model (M7).

Model  Parameters                    Accuracy  PSIS-LOO    ΔPSIS-LOO (se)
M1     β, η                          87.5%     –151457.9   –5602.6 (68.3)
M2     β, τ, η                       89.0%     –154011.9   –3048.6 (51.2)
M3     β, τ+, τ−, η                  89.8%     –155817.8   –1242.7 (31.3)
M4     β+, β−, τ+, τ−, η             89.8%     –156261.6   –798.8 (22.6)
M5     β, τ+, τ−, η+, η−             89.9%     –156265.9   –794.6 (20.7)
M6     β+, β−, τ+, τ−, η+, η−        89.9%     –156401.8   –658.6 (18.8)
M7     β+, β−, τ+, τ−, η+, η−, ξ     90.1%     –157060.5

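
As a worked check on Table 1, the ΔPSIS-LOO column can be recomputed from the PSIS-LOO column. The sketch below assumes that the leading dashes in the table are minus signs and that each difference is taken as the best model's score minus the given model's score; that sign convention is inferred from the tabulated values rather than stated in the caption. The standard errors cannot be recovered this way because they depend on pointwise values not shown in the table.

```python
# PSIS-LOO scores (deviance scale) copied from Table 1.
psis_loo = {
    "M1": -151457.9,
    "M2": -154011.9,
    "M3": -155817.8,
    "M4": -156261.6,
    "M5": -156265.9,
    "M6": -156401.8,
    "M7": -157060.5,
}

# ΔPSIS-LOO values as reported in Table 1 (standard errors omitted).
reported_delta = {
    "M1": -5602.6,
    "M2": -3048.6,
    "M3": -1242.7,
    "M4": -798.8,
    "M5": -794.6,
    "M6": -658.6,
}

# On the deviance scale, the smallest score marks the best-fitting model.
best = min(psis_loo, key=psis_loo.get)

# Assumed convention: difference = best model's score minus each model's score.
delta = {
    model: round(psis_loo[best] - score, 1)
    for model, score in psis_loo.items()
    if model != best
}

for model in sorted(delta):
    gap = abs(delta[model] - reported_delta[model])
    print(f"{model}: recomputed {delta[model]:.1f}, reported {reported_delta[model]:.1f}, gap {gap:.1f}")
```

The recomputed differences match the reported ones to within 0.1, which is consistent with the published values having been rounded after subtraction.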
Figure 3

Reinforcement learning model parameters in Experiment 1 show evidence of practice effects and low reliability. (A) Group-level model parameters for each session. Error bars indicate 95% Bayesian confidence intervals (CIs). ** Denotes pairwise comparison where 95% CI of the difference excludes zero. (B) Test-retest reliability estimates for each model parameter. Dotted lines indicate average across pairs of sessions. Shaded region indicates conventional range of acceptable reliability (ρ ≥ 0.7). (C) Test-retest reliability estimates for each model parameter using ICC. Dotted lines indicate average across the three sessions. Shaded region indicates conventional range of good reliability (rICC ≥ 0.6).
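
The "average across pairs of sessions" summary in panel (B) can be illustrated with a small sketch. It is not the authors' estimator: the caption does not say how ρ was obtained, so the example assumes a Pearson correlation between per-participant point estimates for each pair of sessions, and the function name and the simulated data are hypothetical.

```python
import itertools

import numpy as np


def mean_pairwise_reliability(estimates):
    """Correlate one parameter's estimates across every pair of sessions.

    estimates : array of shape (n_participants, n_sessions)
    Returns a dict of per-pair correlations and their mean.
    """
    estimates = np.asarray(estimates, dtype=float)
    n_sessions = estimates.shape[1]
    pairs = list(itertools.combinations(range(n_sessions), 2))
    # Assumption: Pearson correlation on point estimates.
    correlations = [
        float(np.corrcoef(estimates[:, i], estimates[:, j])[0, 1]) for i, j in pairs
    ]
    return dict(zip(pairs, correlations)), float(np.mean(correlations))


if __name__ == "__main__":
    # Simulated data: a stable trait plus independent session noise.
    rng = np.random.default_rng(0)
    trait = rng.normal(size=500)
    sessions = trait[:, None] + 0.5 * rng.normal(size=(500, 3))

    per_pair, average = mean_pairwise_reliability(sessions)
    for pair, r in per_pair.items():
        print(f"sessions {pair}: r = {r:.2f}")
    print(f"average reliability: {average:.2f}")
```

With three sessions there are three session pairs, and the average of their correlations corresponds to the dotted line described in the caption.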

Figure 4

Smaller or no practice effects on the modified Pavlovian go/no-go task in Experiment 2. (A) Group-averaged learning curves for each trial type and session. Shaded regions indicate 95% bootstrapped confidence intervals. (B) Group-averaged performance for each session. Performance indices from left to right: Correct responses, or overall accuracy; Go bias, or difference in accuracy between Go and No-Go trials; Congruency effect, or difference in accuracy between Pavlovian congruent (GW, NGAL) and incongruent (NGW, GAL) trials; and Feedback sensitivity, or the difference in accuracy on trials following veridical and sham feedback. ** Denotes significant pairwise difference (p<0.05, corrected for multiple comparisons). (C) The percentage of participants, for each session and trial type, exhibiting at- or below-chance performance (<60% response accuracy; grey), intermediate performance (≥60% and <90% response accuracy; light blue), or near-perfect performance (≥90% response accuracy; dark blue).

Table 2

Model comparison collapsing across sessions. Accuracy = trial-level choice prediction accuracy between observed and model-predicted Go responses. PSIS-LOO = approximate leave-one-out cross-validation presented in deviance scale (smaller numbers indicate better fit). ΔPSIS-LOO = difference in PSIS-LOO values between each model and the best-fitting model (M7).

Model  Parameters                    Accuracy  PSIS-LOO    ΔPSIS-LOO (se)
M1     β, η                          72.9%     –95806.3    –6205.2 (73.2)
M2     β, τ, η                       76.5%     –99616.0    –2395.5 (48.9)
M3     β, τ+, τ−, η                  77.6%     –101283.0   –728.5 (28.2)
M4     β+, β−, τ+, τ−, η             77.5%     –101422.4   –589.0 (21.1)
M5     β, τ+, τ−, η+, η−             77.7%     –101519.0   –492.4 (19.1)
M6     β+, β−, τ+, τ−, η+, η−        77.8%     –101548.7   –462.7 (17.2)
M7     β+, β−, τ+, τ−, η+, η−, ξ     78.1%     –102011.4

Figure 5

Reinforcement learning model parameters in Experiment 2 show improved stability and reliability. (A) Group-level model parameters for each session. Error bars indicate 95% Bayesian confidence intervals (CIs). ** Denotes pairwise comparison where 95% CI of the difference excludes zero. (B) Test-retest reliability estimates for each model parameter. Filled circles denote estimates for Experiment 2; open circles denote estimates from Experiment 1, for comparison. Grey vertical lines show the change in reliability across experiments. Dotted lines indicate average reliability for Experiment 2. Shaded region indicates conventional range of acceptable reliability (ρ ≥ 0.7). (C) Test-retest reliability estimates for each model parameter using ICC. Dotted lines indicate average across pairs of sessions. Shaded region indicates conventional range of good reliability (rICC ≥ 0.6).

DOI: https://doi.org/10.5334/cpsy.127 | Journal eISSN: 2379-6227
Language: English
Submitted on: Sep 2, 2024
Accepted on: Nov 4, 2025
Published on: Dec 18, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Samuel Zorowitz, Gili Karni, Natalie Paredes, Nathaniel Daw, Yael Niv, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.