
Figure 1
Task schematic. On each trial, subjects selected one of three different actions conditional on a presented stimulus, and were then presented with deterministic reward feedback (correct/incorrect). The number of stimuli (the set size) varied across blocks.

Figure 2
The reward-complexity trade-off. (A–E) Each panel shows the optimal reward-complexity curve (solid line) for a given set size, along with the empirical reward-complexity values (circles) for each subject (HC = healthy controls; SZ = schizophrenia patients). Policy complexity is measured in natural units of information (nats). A uniform distribution over actions corresponds to a policy complexity of 0. Note that because the optimal curve is constructed under the assumption of exact knowledge about the deterministic reward contingencies, the curve will always saturate at an accuracy of 1. (F) Policy complexity as a function of set size. Error bars show 95% confidence intervals.
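
As a minimal sketch of how the policy complexity plotted in this figure could be computed, the code below evaluates the mutual information between stimuli and actions, in nats, for a tabular policy. The function name `policy_complexity`, the matrix layout (rows are stimuli, columns are actions), and the default of a uniform stimulus distribution are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def policy_complexity(policy, state_probs=None):
    """Mutual information I(S; A) in nats for a tabular policy.

    policy: array of shape (n_states, n_actions); row s holds pi(a | s).
    state_probs: stimulus distribution P(s); assumed uniform if omitted.
    """
    policy = np.asarray(policy, dtype=float)
    n_states = policy.shape[0]
    if state_probs is None:
        state_probs = np.full(n_states, 1.0 / n_states)
    else:
        state_probs = np.asarray(state_probs, dtype=float)

    # Marginal action distribution P(a) = sum_s P(s) pi(a | s).
    marginal = state_probs @ policy

    # Use the convention 0 * log 0 = 0 by skipping zero-probability entries.
    mask = (policy > 0) & (marginal[None, :] > 0)
    ratio = np.ones_like(policy)
    np.divide(policy, marginal[None, :], out=ratio, where=mask)

    return float(np.sum(state_probs[:, None] * policy * np.log(ratio)))
```

With three stimuli and three actions, a policy that picks every action uniformly regardless of the stimulus has complexity 0, whereas a deterministic one-to-one mapping reaches the maximum of log(3) nats.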

Figure 3
Bias differs between healthy controls and schizophrenia patients. Bias is defined as the gap between the optimal and empirical reward-complexity curves. (A) Bias is larger for the schizophrenia group than for the healthy control group. Error bars show 95% confidence intervals. (B) Bias is negatively correlated with policy complexity. The correlation coefficient does not differ significantly between subject groups.

Figure 4
Polynomial regression modeling of empirical reward-complexity curves. (A–E) Regression coefficients do not differ between the healthy control and schizophrenia groups. Error bars show 95% confidence intervals. (F) The difference in the Bayesian information criterion (BIC) between the independent and joint regression models. Positive values favor the joint regression model.
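
The BIC comparison in panel F can be illustrated with the following sketch, which fits one polynomial to both groups pooled (the joint model) and separate polynomials to each group (the independent model), then returns the independent BIC minus the joint BIC so that positive values favor the joint model. The polynomial degree of 2, the Gaussian residual assumption with a shared noise variance, and the function names are assumptions made for illustration; the paper's exact regression specification may differ.

```python
import numpy as np

def gaussian_bic(y, y_hat, n_params):
    """BIC for a least-squares fit, assuming Gaussian residuals.

    Uses BIC = n * ln(RSS / n) + k * ln(n), dropping additive constants.
    """
    n = len(y)
    rss = float(np.sum((np.asarray(y) - np.asarray(y_hat)) ** 2))
    return n * np.log(rss / n) + n_params * np.log(n)

def bic_difference(x_hc, y_hc, x_sz, y_sz, degree=2):
    """Return BIC(independent) - BIC(joint); positive favors the joint model."""
    x_all = np.concatenate([x_hc, x_sz])
    y_all = np.concatenate([y_hc, y_sz])

    # Joint model: one set of polynomial coefficients shared by both groups.
    joint_coefs = np.polyfit(x_all, y_all, degree)
    bic_joint = gaussian_bic(y_all, np.polyval(joint_coefs, x_all), degree + 1)

    # Independent model: separate coefficients per group (twice the parameters).
    fit_hc = np.polyval(np.polyfit(x_hc, y_hc, degree), x_hc)
    fit_sz = np.polyval(np.polyfit(x_sz, y_sz, degree), x_sz)
    y_hat_indep = np.concatenate([fit_hc, fit_sz])
    bic_indep = gaussian_bic(y_all, y_hat_indep, 2 * (degree + 1))

    return bic_indep - bic_joint
```

When the two groups follow the same curve, the independent model gains little in fit but pays for the extra coefficients, so the difference is positive; when the groups follow clearly different curves, the difference turns negative.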

Figure 5
The reward-complexity trade-off for simulated cost-sensitive agents. (A–E) Each panel shows the optimal reward-complexity curve (solid line) for a given set size, along with the simulated reward-complexity values (circles) for each subject (HC = healthy controls; SZ = schizophrenia patients). (F) Policy complexity as a function of set size. Error bars show 95% confidence intervals.

Figure 6
Bias differs between simulated healthy controls and schizophrenia patients. (A) Bias is larger for the simulated schizophrenia group than for the simulated healthy control group. Error bars show 95% confidence intervals. (B) Bias is negatively correlated with policy complexity. The correlation coefficient does not differ significantly between subject groups.

Figure 7
Relationship between bias and actor-critic parameters. (A) Inverse temperature. (B) Actor learning rate. (C) Marginal action probability learning rate. HC = healthy controls; SZ = schizophrenia patients. Lines show least-squares fits. The actor learning rate was lower and the inverse temperature was higher in HC than in SZ. Both the actor and marginal action probability learning rates were positively associated with bias.

Figure 8
Empirical and theoretical learning curves. (A) Healthy controls. (B) Schizophrenia patients. (C) Simulated healthy controls. (D) Simulated schizophrenia patients.

Figure 9
Optimal inverse temperature for a given capacity. Circles show the average empirical policy complexity for each set size.
