References
1. Akam, T., Costa, R., & Dayan, P. (2015). Simple plans or sophisticated habits? State, transition and learning interactions in the two-step task. PLoS Computational Biology, 11(12), e1004648. 10.1371/journal.pcbi.1004648
2. Ben-Artzi, I., Luria, R., & Shahar, N. (2022). Working memory capacity estimates moderate value learning for outcome-irrelevant features. Scientific Reports, 12, 19677. 10.1038/s41598-022-21832-x
3. Church, B. A., Jackson, B. N., & Smith, J. D. (2021). Exploring explicit learning strategies: A dissociative framework for research. New Ideas in Psychology, 60, 100817. 10.1016/j.newideapsych.2020.100817
4. Daw, N. D., Gershman, S. J., Seymour, B., et al. (2011). Model-based influences on humans' choices and striatal prediction errors. Neuron, 69(6), 1204–1215. 10.1016/j.neuron.2011.02.027
5. Howard, R. A. (1960). Dynamic programming and Markov processes. The MIT Press.
6. Jocham, G., Brodersen, K. H., Constantinescu, A. O., et al. (2016). Reward-guided learning with and without causal attribution. Neuron, 90(1), 177–190. 10.1016/j.neuron.2016.02.018
7. Lehmann, M. P., Xu, H. A., Liakoni, V., et al. (2019). One-shot learning and behavioral eligibility traces in sequential decision making. eLife, 8, e47463. 10.7554/eLife.47463
8. Minsky, M. (1961). Steps toward artificial intelligence. Proceedings of the IRE, 49(1), 8–30. 10.1109/JRPROC.1961.287775
9. Miranda, B., Nishantha Malalasekera, W. M., Behrens, T. E., et al. (2020). Combined model-free and model-sensitive reinforcement learning in non-human primates. PLoS Computational Biology, 16(6), e1007944. 10.1371/journal.pcbi.1007944
10. Sato, Y., Sakai, Y., & Hirata, S. (2023). State-transition-free reinforcement learning in chimpanzees (Pan troglodytes). Learning & Behavior, 51(4), 413–427. 10.3758/s13420-023-00591-3
11. Shahar, N., Hauser, T. U., Moran, R., et al. (2021). Assigning the right credit to the wrong action: Compulsivity in the general population is associated with augmented outcome-irrelevant value-based learning. Translational Psychiatry, 11, 564. 10.1038/s41398-021-01642-x
12. Shahar, N., Moran, R., Hauser, T. U., et al. (2019). Credit assignment to state-independent task representations and its relationship with model-based decision making. Proceedings of the National Academy of Sciences, 116(32), 15871–15876. 10.1073/pnas.1821647116
13. Singh, S. P., & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine Learning, 22, 123–158. 10.1007/BF00114726
14. Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. Appleton-Century.
15. Smith, J. D., Jamani, S., Boomer, J., & Church, B. A. (2018). One-back reinforcement dissociates implicit-procedural and explicit-declarative category learning. Memory & Cognition, 46(2), 261–273. 10.3758/s13421-017-0762-8
16. Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). The MIT Press.
17. Tanaka, S. C., Shishida, K., Schweighofer, N., et al. (2009). Serotonin affects association of aversive outcomes to past actions. Journal of Neuroscience, 29(50), 15669–15674. 10.1523/JNEUROSCI.2799-09.2009
18. Walsh, M. M., & Anderson, J. R. (2011). Learning from delayed feedback: Neural responses in temporal credit assignment. Cognitive, Affective, & Behavioral Neuroscience, 11(2), 131–143. 10.3758/s13415-011-0027-0
19. Zentall, T. R., Mueller, P. M., & Peng, D. N. (2023). 1-Back reinforcement symbolic-matching by humans: How do they learn it? Learning & Behavior, 51(3), 274–280. 10.3758/s13420-022-00558-w
