Leveraging large language models to estimate clinically relevant psychological constructs in psychotherapy transcripts

Open Access | Sep 2025

References

  1. Abdou, M., Kulmizev, A., Hershcovich, D., Frank, S., Pavlick, E., & Søgaard, A. (2021). Can language models encode perceptual structure without grounding? A case study in color. arXiv preprint arXiv:2109.06129. 10.18653/v1/2021.conll-1.9
  2. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al. (2023). Gpt-4 technical report. arXiv preprint arXiv:2303.08774.
  3. AI@Meta (2024). Llama 3 model card.
  4. Allport, G. W. (1942). The use of personal documents in psychological science. Social Science Research Council Bulletin.
  5. Allport, G. W., & Vernon, P. E. (1930). The field of personality. Psychological bulletin, 27(10), 677. 10.1037/h0072589
  6. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. 10.18637/jss.v067.i01
  7. Beck, J. S. (2020). Cognitive behavior therapy: Basics and beyond. Guilford Publications.
  8. Ben-Shachar, M. S., Lüdecke, D., & Makowski, D. (2020). effectsize: Estimation of effect size indices and standardized parameters. Journal of Open Source Software, 5(56), 2815. 10.21105/joss.02815
  9. Blanco-Cuaresma, S. (2024). Psychological assessments with large language models: A privacy-focused and cost-effective approach. arXiv preprint arXiv:2402.03435.
  10. Bolger, N. (2013). Intensive longitudinal methods: An introduction to diary and experience sampling research. The Guilford Press.
  11. Brans, K., Van Mechelen, I., Rimé, B., & Verduyn, P. (2014). To share, or not to share? Examining the emotional consequences of social sharing in the case of anger and sadness. Emotion, 14(6), 1062. 10.1037/a0037604
  12. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33, 1877–1901.
  13. Burkhardt, H., Pullmann, M., Hull, T., Areán, P., & Cohen, T. (2022). Comparing emotion feature extraction approaches for predicting depression and anxiety. In Proceedings of the eighth workshop on computational linguistics and clinical psychology (pp. 105–115). 10.18653/v1/2022.clpsych-1.9
  14. Bürkner, P.-C. (2017). brms: An r package for bayesian multilevel models using stan. Journal of statistical software, 80, 1–28. 10.18637/jss.v080.i01
  15. Chim, J., Tsakalidis, A., Gkoumas, D., Atzil-Slonim, D., Ophir, Y., Zirikly, A., Resnik, P., & Liakata, M. (2024a). Overview of the clpsych 2024 shared task: Leveraging large language models to identify evidence of suicidality risk in online posts. In Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024) (pp. 177–190).
  16. Chim, J., Tsakalidis, A., Gkoumas, D., Atzil-Slonim, D., Ophir, Y., Zirikly, A., Resnik, P., & Liakata, M. (2024b). Overview of the CLPsych 2024 shared task: Leveraging large language models to identify evidence of suicidality risk in online posts. In A. Yates, B. Desmet, E. Prud’hommeaux, A. Zirikly, S. Bedrick, S. MacAvaney, K. Bar, M. Ireland, & Y. Ophir (Eds.), Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024) (pp. 177–190). St. Julians, Malta: Association for Computational Linguistics.
  17. Cohen, K. A., Shroff, A., Nook, E. C., & Schleider, J. L. (2022). Linguistic distancing predicts response to a digital single-session intervention for adolescent depression. Behaviour Research and Therapy, 159, 104220. 10.1016/j.brat.2022.104220
  18. Cunningham, H., Ewart, A., Riggs, L., Huben, R., & Sharkey, L. (2023). Sparse autoencoders find highly interpretable features in language models. arXiv preprint arXiv:2309.08600.
  19. Demszky, D., Yang, D., Yeager, D. S., Bryan, C. J., Clapper, M., Chandhok, S., Eichstaedt, J. C., Hecht, C., Jamieson, J., Johnson, M., et al. (2023). Using large language models in psychology. Nature Reviews Psychology, 2(11), 688–701. 10.1038/s44159-023-00241-5
  20. Dercon, Q., Mehrhof, S. Z., Sandhu, T. R., Hitchcock, C., Lawson, R. P., Pizzagalli, D. A., Dalgleish, T., & Nord, C. L. (2024). A core component of psychological therapy causes adaptive changes in computational learning mechanisms. Psychological Medicine, 54(2), 327–337. 10.1017/S0033291723001587
  21. Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Yang, A., Fan, A., et al. (2024). The llama 3 herd of models. arXiv preprint arXiv:2407.21783.
  22. Edwards, L. J., Muller, K. E., Wolfinger, R. D., Qaqish, B. F., & Schabenberger, O. (2008). An r2 statistic for fixed effects in the linear mixed model. Statistics in medicine, 27(29), 6137–6157. 10.1002/sim.3429
  23. Eichstaedt, J. C., Kern, M. L., Yaden, D. B., Schwartz, H. A., Giorgi, S., Park, G., Hagan, C. A., Tobolsky, V. A., Smith, L. K., Buffone, A., et al. (2021). Closed-and open-vocabulary approaches to text analysis: A review, quantitative comparison, and recommendations. Psychological Methods, 26(4), 398. 10.1037/met0000349
  24. Elhage, N., Nanda, N., Olsson, C., Henighan, T., Joseph, N., Mann, B., Askell, A., Bai, Y., Chen, A., Conerly, T., et al. (2021). A mathematical framework for transformer circuits. Transformer Circuits Thread, 1(1), 12.
  25. Ferrando, J., Sarti, G., Bisazza, A., & Costa-jussà, M. R. (2024). A primer on the inner workings of transformer-based language models. arXiv preprint arXiv:2405.00208.
  26. Freud, S. (1966). Psychopathology of everyday life. WW Norton & Company.
  27. Gillan, C. M., Kosinski, M., Whelan, R., Phelps, E. A., & Daw, N. D. (2016). Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. elife, 5, e11305. 10.7554/eLife.11305
  28. Gottschalk, L. A., & Gleser, G. C. (2022). The measurement of psychological states through the content analysis of verbal behavior. Univ of California Press. 10.2307/jj.8362616
  29. Gurnee, W., Nanda, N., Pauly, M., Harvey, K., Troitskii, D., & Bertsimas, D. (2023). Finding neurons in a haystack: Case studies with sparse probing. arXiv preprint arXiv:2305.01610.
  30. Holmes, D., Alpers, G. W., Ismailji, T., Classen, C., Wales, T., Cheasty, V., Miller, A., & Koopman, C. (2007). Cognitive and emotional processing in narratives of women abused by intimate partners. Violence against women, 13(11), 1192–1205. 10.1177/1077801207307801
  31. Jackson, J. C., Watts, J., List, J.-M., Puryear, C., Drabble, R., & Lindquist, K. A. (2022). From text to thought: How analyzing language can advance psychological science. Perspectives on Psychological Science, 17(3), 805–826. 10.1177/17456916211004899
  32. Jaeger, B. C., Edwards, L. J., Das, K., & Sen, P. K. (2017). An r 2 statistic for fixed effects in the generalized linear mixed model. Journal of applied statistics, 44(6), 1086–1105. 10.1080/02664763.2016.1193725
  33. Jeon, H., Yoo, D., Lee, D., Son, S., Kim, S., & Han, J. (2024). A dual-prompting for interpretable mental health language models. In A. Yates, B. Desmet, E. Prud’hommeaux, A. Zirikly, S. Bedrick, S. MacAvaney, K. Bar, M. Ireland, & Y. Ophir (Eds.), Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024) (pp. 247–255). St. Julians, Malta: Association for Computational Linguistics.
  34. Kacewicz, E., Pennebaker, J. W., Davis, M., Jeon, M., & Graesser, A. C. (2014). Pronoun use reflects standings in social hierarchies. Journal of Language and Social Psychology, 33(2), 125–143. 10.1177/0261927X13502654
  35. Kahn, J. H., Tobin, R. M., Massey, A. E., & Anderson, J. A. (2007). Measuring emotional expression with the linguistic inquiry and word count. The American journal of psychology, 120(2), 263–286. 10.2307/20445398
  36. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. arXiv preprint arXiv:2001.08361.
  37. Kroenke, K., Strine, T. W., Spitzer, R. L., Williams, J. B., Berry, J. T., & Mokdad, A. H. (2009). The phq-8 as a measure of current depression in the general population. Journal of affective disorders, 114(1–3), 163–173. 10.1016/j.jad.2008.06.026
  38. Kross, E., Bruehlman-Senecal, E., Park, J., Burson, A., Dougherty, A., Shablack, H., Bremner, R., Moser, J., & Ayduk, O. (2014). Self-talk as a regulatory mechanism: how you do it matters. Journal of personality and social psychology, 106(2), 304. 10.1037/a0035173
  39. Kross, E., Vickers, B. D., Orvell, A., Gainsburg, I., Moran, T. P., Boyer, M., Jonides, J., Moser, J., & Ayduk, O. (2017). Third-person self-talk reduces ebola worry and risk perception by enhancing rational thinking. Applied Psychology: Health and Well-Being, 9(3), 387–409. 10.1111/aphw.12103
  40. Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. 10.18637/jss.v082.i13
  41. Laffal, J. (1964). Freud’s theory of language. The Psychoanalytic Quarterly, 33(2), 157–175. 10.1080/21674086.1964.11926307
  42. Leshed, G., Hancock, J. T., Cosley, D., McLeod, P. L., & Gay, G. (2007). Feedback for guiding reflection on teamwork practices. In Proceedings of the 2007 ACM International Conference on Supporting Group Work (pp. 217–220). 10.1145/1316624.1316655
  43. Liu, Q., Wang, W., & Willard, J. (2025). Effects of prompt length on domain-specific tasks for large language models. arXiv preprint arXiv:2502.14255.
  44. Long, D. X., Dinh, D., Nguyen, N.-H., Kawaguchi, K., Chen, N. F., Joty, S., & Kan, M.-Y. (2025). What makes a good natural language prompt? arXiv preprint arXiv:2506.06950. 10.18653/v1/2025.acl-long.292
  45. Low, D. M., Rumker, L., Talkar, T., Torous, J., Cecchi, G., & Ghosh, S. S. (2020). Natural language processing reveals vulnerable mental health support groups and heightened health anxiety on reddit during covid-19: Observational study. Journal of medical Internet research, 22(10), e22635. 10.2196/22635
  46. Malgaroli, M., Hull, T. D., Zech, J. M., & Althoff, T. (2023). Natural language processing for mental health interventions: a systematic review and research framework. Translational Psychiatry, 13(1), 309. 10.1038/s41398-023-02592-2
  47. Mangalik, S., Eichstaedt, J. C., Giorgi, S., Mun, J., Ahmed, F., Gill, G., Ganesan, A. V., Subrahmanya, S., Soni, N., Clouston, S. A., et al. (2024). Robust language-based mental health assessments in time and space through social media. NPJ Digital Medicine, 7(1), 109. 10.1038/s41746-024-01100-0
  48. Marjieh, R., Sucholutsky, I., van Rijn, P., Jacoby, N., & Griffiths, T. L. (2024). Large language models predict human sensory judgments across six modalities. Scientific Reports, 14(1), 21445. 10.1038/s41598-024-72071-1
  49. Moran, T., & Eyal, T. (2022). Emotion regulation by psychological distance and level of abstraction: Two meta-analyses. Personality and Social Psychology Review, 26(2), 112–159. 10.1177/10888683211069025
  50. Newman, M. L., Groom, C. J., Handelman, L. D., & Pennebaker, J. W. (2008). Gender differences in language use: An analysis of 14,000 text samples. Discourse processes, 45(3), 211–236. 10.1080/01638530802073712
  51. Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words: Predicting deception from linguistic styles. Personality and social psychology bulletin, 29(5), 665–675. 10.1177/0146167203029005010
  52. Newman, M. W. (2022). Value added? A pragmatic analysis of the routine use of phq-9 and gad-7 scales in primary care. General Hospital Psychiatry, 79, 15–18. 10.1016/j.genhosppsych.2022.09.005
  53. Nils, F., & Rimé, B. (2012). Beyond the myth of venting: Social sharing modes determine the benefits of emotional disclosure. European Journal of Social Psychology, 42(6), 672–681. 10.1002/ejsp.1880
  54. Nook, E. C., Ahn, H. E., Schleider, J. L., & Somerville, L. H. (2024). Emotion regulation is associated with increases in linguistic measures of both psychological distancing and abstractness. Affective Science, 1–14. 10.31219/osf.io/a2zv3
  55. Nook, E. C., Hull, T. D., Nock, M. K., & Somerville, L. H. (2022). Linguistic measures of psychological distance track symptom levels and treatment outcomes in a large set of psychotherapy transcripts. Proceedings of the National Academy of Sciences, 119(13), e2114737119. 10.1073/pnas.2114737119
  56. Nook, E. C., Nardini, C., Zacharek, S. J., Hommel, G., Spencer, H., Martino, A., Morra, A., Flores, S., Anderson, T., Marin, C. E., et al. (2023). Affective language spreads between anxious children and their mothers during a challenging puzzle task. Emotion, 23(6), 1513. 10.1037/emo0001203
  57. Nook, E. C., Schleider, J. L., & Somerville, L. H. (2017). A linguistic signature of psychological distancing in emotion regulation. Journal of Experimental Psychology: General, 146(3), 337. 10.1037/xge0000263
  58. Nook, E. C., Vidal Bustamante, C. M., Cho, H. Y., & Somerville, L. H. (2020). Use of linguistic distancing and cognitive reappraisal strategies during emotion regulation in children, adolescents, and young adults. Emotion, 20(4), 525. 10.1037/emo0000570
  59. Orvell, A., Vickers, B. D., Drake, B., Verduyn, P., Ayduk, O., Moser, J., Jonides, J., & Kross, E. (2021). Does distanced self-talk facilitate emotion regulation across a range of emotionally intense experiences? Clinical Psychological Science, 9(1), 68–78. 10.1177/2167702620951539
  60. Pennebaker, J. W. (2001). Linguistic inquiry and word count: Liwc 2001.
  61. Pennebaker, J. W., & Stone, L. D. (2003). Words of wisdom: language use over the life span. Journal of personality and social psychology, 85(2), 291. 10.1037/0022-3514.85.2.291
  62. Rathje, S., Mirea, D.-M., Sucholutsky, I., Marjieh, R., Robertson, C. E., & Van Bavel, J. J. (2024). Gpt is an effective tool for multilingual psychological text analysis. Proceedings of the National Academy of Sciences, 121(34), e2308950121. 10.1073/pnas.2308950121
  63. Razykov, I., Ziegelstein, R. C., Whooley, M. A., & Thombs, B. D. (2012). The phq-9 versus the phq-8—is item 9 useful for assessing suicide risk in coronary artery disease patients? Data from the heart and soul study. Journal of psychosomatic research, 73(3), 163–168. 10.1016/j.jpsychores.2012.06.001
  64. Rimé, B. (2009). Emotion elicits the social sharing of emotion: Theory and empirical review. Emotion review, 1(1), 60–85. 10.1177/1754073908097189
  65. Sahi, R. S., He, Z., Silvers, J. A., & Eisenberger, N. I. (2023). One size does not fit all: Decomposing the implementation and differential benefits of social emotion regulation strategies. Emotion, 23(6), 1522. 10.1037/emo0001194
  66. Sahi, R. S., Ninova, E., & Silvers, J. A. (2021). With a little help from my friends: Selective social potentiation of emotion regulation. Journal of Experimental Psychology: General, 150(6), 1237. 10.1037/xge0000853
  67. Sharma, A., Rushton, K., Lin, I. W., Nguyen, T., & Althoff, T. (2024). Facilitating self-guided mental health interventions through human-language model interaction: A case study of cognitive restructuring. In Proceedings of the CHI Conference on Human Factors in Computing Systems (pp. 1–29). 10.1145/3613904.3642761
  68. Shin, C., Lee, S.-H., Han, K.-M., Yoon, H.-K., & Han, C. (2019). Comparison of the usefulness of the phq-8 and phq-9 for screening for major depressive disorder: analysis of psychiatric outpatient data. Psychiatry investigation, 16(4), 300. 10.30773/pi.2019.02.01
  69. So, J.-h., Chang, J., Kim, E., Na, J., Choi, J., Sohn, J.-y., Kim, B.-H., Chu, S. H., et al. (2024). Aligning large language models for enhancing psychiatric interviews through symptom delineation and summarization: Pilot study. JMIR Formative Research, 8(1), e58418. 10.2196/58418
  70. Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: Liwc and computerized text analysis methods. Journal of language and social psychology, 29(1), 24–54. 10.1177/0261927X09351676
  71. Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al. (2023). Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971.
  72. Trope, Y., & Liberman, N. (2010). Construal-level theory of psychological distance. Psychological review, 117(2), 440. 10.1037/a0018963
  73. Uluslu, A. Y., Michail, A., & Clematide, S. (2024). Utilizing large language models to identify evidence of suicidality risk through analysis of emotionally charged posts. In Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024) (pp. 264–269).
  74. Wang, Y., Inkpen, D., & Gamaarachchige, P. K. (2024). Explainable depression detection using large language models on social media data. In Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024) (pp. 108–126).
  75. Wise, T., Robinson, O. J., & Gillan, C. M. (2023). Identifying transdiagnostic mechanisms in mental health using computational factor modeling. Biological Psychiatry, 93(8), 690–703. 10.1016/j.biopsych.2022.09.034
  76. Yang, K., Zhang, T., Kuang, Z., Xie, Q., Huang, J., & Ananiadou, S. (2024). Mentallama: interpretable mental health analysis on social media with large language models. In Proceedings of the ACM Web Conference 2024 (pp. 4489–4500). 10.1145/3589334.3648137
  77. Zaki, J., & Williams, W. C. (2013). Interpersonal emotion regulation. Emotion, 13(5), 803. 10.1037/a0033839
  78. Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., Dong, Z., et al. (2023). A survey of large language models. arXiv preprint arXiv:2303.18223.
  79. Zuromski, K. L., Low, D. M., Jones, N. C., Kuzma, R., Kessler, D., Zhou, L., Kastman, E. K., Epstein, J., Madden, C., Ghosh, S. S., et al. (2024). Detecting suicide risk among us servicemembers and veterans: a deep learning approach using social media data. Psychological Medicine, 1–10. 10.1017/S0033291724001557
DOI: https://doi.org/10.5334/cpsy.141 | Journal eISSN: 2379-6227
Language: English
Submitted on: Mar 3, 2025
Accepted on: Jul 21, 2025
Published on: Sep 9, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Mostafa Abdou, Razia S. Sahi, Thomas D. Hull, Erik C. Nook, Nathaniel D. Daw, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.