Abdi, H. (2003). Partial least square regression (PLS regression). Encyclopedia for Research Methods for the Social Sciences, 6(4), 792–795.
Adair, J. G. (1984). The Hawthorne effect: A reconsideration of the methodological artifact. Journal of Applied Psychology, 69(2), 334.
Allaire, J. J., Teague, C., Xie, Y., & Dervieux, C. (2022). Quarto. https://doi.org/10.5281/ZENODO.5960048
Atari, M., Omrani, A., & Dehghani, M. (2023). Contextualized construct representation: Leveraging psychometric scales to advance theory-driven text analysis. PsyArXiv. https://doi.org/10.31234/osf.io/m93pd
Beck, A. T., Steer, R. A., & Brown, G. (1996). Beck depression inventory–II. Psychological Assessment.
Bilgrami, Z. R., Sarac, C., Srivastava, A., Herrera, S. N., Azis, M., Haas, S. S., Shaik, R. B., Parvaz, M. A., Mittal, V. A., Cecchi, G., et al. (2022). Construct validity for computational linguistic metrics in individuals at clinical risk for psychosis: Associations with clinical ratings. Schizophrenia Research, 245, 90–96.
Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. Advances in Neural Information Processing Systems, 29.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061–1071. https://doi.org/10.1037/0033-295x.111.4.1061
Chandler, C., Foltz, P. W., & Elvevåg, B. (2020). Using machine learning in psychiatry: The need to establish a framework that nurtures trustworthiness. Schizophrenia Bulletin, 46(1), 11–14.
Chen, Y., Li, S., Li, Y., & Atari, M. (2024). Surveying the dead minds: Historical-psychological text analysis with contextualized construct representation (CCR) for Classical Chinese. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2597–2615.
Cohen, A. S. (2019). Advancing ambulatory biobehavioral technologies beyond “proof of concept”: Introduction to the special section. Psychological Assessment, 31(3), 277.
Cohen, A. S., Rodriguez, Z., Warren, K. K., Cowan, T., Masucci, M. D., Edvard Granrud, O., Holmlund, T. B., Chandler, C., Foltz, P. W., & Strauss, G. P. (2022). Natural language processing and psychosis: On the need for comprehensive psychometric evaluation. Schizophrenia Bulletin, 48(5), 939–948.
Crestani, F., Losada, D. E., & Parapar, J. (2022). Early detection of mental health disorders by social media monitoring. Studies in Computational Intelligence, 1018(4).
Cronbach, L. J. (1949). Essentials of psychological testing.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. https://doi.org/10.1037/h0040957
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186.
Dohnány, S., Kurth-Nelson, Z., Spens, E., Luettgau, L., Reid, A., Gabriel, I., Summerfield, C., Shanahan, M., & Nour, M. M. (2026). Technological folie à deux: Feedback loops between AI chatbots and mental health. Nature Mental Health, 1–10.
Eberhardt, S. T., Vehlen, A., Schaffrath, J., Schwartz, B., Baur, T., Schiller, D., Hallmen, T., André, E., & Lutz, W. (2025). Development and validation of large language model rating scales for automatically transcribed psychological therapy sessions. Scientific Reports, 15(1), 29541.
First, M. B. (2014). Structured clinical interview for the DSM (SCID). The Encyclopedia of Clinical Psychology, 1–6.
Firth, J. (1957). A synopsis of linguistic theory, 1930-1955. Studies in Linguistic Analysis, 10–32.
Flake, J. K., Pek, J., & Hehman, E. (2017). Construct validation in social and personality research: Current practice and recommendations. Social Psychological and Personality Science, 8(4), 370–378.
Furr, R. M. (2021). Psychometrics: An introduction. SAGE publications.
Giorgi, S., Lynn, V. E., Gupta, K., Ahmed, F., Matz, S., Ungar, L. H., & Schwartz, H. A. (2022). Correcting sociodemographic selection biases for population prediction from social media. Proceedings of the International AAAI Conference on Web and Social Media, 16, 228–240.
Grand, G., Blank, I. A., Pereira, F., & Fedorenko, E. (2022). Semantic projection recovers rich human knowledge of multiple object features from word embeddings. Nature Human Behaviour, 6(7), 975–987.
Grimm, K. J., & Widaman, K. F. (2012). Construct validity.
Gu, Z., Kjell, K., Schwartz, H. A., & Kjell, O. (2025). Natural language response formats for assessing depression and worry with large language models: A sequential evaluation with model pre-registration. Assessment, 10731911251364022.
Harris, Z. S. (1954). Distributional structure. Word, 10(2-3), 146–162.
Hupkes, D., Giulianelli, M., Dankers, V., Artetxe, M., Elazar, Y., Pimentel, T., Christodoulopoulos, C., Lasri, K., Saphra, N., Sinclair, A., et al. (2023). A taxonomy and review of generalization research in NLP. Nature Machine Intelligence, 5(10), 1161–1174.
Hüppi, R. M., Bautista, L., Cecere, G., Just, S. A., Koops, S., Hussain, M., Tedeschi, E., Benke-Bruderer, S., Bora, E., Lyne, J., et al. (2025). TRUSTING: An international multicenter observational study of speech-based relapse prediction in psychosis using explainable AI. medRxiv, 2025–2011.
Kjell, O. N. E., Kjell, K., Garcia, D., & Sikström, S. (2019). Semantic measures: Using natural language processing to measure, differentiate, and describe psychological constructs. Psychological Methods, 24(1), 92–115. https://doi.org/10.1037/met0000191
Kjell, O. N., Kjell, K., & Schwartz, H. A. (2024). Beyond rating scales: With targeted evaluation, large language models are poised for psychological assessment. Psychiatry Research, 333, 115667.
Kjell, O. N., Sikström, S., Kjell, K., & Schwartz, H. A. (2022). Natural language analyzed with AI-based transformers predict traditional subjective well-being measures approaching the theoretical upper limits in accuracy. Scientific Reports, 12(1), 3918.
Kjell, O., Daukantaitė, D., & Sikström, S. (2021). Computational language assessments of harmony in life—not satisfaction with life or rating scales—correlate with cooperative behaviors. Frontiers in Psychology, 12, 601679.
Kjell, O., Ganesan, A. V., Boyd, R. L., Oltmanns, J., Rivero, A., Feltman, S., Carr, M. A., Alves, J., Luft, B., Kotov, R., et al. (2026). Replicability and validity of a new artificial-intelligence assessment of posttraumatic stress disorder from patient language: A sequential evaluation with model preregistration. Clinical Psychological Science, 21677026261439026.
Kjell, O., Giorgi, S., & Schwartz, H. A. (2023). The text-package: An R-package for analyzing and visualizing human language using natural language processing and transformers. Psychological Methods, 28(6), 1478.
Kroenke, K., Spitzer, R. L., & Williams, J. B. (2001). The PHQ-9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16(9), 606–613.
Kühner, C., Bürger, C., Keller, F., & Hautzinger, M. (2007). Reliability and validity of the revised Beck Depression Inventory (BDI-II). Results from German samples. Der Nervenarzt, 78(6), 651–656.
Lake, B. M., & Murphy, G. L. (2023). Word meaning in minds and machines. Psychological Review, 130(2), 401.
Lee, J.-J., Han, J., & Woo, C.-W. (2026). Interpretable depression assessment using a large language model. PLOS Digital Health, 5(2), e0001205.
Lee, S., Shakir, A., Koenig, D., & Lipp, J. (2024). Open source gets DE-licious: Mixedbread x deepset German/English embeddings. https://www.mixedbread.ai/blog/deepset-mxbai-embed-de-large-v1
Lovibond, P. F., & Lovibond, S. H. (1995). The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behaviour Research and Therapy, 33(3), 335–343.
Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 746–751.
Nilges, P., & Essau, C. (2015). Die depressions-angst-stress-skalen. Der Schmerz, 29(6), 649–657.
Nilsson, A. H., Eijsbroek, V. C., Gu, Z., Kjell, K., Giorgi, S., Kotov, R., Ganesan, A. V., Schwartz, H. A., & Kjell, O. N. (2026). The language-based assessment model library: Open model sharing for independent validation and broader applications. Advances in Methods and Practices in Psychological Science, 9(2), 25152459261419036.
Nolen-Hoeksema, S. (2001). Gender differences in depression. Current Directions in Psychological Science, 10(5), 173–176.
Palan, S., & Schitter, C. (2018). Prolific.ac—A subject pool for online experiments. Journal of Behavioral and Experimental Finance, 17, 22–27.
Pan, S. J., & Yang, Q. (2009). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359.
Parapar, J., Martín-Rodilla, P., Losada, D. E., & Crestani, F. (2021). eRisk 2021: Pathological gambling, self-harm and depression challenges. European Conference on Information Retrieval, 650–656.
Parapar, J., Perez, A., Wang, X., & Crestani, F. (2025). eRisk 2025: Contextual and conversational approaches for depression challenges. European Conference on Information Retrieval, 416–424.
Piantadosi, S. T., Muller, D. C., Rule, J. S., Kaushik, K., Gorenstein, M., Leib, E. R., & Sanford, E. (2024). Why concepts are (probably) vectors. Trends in Cognitive Sciences, 28(9), 844–856.
Plank, L., & Zlomuzica, A. (2024a). Natural language processing reveals differences in mental time travel at higher levels of self-efficacy. Scientific Reports, 14(1), 25342.
Plank, L., & Zlomuzica, A. (2024b). Reduced speech coherence in psychosis-related social media forum posts. Schizophrenia, 10(1), 60.
Plank, L., & Zlomuzica, A. (2025). Detecting psychosis via natural language processing of social media posts: Potentials and pitfalls. Neuropsychologia, 109325.
Sap, M., Park, G., Eichstaedt, J., Kern, M., Stillwell, D., Kosinski, M., Ungar, L., & Schwartz, H. A. (2014). Developing age and gender predictive lexica over social media. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1146–1151.
Sharkey, L., Chughtai, B., Batson, J., Lindsey, J., Wu, J., Bushnaq, L., Goldowsky-Dill, N., Heimersheim, S., Ortega, A., Bloom, J., et al. (2025). Open problems in mechanistic interpretability. arXiv Preprint arXiv:2501.16496.
Shiffman, S., Stone, A. A., & Hufford, M. R. (2008). Ecological momentary assessment. Annual Review of Clinical Psychology, 4(1), 1–32.
Spitzer, R. L., Kroenke, K., Williams, J. B., & Löwe, B. (2006). A brief measure for assessing generalized anxiety disorder: The GAD-7. Archives of Internal Medicine, 166(10), 1092–1097.
Stade, E. C., Ungar, L., Eichstaedt, J. C., Sherman, G., & Ruscio, A. M. (2023). Depression and anxiety have distinct and overlapping language patterns: Results from a clinical interview. Journal of Psychopathology and Clinical Science, 132(8), 972.
Steen, E., Yurechko, K., & Klug, D. (2023). You can (not) say what you want: Using algospeak to contest and evade algorithmic content moderation on TikTok. Social Media + Society, 9(3), 20563051231194586.
Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87(2), 245.
Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54.
R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Teitelbaum, L., & Simchon, A. (2025). Neural text embeddings in psychological research: A guide with examples in R. Psychological Methods.
Van Rossum, G., & Drake Jr, F. L. (1995). Python tutorial (Vol. 620). Centrum voor Wiskunde en Informatica Amsterdam, The Netherlands.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Wahl, I., Löwe, B., Bjorner, J. B., Fischer, F., Langs, G., Voderholzer, U., Aita, S. A., Bergemann, N., Brähler, E., & Rose, M. (2014). Standardization of depression measurement: A common metric was developed for 11 self-report depression measures. Journal of Clinical Epidemiology, 67(1), 73–86.
Wang, L., Yang, N., Huang, X., Yang, L., Majumder, R., & Wei, F. (2024). Multilingual e5 text embeddings: A technical report. arXiv Preprint arXiv:2402.05672.
Wright, A. G., Ringwald, W. R., Vize, C. E., Eichstaedt, J. C., Angstadt, M., Taxali, A., & Sripada, C. (2026). Assessing personality using zero-shot generative AI scoring of brief open-ended text. Nature Human Behaviour, 1–15.