LLMs and Coding in Qualitative Research: Advancements and Opportunities for Social Verbatim as an Integral Qualitative Tool
DEBATE: Beyond Big Data: Generative AI and LLMs as New Digital Technologies for the Analysis of Social Reality
DOI: https://doi.org/10.54790/rccs.176
Keywords: Large Language Models (LLMs), qualitative coding, Generative Artificial Intelligence (GAI), qualitative research, open science, AI-assisted qualitative analysis
Abstract
This article explores the use of Large Language Models (LLMs) in qualitative coding, highlighting advances and opportunities for the Social Verbatim tool. It reviews the fundamentals of LLMs, their architecture, and the impact of hardware on their development. Additionally, specific applications of LLMs in qualitative research are analyzed, including thematic coding and comparative analysis. Methodological, ethical, and epistemological challenges are addressed, and strategies to mitigate these issues are proposed. Finally, the implications of integrating LLMs into tools like Social Verbatim are discussed, emphasizing the importance of transparency and human-machine collaboration in qualitative research.
License
Copyright (c) 2026 Juan Miguel Gómez Espino

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.







