LLMs and Coding in Qualitative Research: Advancements and Opportunities for Social Verbatim as an Integral Qualitative Tool

DEBATE: Beyond Big Data: Generative AI and LLMs as New Digital Technologies for the Analysis of Social Reality

Authors

Juan Miguel Gómez Espino, Universidad Pablo de Olavide, https://orcid.org/0000-0002-1646-186X

DOI:

https://doi.org/10.54790/rccs.176

Keywords:

Large Language Models (LLMs), qualitative coding, Generative Artificial Intelligence (GAI), qualitative research, open science, AI-assisted qualitative analysis

Abstract

This article explores the use of Large Language Models (LLMs) in qualitative coding, highlighting advances and opportunities for the Social Verbatim tool. It reviews the fundamentals of LLMs, their architecture, and the impact of hardware on their development. It then analyzes specific applications of LLMs in qualitative research, including thematic coding and comparative analysis. It addresses methodological, ethical, and epistemological challenges and proposes strategies to mitigate them. Finally, it discusses the implications of integrating LLMs into tools like Social Verbatim, emphasizing the importance of transparency and human-machine collaboration in qualitative research.

Author Biography

Juan Miguel Gómez Espino, Universidad Pablo de Olavide

Associate Professor (profesor titular) of Sociology at Universidad Pablo de Olavide (Seville). Since 2001 he has combined teaching and research in the sociology of childhood and education with university management duties (currently as Dean of the Faculty of Social Sciences). He holds a degree in Political Science and Sociology from the Universidad de Granada and has held a doctorate since 2009. He has published in high-impact journals such as Revista Española de Investigaciones Sociológicas, Empiria, Childhood, International Sociology, and Language and Education, as well as book chapters with publishers such as Springer. He has also been a visiting professor at the universities of Sheffield (United Kingdom) and Granma (Cuba).
