Journal Article

Leveraging Generative Text Models and Natural Language Processing to Perform Traditional Thematic Data Analysis

Isil Anakok, Andrew Katz, Kai Jun Chew, Holly Matusovich

International Journal of Qualitative Methods, 2025 (Featured)

Contribution Summary

The paper provides a practical roadmap for researchers who want to integrate NLP and generative AI into thematic analysis while retaining qualitative rigor.

Draft enrichment generated from extracted publication text; pending human review.

Plain-Language Summary

This paper translates the phases of traditional thematic analysis into a generative AI-assisted workflow. It uses a case study of engineering faculty responses to generative AI and assessment to show how NLP and generative text models can support, but not replace, qualitative researchers.

Research Question

How can common steps in thematic analysis be performed using generative AI and NLP, and what advantages and limitations emerge in a case study?

Methods

  • Mapped Braun and Clarke's phases of thematic analysis into a Generative AI-Assisted Thematic Analysis workflow.
  • Used summaries, embeddings, dimensionality reduction, clustering, initial code generation, theme generation, and cosine-similarity checks on faculty response data.
  • Maintained a human-in-the-loop process in which researchers reviewed prompts, model outputs, codes, themes, and limitations.
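
The cosine-similarity check named in the list above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 threshold, and the three-dimensional toy vectors are assumptions made for illustration, and real embeddings would come from an embedding model and have many more dimensions. The idea is to flag responses whose embedding sits far from the embedding of the theme they were assigned to, so that a researcher can review them.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def flag_weak_assignments(theme_vec, response_vecs, threshold=0.5):
    """Return indices of responses whose similarity to the theme is below threshold.

    The threshold is a hypothetical value; a suitable cutoff depends on the
    embedding model and the dataset.
    """
    return [
        i
        for i, vec in enumerate(response_vecs)
        if cosine_similarity(theme_vec, vec) < threshold
    ]


# Toy vectors standing in for real embeddings (hypothetical data).
theme = [1.0, 0.0, 0.0]
responses = [
    [0.9, 0.1, 0.0],  # points almost the same way as the theme
    [0.0, 1.0, 0.0],  # orthogonal to the theme
    [0.5, 0.5, 0.0],  # 45 degrees from the theme
]

weak = flag_weak_assignments(theme, responses, threshold=0.5)
print(weak)  # indices a researcher would review by hand
```

Flagged indices are candidates for human review rather than automatic reassignment, which keeps the check consistent with the human-in-the-loop process described above.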

Key Findings

  • The workflow can streamline familiarization, coding, theme development, and theme review for large qualitative datasets.
  • The case study shows practical gains in time, labor, and systematic coverage of the data, while researcher interpretation remains necessary.
  • The paper identifies risks around model replicability, bias, hardware access, prompt sensitivity, and loss of qualitative nuance.

Implications

AI-assisted thematic analysis should be designed as a documented research workflow with explicit points for human judgment.

Prompt design, model settings, and versioning should be treated as methodological decisions, not implementation details.
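
One way to treat these decisions as part of the method is to record them alongside every model call. The sketch below is an assumption-laden illustration, not something taken from the paper: the `PromptRunRecord` class, its fields, and the placeholder model name and version are all hypothetical. It stores the settings for a run and derives a short fingerprint, so that two outputs can be compared knowing whether they came from identical settings.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PromptRunRecord:
    """Settings for one model call (hypothetical fields, for illustration)."""

    model_name: str
    model_version: str
    temperature: float
    seed: int
    prompt_template: str

    def fingerprint(self) -> str:
        """Short, stable hash of all settings, suitable for a methods log."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Placeholder values; a real record would name the actual model and version.
record = PromptRunRecord(
    model_name="example-model",
    model_version="v0-hypothetical",
    temperature=0.0,
    seed=42,
    prompt_template="Summarize the response in one sentence: {response}",
)

# An identical record yields the same fingerprint; any change yields a new one.
same = PromptRunRecord(**asdict(record))
changed = PromptRunRecord(**{**asdict(record), "temperature": 0.7})

print(record.fingerprint() == same.fingerprint())
print(record.fingerprint() == changed.fingerprint())
```

Matching fingerprints show only that the recorded settings were the same. They do not guarantee identical model output, since hosted models can change behind a fixed name.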

Compute costs and model access can create inequities in who can use advanced AI-assisted qualitative methods.

Research Artifacts

  • Protocol: GATA workflow. A phase-by-phase workflow for using NLP and generative AI to support traditional thematic analysis.
  • Software: GATA GitHub repository. Repository associated with the generative AI-assisted thematic analysis workflow.

Abstract

This paper explores how generative text models and natural language processing can be leveraged to perform traditional thematic data analysis in educational research contexts.

Related Projects