Contribution Summary
The paper offers a practical guide to using LLM and NLP workflows for scalable education research while documenting where human verification remains necessary.
Draft enrichment generated from extracted publication text; pending human review.
Plain-Language Summary
This paper demonstrates how embeddings, clustering, summarization, and prompting can be used to analyze large collections of student writing in engineering education research. It uses more than 1,000 student career-interest essays to show both inductive codebook generation and deductive labeling workflows.
Research Question
How can NLP and large language model workflows support scalable thematic analysis of unstructured text data in engineering education research?
Methods
- Analyzed 1,014 undergraduate engineering student career-interest essays.
- Parsed essays into sentences, embedded each sentence, clustered semantically similar statements, and generated summaries and labels for the clusters.
- Tested deductive labeling workflows using O*NET SOC job titles and career satisfaction factors, including model self-evaluation of label accuracy.
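The inductive steps listed above (sentence parsing, embedding, clustering, labeling) can be sketched in miniature as follows. This is an illustrative toy under stated assumptions, not the paper's implementation: word-count vectors stand in for a learned embedding model, a greedy threshold rule stands in for whatever clustering algorithm the authors used, and frequent-word lists stand in for LLM-generated summaries. All function names, the threshold value, and the sample essays are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(essay: str) -> list[str]:
    # Naive splitter: break after ., !, or ? when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", essay.strip())
    return [p for p in parts if p]


def embed(sentence: str) -> Counter:
    # ASSUMPTION: word counts stand in for a real sentence-embedding model.
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(sentences: list[str], threshold: float = 0.4) -> list[list[str]]:
    # Greedy single pass: a sentence joins the first cluster whose seed
    # sentence is similar enough; otherwise it starts a new cluster.
    clusters: list[tuple[Counter, list[str]]] = []
    for s in sentences:
        vec = embed(s)
        for seed_vec, members in clusters:
            if cosine(vec, seed_vec) >= threshold:
                members.append(s)
                break
        else:
            clusters.append((vec, [s]))
    return [members for _, members in clusters]


def label_cluster(members: list[str], top_n: int = 3) -> list[str]:
    # ASSUMPTION: frequent longer words stand in for an LLM-written label.
    words = (w for m in members for w in re.findall(r"[a-z']+", m.lower()))
    counts = Counter(w for w in words if len(w) > 3)
    return [w for w, _ in counts.most_common(top_n)]


# Made-up sample essays, for illustration only.
essays = [
    "I want to design bridges. I like working outdoors.",
    "I hope to design bridges for cities. Working outdoors sounds great!",
]
sentences = [s for essay in essays for s in split_sentences(essay)]
clusters = cluster(sentences)
for members in clusters:
    print(label_cluster(members), members)
```

In a real workflow each stand-in would be replaced by a stronger component, and the resulting clusters and labels would still be treated as preliminary structure for a human analyst to review.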
Key Findings
- The inductive workflow produced preliminary thematic structure from 9,105 essay sentences and 285 semantic clusters.
- In judgments of label accuracy, the deductive workflow reached 93% agreement with human raters for O*NET SOC job titles and 86% for career satisfaction factors.
- A small but important share of labels remained questionable or in disagreement with human judgment, reinforcing the need for human-in-the-loop validation.
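Agreement figures like those above are commonly computed as simple percent agreement between two sets of judgments on the same items. The sketch below shows one way to do that and to list the items where the two sources differ. The judgment categories, function names, and data are made up for illustration and are not taken from the paper.

```python
def percent_agreement(model: list[str], human: list[str]) -> float:
    # Share of items, in percent, where both sources gave the same judgment.
    if len(model) != len(human):
        raise ValueError("judgment lists must have the same length")
    if not model:
        raise ValueError("judgment lists must not be empty")
    matches = sum(m == h for m, h in zip(model, human))
    return 100.0 * matches / len(model)


def disagreements(model: list[str], human: list[str]) -> list[int]:
    # Indices of items where the two sources differ, for follow-up review.
    return [i for i, (m, h) in enumerate(zip(model, human)) if m != h]


# Hypothetical judgments for ten labeled items (not data from the paper).
model_judgments = [
    "accurate", "accurate", "accurate", "questionable", "accurate",
    "accurate", "inaccurate", "accurate", "accurate", "accurate",
]
human_judgments = [
    "accurate", "accurate", "accurate", "accurate", "accurate",
    "accurate", "inaccurate", "accurate", "inaccurate", "accurate",
]

agreement = percent_agreement(model_judgments, human_judgments)
to_review = disagreements(model_judgments, human_judgments)
print(agreement, to_review)
```

Percent agreement is easy to read but does not correct for agreement expected by chance, so the list of disagreeing items is often the more useful output for a human reviewer.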
Implications
LLM-assisted methods can help engineering education researchers analyze student writing at a scale that is difficult to manage manually.
Model self-evaluation can help prioritize human review, but it should not be treated as independent validation.
Researchers need to account for model bias, controllability, environmental costs, data type limitations, and domain-specific validation.
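One way to act on the point about self-evaluation is to build a review queue that puts every label the model itself flagged at the front, then adds a random audit sample of the labels it rated as accurate, so that self-evaluation guides attention without serving as the only check. The sketch below assumes a hypothetical record format with "id" and "self_eval" fields and a hypothetical audit rate. None of these names or values come from the paper.

```python
import math
import random


def build_review_queue(items: list[dict], audit_rate: float = 0.1,
                       seed: int = 0) -> list[dict]:
    # Labels the model did not rate "accurate" always go to a human first.
    flagged = [it for it in items if it["self_eval"] != "accurate"]
    passed = [it for it in items if it["self_eval"] == "accurate"]
    # Audit a random share of self-rated "accurate" labels as well, because
    # the model's own verdict is not independent validation.
    k = min(len(passed), math.ceil(audit_rate * len(passed)))
    rng = random.Random(seed)  # fixed seed keeps the audit sample reproducible
    return flagged + rng.sample(passed, k)


# Hypothetical labeled items: three flagged by the model, seventeen not.
items = [
    {"id": i, "self_eval": "questionable" if i in (4, 9, 15) else "accurate"}
    for i in range(20)
]
queue = build_review_queue(items)
print([it["id"] for it in queue])
```

The audit rate is a judgment call: a higher rate costs more reviewer time but gives a better estimate of how often the model's "accurate" verdicts are wrong.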
Research Artifacts
Abstract
Publication on The Utility of Large Language Models and Generative AI for Education Research
Related Projects
Using Large Language Models and Generative AI to Scale Qualitative Data Analysis
How can researchers combine qualitative judgment with open-source generative AI to scale thematic analysis without hiding methodological choices?
CAREER: Minds and Machines: Exploring Engineering Faculty Member Mental Models of Generative AI and Instructional Decisions
How do engineering faculty understand generative AI, and how do those mental models shape instructional decisions?
Structures and Machines: An Interdisciplinary Approach to Mapping the Policy Implications of Generative AI in Higher Education
Analyzing existing AI-related policies in higher education across the United States.