The Utility of Large Language Models and Generative AI for Education Research

Contribution Summary

The paper offers a practical guide to using LLM and NLP workflows for scalable education research while documenting where human verification remains necessary.

Draft enrichment generated from extracted publication text; pending human review.

Plain-Language Summary

This paper demonstrates how embeddings, clustering, summarization, and prompting can be used to analyze large collections of student writing in engineering education research. It uses more than 1,000 student career-interest essays to show both inductive codebook generation and deductive labeling workflows.

Research Question

How can NLP and large language model workflows support scalable thematic analysis of unstructured text data in engineering education research?

Methods

Analyzed 1,014 undergraduate engineering student career-interest essays.
Parsed essays into sentences, embedded the text, clustered semantically similar statements, and generated summaries and labels.
Tested deductive labeling workflows using O*NET SOC job titles and career satisfaction factors, including model self-evaluation of label accuracy.
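
The parse, embed, and cluster steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's pipeline: the word-count `embed` function is a toy stand-in for a real sentence-embedding model, the greedy threshold clustering and its `threshold` value are hypothetical choices, and the example essays are invented. The LLM summarization and labeling step is not shown.

```python
import math
import re
from collections import Counter


def split_sentences(essay):
    """Naive sentence splitter: break after ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", essay) if s.strip()]


def embed(sentence):
    """Toy stand-in for an embedding model (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(sentences, threshold=0.5):
    """Greedy clustering: each sentence joins the first cluster whose seed
    sentence is at least `threshold` similar, otherwise it starts a new one."""
    clusters = []  # list of (seed_vector, member_sentences)
    for sentence in sentences:
        vector = embed(sentence)
        for seed, members in clusters:
            if cosine(vector, seed) >= threshold:
                members.append(sentence)
                break
        else:
            clusters.append((vector, [sentence]))
    return [members for _, members in clusters]


# Invented example essays, for illustration only.
essays = [
    "I want to design bridges. I enjoy working outdoors.",
    "I hope to design bridges and roads. Helping people matters to me.",
]
sentences = [s for essay in essays for s in split_sentences(essay)]
groups = cluster(sentences)
```

In practice the word-count vectors would be replaced by dense embeddings from a language model, and each resulting group would be passed to an LLM for a summary and a candidate label, with a researcher reviewing the output.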

Key Findings

The inductive workflow produced preliminary thematic structure from 9,105 essay sentences and 285 semantic clusters.
Deductive labeling reached 93% agreement with human raters for O*NET SOC accuracy judgments and 86% for career satisfaction factors.
A small but important share of labels remained questionable or in disagreement with human judgment, reinforcing the need for human-in-the-loop validation.
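
Agreement figures like those above are typically simple percent agreement between two sets of judgments on the same items. The sketch below assumes that definition; the function names, the judgment categories, and the sample data are all hypothetical and are not taken from the paper.

```python
def percent_agreement(model_judgments, human_judgments):
    """Share of items (0 to 100) where model and human judgments match."""
    if len(model_judgments) != len(human_judgments):
        raise ValueError("Both lists must cover the same items.")
    if not model_judgments:
        raise ValueError("At least one judgment is required.")
    matches = sum(m == h for m, h in zip(model_judgments, human_judgments))
    return 100.0 * matches / len(model_judgments)


def disagreement_indices(model_judgments, human_judgments):
    """Positions where the two raters differ, for follow-up human review."""
    return [
        i
        for i, (m, h) in enumerate(zip(model_judgments, human_judgments))
        if m != h
    ]


# Invented judgments for five labels (hypothetical categories).
model = ["accurate", "accurate", "questionable", "accurate", "inaccurate"]
human = ["accurate", "accurate", "accurate", "accurate", "inaccurate"]

agreement = percent_agreement(model, human)
to_review = disagreement_indices(model, human)
```

Percent agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be worth reporting alongside it when categories are imbalanced.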

Implications

LLM-assisted methods can help engineering education researchers analyze student writing at a scale that is difficult to manage manually.

Model self-evaluation can help prioritize human review, but it should not be treated as independent validation.

Researchers need to account for model bias, controllability, environmental costs, data type limitations, and domain-specific validation.

Research Artifacts

Protocol: Inductive embedding and clustering workflow. A workflow for clustering student essay sentences and generating thematic summaries.

Appendix: Deductive labeling prompts. Appendix prompts for O*NET SOC and career satisfaction factor labeling.

Abstract

Publication on The Utility of Large Language Models and Generative AI for Education Research

Related Projects

Using Large Language Models and Generative AI to Scale Qualitative Data Analysis

How can researchers combine qualitative judgment with open-source generative AI to scale thematic analysis without hiding methodological choices?


CAREER: Minds and Machines: Exploring Engineering Faculty Member Mental Models of Generative AI and Instructional Decisions

How do engineering faculty understand generative AI, and how do those mental models shape instructional decisions?


Structures and Machines: An Interdisciplinary Approach to Mapping the Policy Implications of Generative AI in Higher Education

Analyzing existing policies related to AI in higher education across the US

