Exploring NLP-based Methods for Generating Engineering Ethics Assessment Qualitative Codebooks

Contribution Summary

The paper offers an early comparison of a human-assisted NLP workflow and a more automated NLP workflow for generating qualitative codebooks in engineering ethics assessment.

Draft enrichment generated from extracted publication text; pending human review.

Plain-Language Summary

This paper compares two NLP-supported approaches for generating qualitative codebooks from engineering ethics reflection data. It uses student responses from a technology ethics course to examine where the themes produced by human-NLP collaboration and by a more automated NLP workflow overlap and where they diverge.

Research Question

How effective are NLP-supported methods for generating a qualitative codebook for student reflections on technology ethics?

Methods

Analyzed open-ended student responses collected across six iterations of a semester-long technology ethics course.
Compared a Human-NLP workflow, where researchers revised model-generated themes, with an Auto-NLP workflow using iterative embedding, clustering, and summarization.
Used Python, sentence-transformer embeddings, agglomerative clustering, and an LLM summarization process to generate candidate themes.
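The embed, cluster, and summarize steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. So that it runs with only the standard library, it substitutes a bag-of-words vector for the sentence-transformer embeddings and a top-terms label for the LLM summarization, and it performs a single pass rather than the iterative loop the paper describes. The function names, the stopword list, the distance threshold, and the sample responses are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are", "i", "how"}

def embed(text):
    """Stand-in embedding: bag-of-words counts (NOT a sentence transformer)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine_distance(u, v):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * \
        math.sqrt(sum(c * c for c in v.values()))
    return 1.0 - dot / norm if norm else 1.0

def agglomerative(vectors, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        dist, x, y = min(
            (linkage(a, b), x, y)
            for x, a in enumerate(clusters)
            for y, b in enumerate(clusters)
            if x < y
        )
        if dist > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

def summarize(texts, k=3):
    """Stand-in for LLM summarization: the k most frequent content terms."""
    counts = Counter()
    for text in texts:
        counts.update(embed(text))
    return [term for term, _ in counts.most_common(k)]

def candidate_themes(responses, threshold):
    """One embed -> cluster -> summarize pass producing candidate themes."""
    vectors = [embed(r) for r in responses]
    themes = []
    for members in agglomerative(vectors, threshold):
        texts = [responses[i] for i in members]
        themes.append({"label": summarize(texts), "responses": texts})
    return themes

# Made-up responses for illustration only; not data from the study.
responses = [
    "Companies have a responsibility to protect user privacy.",
    "Protecting user privacy is a company responsibility.",
    "Technology shapes society and society shapes technology.",
    "I learned how technology and society influence each other.",
]
themes = candidate_themes(responses, threshold=0.6)
for theme in themes:
    print(theme["label"], len(theme["responses"]))
```

Changing `threshold` changes how many candidate themes come out, which is one small example of how a method setting can shape the resulting codebook. In a Human-NLP style workflow, a researcher would then review, rename, merge, or discard these candidates.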

Key Findings

The Human-NLP method produced eight final themes grouped into three overarching themes.
The Auto-NLP method produced twelve final themes grouped into four overarching themes, with substantial overlap across methods.
Both approaches highlighted students' learning about ethics, technology-society connections, and organizational responsibility, while also showing risks around context loss and non-determinism.

Implications

NLP can help researchers make sense of large text corpora, but generated codebooks need researcher review before use.

Method choices such as clustering algorithms, model settings, and human review steps materially affect qualitative outputs.

Engineering ethics assessment can benefit from scalable qualitative workflows that preserve attention to course context and student meaning.

Research Artifacts

Protocol: Human-NLP codebook generation workflow

A human-in-the-loop workflow for generating, reviewing, and revising qualitative themes.

Protocol: Auto-NLP codebook generation workflow

An iterative embedding, clustering, and summarization workflow for automated candidate codebook generation.

Abstract

Publication on Exploring NLP-based Methods for Generating Engineering Ethics Assessment Qualitative Codebooks

Related Projects

Using Large Language Models and Generative AI to Scale Qualitative Data Analysis

How can researchers combine qualitative judgment with open-source generative AI to scale thematic analysis without hiding methodological choices?

Project

EAGER: Natural Language Processing for Teaching and Research in Engineering Education (NLPTREE)

How can NLP methods help engineering education researchers and instructors analyze text-rich learning data responsibly and at scale?

Project

Operationalizing, Validating, and Scaling Health Systems Citizenship Assessment in Undergraduate Medical Education

Developing NLP and AI-based tools to support assessment of health systems citizenship and to characterize medical students' mental models of health systems.

Project
