Journal Article

Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy

A. Katz, S. Wei, G. Nanda, C. Brinton, M. Ohland

arXiv preprint, 2023

Contribution Summary

The paper provides early empirical evidence on using generative text models for deductive labeling of open-ended teamwork feedback.

Draft enrichment generated from extracted publication text; pending human review.

Plain-Language Summary

This study tests whether GPT-3.5 can classify open-ended student teamwork feedback using an existing taxonomy. It also examines whether the model can assess the accuracy of its own labels so instructors or researchers can focus human review on questionable cases.

Research Question

How well does GPT-3.5 match human labels when classifying student teamwork feedback into a predetermined taxonomy, and how well do its self-rated accuracy scores correspond to human evaluations?

Methods

  • Sampled 200 student teamwork feedback comments and prompted GPT-3.5-turbo to apply labels from an existing teamwork feedback taxonomy.
  • Had researchers rate the model's labels as accurate, unclear, or inaccurate.
  • Prompted the model to rate the accuracy of its own labels on a ten-point scale and compared those ratings with human judgments.
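The two prompting steps above can be sketched as plain string templates. This is a minimal illustration, not the authors' actual prompts: the label names in EXAMPLE_LABELS, the prompt wording, the helper names, and the assumption that the ten-point scale runs from 1 to 10 are all hypothetical. The call to the model itself is deliberately left out.

```python
import re
from typing import List, Optional

# Hypothetical labels for illustration only; the study's taxonomy is not reproduced here.
EXAMPLE_LABELS = ["contributing to the team", "communicating with teammates", "keeping the team on track"]


def build_labeling_prompt(comment: str, labels: List[str]) -> str:
    """Build a deductive-coding prompt: the model must choose from a fixed label list."""
    label_lines = "\n".join(f"- {label}" for label in labels)
    return (
        "Assign one or more labels from the list below to the student comment.\n"
        f"Labels:\n{label_lines}\n"
        f"Comment: {comment}\n"
        "Answer with label names only."
    )


def build_self_check_prompt(comment: str, assigned: List[str]) -> str:
    """Build a follow-up prompt asking the model to rate its own labels (1 to 10 assumed)."""
    return (
        f"Comment: {comment}\n"
        f"Assigned labels: {', '.join(assigned)}\n"
        "On a scale of 1 to 10, how accurate are these labels? "
        "Answer with a single integer."
    )


def parse_self_rating(reply: str) -> Optional[int]:
    """Return the first standalone integer from 1 to 10 in the reply, or None if there is none."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    return int(match.group(1)) if match else None
```

Keeping the prompt builders separate from the model call makes it easy to log exactly what was sent for each comment, which helps when human raters later judge a label as unclear or inaccurate.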

Key Findings

  • Researchers judged 85% of the model-generated labels as accurate, 8% as unclear, and 7% as inaccurate.
  • The model handled many semantically similar comments well, but sometimes defaulted to the first label or missed negative sentiment.
  • The self-checking step tended to flag some questionable labels, suggesting a possible triage workflow for human review.
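The triage idea in the last finding can be sketched as a simple threshold on the model's self-rating. This is an assumption-laden illustration rather than the paper's procedure: the LabeledComment structure, the triage function, the threshold of 7, and the sample records are all invented for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LabeledComment:
    comment: str
    label: str
    self_rating: int  # model's self-rated accuracy; a 1 to 10 scale is assumed


def triage(
    items: List[LabeledComment], threshold: int = 7
) -> Tuple[List[LabeledComment], List[LabeledComment]]:
    """Split items into (needs_review, accepted) using the self-rating threshold."""
    needs_review = [item for item in items if item.self_rating < threshold]
    accepted = [item for item in items if item.self_rating >= threshold]
    return needs_review, accepted


# Invented sample records, not data from the study.
sample = [
    LabeledComment("Always finished her part early.", "contributing to the team", 9),
    LabeledComment("He could have done more, I guess.", "contributing to the team", 4),
    LabeledComment("Good at keeping us organized.", "keeping the team on track", 8),
]

needs_review, accepted = triage(sample)
for item in needs_review:
    print(f"REVIEW ({item.self_rating}/10): {item.comment} -> {item.label}")
```

The threshold is a tuning choice: a higher value sends more labels to human reviewers, while a lower value saves effort at the risk of passing through mislabeled negative or ambiguous comments.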

Implications

Generative models can help classify open-ended student comments without forcing students into closed-response formats.

Human judgment remains necessary for ambiguous, negative, or context-dependent feedback comments.

Future systems should test label ordering, sentiment handling, privacy constraints, and local model alternatives.
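One way to probe the label-ordering concern is to present the same comment with the label list shuffled and count how often each label is chosen. The sketch below is hypothetical: order_sensitivity and both stub classifiers are illustrative names, and a real test would replace the stubs with an actual model call.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence

Classifier = Callable[[str, List[str]], str]


def order_sensitivity(
    comment: str,
    labels: Sequence[str],
    classify: Classifier,
    trials: int = 20,
    seed: int = 0,
) -> Counter:
    """Classify the same comment under shuffled label orders and tally the chosen labels."""
    rng = random.Random(seed)  # seeded so the shuffles are repeatable
    counts: Counter = Counter()
    for _ in range(trials):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        counts[classify(comment, shuffled)] += 1
    return counts


def first_label_stub(comment: str, labels: List[str]) -> str:
    """Stand-in for a position-biased model: always returns the first label shown."""
    return labels[0]


def stable_stub(comment: str, labels: List[str]) -> str:
    """Stand-in for an order-insensitive model: returns the same label regardless of order."""
    return "communication"


labels = ["contribution", "communication", "organization"]
biased = order_sensitivity("She replied to every message.", labels, first_label_stub)
stable = order_sensitivity("She replied to every message.", labels, stable_stub)
print("biased:", dict(biased))
print("stable:", dict(stable))
```

A classifier whose tally spreads across several labels as the order changes is likely reacting to position rather than content, so those comments are good candidates for human review.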

Research Artifacts

Protocol: Teamwork feedback labeling prompt. Prompting workflow for applying an existing taxonomy to open-ended student teamwork comments.
Protocol: Model self-check prompt. Prompting workflow for asking the model to rate the accuracy of its own generated labels.

Abstract

Publication on Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy

Related Projects