Contribution Summary
The paper offers evidence about both the promise and limits of using local LLMs as rubric raters, while also extending feedback literacy theory to peer feedback in first-year engineering.
Plain-Language Summary
This study tests whether a local open-source large language model can apply a feedback quality rubric to first-year engineering peer feedback comments. It uses that methodological test to examine what students' comments reveal about feedback literacy and self-regulated learning.
Research Question
How reliably can an open-source local LLM apply a feedback quality rubric to engineering student peer feedback comments, and what do those ratings reveal about first-year students' feedback quality?
Methods
- Applied a four-criterion feedback quality rubric to 295 peer feedback comments from a first-year engineering course.
- Piloted multiple open-source local models and used qwen2.5-32b to rate comments against the rubric criteria.
- Compared model ratings with researcher ratings using quadratically weighted Cohen's kappa, then analyzed the resulting scores and comment typology.
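The agreement statistic named in the last step can be illustrated with a minimal sketch. It assumes ordinal rubric scores and two raters (for example, one researcher and the model). The function name, the three-level scale, and the sample scores are hypothetical and are not taken from the study; the paper's actual scale and tooling may differ. Quadratic weights penalize a disagreement by the squared distance between the two scores, so a two-level gap counts four times as much as a one-level gap.

```python
def quadratic_weighted_kappa(rater_a, rater_b, categories):
    """Cohen's kappa with quadratic disagreement weights for two raters.

    `categories` is the ordered list of possible scores (lowest to highest).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")

    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint counts: observed[i][j] = items scored i by A and j by B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters' marginals.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        # Both raters used one and the same category; kappa is undefined.
        return float("nan")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical scores on an assumed 0-2 scale, for illustration only.
human_scores = [0, 1, 2, 1, 0, 2]
model_scores = [0, 1, 1, 1, 0, 2]
print(round(quadratic_weighted_kappa(human_scores, model_scores, [0, 1, 2]), 3))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. In practice the statistic would be computed separately for each rubric criterion.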
Key Findings
- The LLM produced coherent ratings but was less reliable than human researchers overall, especially on more subjective criteria.
- The model reached strong agreement on the Action criterion and behaved in ways similar to a novice rater applying a new rubric.
- Students' feedback comments were generally low to medium quality, with constructive elements such as performance gaps and actionable next steps appearing less often than positive descriptions.
Implications
LLM-supported qualitative rating requires careful rubric design, prompt iteration, pilot testing, and human oversight before substantive claims are made.
Local open-source models can support privacy-sensitive educational research workflows, but model choice and compute requirements remain important constraints.
Engineering instructors should explicitly teach students how to write feedback that includes specific behaviors, performance gaps, and actionable next steps.
Research Artifacts
Abstract
This study investigates the reliability and utility of using generative AI to apply a feedback quality rubric to peer feedback comments from first-year engineering students, and explores how LLMs can assist in qualitative analysis.
Related Projects
Using Large Language Models and Generative AI to Scale Qualitative Data Analysis
How can researchers combine qualitative judgment with open-source generative AI to scale thematic analysis without hiding methodological choices?
CAREER: Minds and Machines: Exploring Engineering Faculty Member Mental Models of Generative AI and Instructional Decisions
How do engineering faculty understand generative AI, and how do those mental models shape instructional decisions?