Journal Article

Using generative AI for large-scale qualitative analysis of social media posts to understand why people leave computer science

Amanda Ross, Andrew Katz

Journal of Engineering Education, 2025 (Featured)

Contribution Summary

The paper connects a scalable AI-assisted qualitative workflow with a substantive account of computer science attrition across education and career contexts.

Draft enrichment generated from extracted publication text; pending human review.

Plain-Language Summary

This study uses Reddit posts and a generative-AI-assisted qualitative workflow to examine why people leave computer science across academic, transitional, and professional stages. The analysis narrows more than 10,000 scraped posts to 263 relevant posts, then uses AI-supported thematic analysis and human interpretation through social cognitive career theory.

Research Question

What reasons and external or contextual factors influence individuals' decisions to leave computer science across different departure stages?

Methods

  • Scraped Reddit posts from 25 subreddits using CS-related and departure-related keywords, yielding 10,384 posts after deduplication.
  • Used generative AI to summarize, filter, label departure stages, extract decision factors, and support codebook generation through the GATOS workflow.
  • Integrated human review and social cognitive career theory to interpret and contextualize the AI-generated themes.
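The collection step in the first bullet can be illustrated with a minimal sketch. It assumes posts are already in memory, that a post is kept only when it matches at least one CS-related and one departure-related keyword, and that duplicates are identified by post ID. The keyword lists, the `Post` fields, the subreddit names, and the helper names are all hypothetical placeholders. The paper's actual subreddits and keywords are documented in its appendices, and its matching and deduplication rules may differ.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword lists for illustration only; not the paper's terms.
CS_TERMS = ["computer science", "software engineering", "programming"]
DEPARTURE_TERMS = ["leaving", "quit", "switching out", "dropping out"]


@dataclass(frozen=True)
class Post:
    post_id: str      # assumed unique identifier for a post
    subreddit: str    # placeholder names below, not the paper's 25 subreddits
    text: str


def mentions_any(text: str, terms: list[str]) -> bool:
    """Return True if any term appears in the text as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms
    )


def keyword_match(post: Post) -> bool:
    """Assumed rule: require both a CS-related and a departure-related keyword."""
    return mentions_any(post.text, CS_TERMS) and mentions_any(
        post.text, DEPARTURE_TERMS
    )


def deduplicate(posts: list[Post]) -> list[Post]:
    """Keep the first occurrence of each post ID, preserving order."""
    seen: set[str] = set()
    unique: list[Post] = []
    for post in posts:
        if post.post_id not in seen:
            seen.add(post.post_id)
            unique.append(post)
    return unique


def collect(posts: list[Post]) -> list[Post]:
    """Filter by keywords, then drop posts returned by more than one search."""
    return deduplicate([p for p in posts if keyword_match(p)])


# Toy records invented for this sketch; they are not data from the study.
sample = [
    Post("a1", "example_sub_a", "Thinking about leaving software engineering for nursing."),
    Post("a1", "example_sub_b", "Thinking about leaving software engineering for nursing."),
    Post("b2", "example_sub_c", "I quit my barista job to study biology."),
    Post("c3", "example_sub_d", "Dropping out of my computer science degree after failing calc."),
]

kept = collect(sample)
print([p.post_id for p in kept])
```

In this sketch the later AI-assisted stages (summarizing, relevance filtering, stage labeling, and factor extraction) would operate on the output of `collect`. Keyword matching alone is deliberately permissive, which is consistent with the study narrowing 10,384 posts to 263 relevant ones in the subsequent filtering step.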

Key Findings

  • Reasons for leaving included job dissatisfaction, interest in other fields, psychological or emotional factors, academic struggles, health and well-being concerns, and industry issues.
  • Decision factors included personal background, transition requirements, the nature of alternative careers, and personal circumstances.
  • The same broad reasons and factors appeared across departure stages, although their emphasis varied by stage.

Implications

Retention work in computer science should address workplace conditions, academic pathways, psychological well-being, and transition barriers rather than treating attrition as a single-stage pipeline problem.

Social media data can surface candid accounts of career decisions that may be difficult to collect through interviews alone.

AI-assisted qualitative workflows can support large-scale analysis when paired with human interpretation and theoretical framing.

Research Artifacts

  • Protocol: Reddit scraping and analysis prompts. Appendices document subreddit selection and prompts for summarizing, filtering, labeling, extracting decision factors, generating codes, and identifying themes.
  • Appendix: Final codebook. The paper appendix reports the final codebook used to organize reasons for leaving and decision factors.

Abstract

This study uses generative AI for large-scale qualitative analysis of over 10,000 Reddit posts to understand the diverse reasons people leave computer science, such as job dissatisfaction, and the factors that influence that decision at different stages of departure.

Related Projects