Using Large Language Models to Detect Socially Shared Regulation of Collaborative Learning

Abstract: The field of learning analytics has made notable strides in automating the detection of complex learning processes in multimodal data. However, most advancements have focused on individualized problem-solving instead of collaborative, open-ended problem-solving, which may offer both affordances (richer data) and challenges (low cohesion) to behavioral prediction. Here, we extend predictive models to automatically detect socially shared regulation of learning (SSRL) behaviors in collaborative computational modeling environments using embedding-based approaches. We leverage large language models (LLMs) as summarization tools to generate task-aware representations of student dialogue aligned with system logs. These summaries, combined with text-only embeddings, context-enriched embeddings, and log-derived features, were used to train predictive models. Results show that text-only embeddings often achieve stronger performance in detecting SSRL behaviors related to enactment or group dynamics (e.g., off-task behavior or requesting assistance). In contrast, contextual and multimodal features provide complementary benefits for constructs such as planning and reflection. Overall, our findings highlight the promise of embedding-based models for extending learning analytics by enabling scalable detection of SSRL behaviors, ultimately supporting real-time feedback and adaptive scaffolding in collaborative learning environments that teachers value.


Comments:	Short research paper accepted at Learning Analytics and Knowledge (LAK ’26)
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2601.04458 [cs.LG]
	(or arXiv:2601.04458v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2601.04458 arXiv-issued DOI via DataCite (pending registration)
Related DOI:	https://doi.org/10.1145/3785022.3785083

Submission history

From: Conrad Borchers [v1] Thu, 8 Jan 2026 00:30:46 UTC (1,556 KB)
