Authors: Namrata Srivastava and Clayton Cohn, Vanderbilt University
Key Ideas
- Conversational agents must be pedagogically grounded, not just technologically advanced
- Learning theories guide when and how agents should scaffold
- Theory-aligned agents support adaptive, trustworthy classroom interactions
Adaptive scaffolding, or support that changes based on what learners need at any given moment, has long been a goal of educational technology. In real classrooms, however, delivering timely, personalized support is hard. Teachers must interpret student speech, actions, and collaboration patterns across multiple groups—often simultaneously.
Conversational agents (CAs) powered by large language models (LLMs) offer new opportunities to support students during open-ended learning activities. However, fluency alone does not make CAs educationally useful. In our work as researchers at Vanderbilt University, we focus on ensuring that CAs interact with learners in ways that are pedagogically sound and consistent with established learning theories, rather than simply generating contextually plausible responses.
From Conversational Ability to Pedagogical Intent
Dialogue-based learning systems have a long history in education. Early intelligent tutors and teachable agents demonstrated how conversation can prompt explanation, reflection, and self-regulation. These systems were deeply grounded in learning theory but were often limited by rigid dialogue scripts.
LLMs remove many of these linguistic constraints, such as fixed dialogue scripts and rule-based responses, enabling agents to respond flexibly to open-ended student input. However, this flexibility also introduces risk. Without theoretical grounding, agents may intervene at inappropriate times, provide overly directive feedback, or undermine students’ critical thinking opportunities. Our work addresses this concern by explicitly aligning agent interactions with learning science principles.
Grounding Agent Interactions in Learning Theory
We focus on configuring the architecture of CAs so that their output reflects how learning is known to occur. In our work within open-ended learning environments, we use the following learning theories to shape what the agent attends to, when it intervenes, and how it frames its responses.
- Evidence-Centered Design (ECD) guides how learning evidence is interpreted in light of teachers’ instructional priorities and curricular goals. Student actions and dialogue are treated as evidence of underlying knowledge, skills, and problem-solving strategies, helping constrain agent responses to instructionally relevant goals.
- The Zone of Proximal Development (ZPD) informs the timing and level of support. The agent is configured to respond to observable indicators of difficulty or progress, aiming to support learners just beyond what they can do independently, without prematurely stepping in.
- Social Cognitive Theory (SCT) and Socially Shared Regulation of Learning (SSRL) shape how support is delivered. Agent prompts are designed to encourage goal setting, reflection, strategy use, and self-efficacy, while also supporting collaborative processes such as equitable participation and shared monitoring.
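To make the ZPD-informed timing concrete, here is a minimal sketch of how such a policy might be encoded. All names, indicators, and thresholds are illustrative assumptions, not our deployed implementation:

```python
from dataclasses import dataclass

@dataclass
class GroupState:
    """Hypothetical snapshot of observable indicators for one student group."""
    minutes_since_progress: float  # time since the last meaningful model change
    failed_attempts: int           # consecutive unsuccessful simulation runs
    on_task_dialogue: bool         # group is still actively discussing the task

def should_scaffold(state: GroupState,
                    stall_threshold: float = 5.0,
                    attempt_threshold: int = 3) -> bool:
    """ZPD-style timing rule: intervene only when indicators suggest the
    group has moved beyond what it can resolve independently.
    Thresholds here are illustrative, not empirically calibrated."""
    if state.on_task_dialogue and state.minutes_since_progress < stall_threshold:
        return False  # productive struggle: hold back
    return (state.minutes_since_progress >= stall_threshold
            or state.failed_attempts >= attempt_threshold)
```

The key design choice is the early return: active, on-task discussion suppresses intervention even when errors accumulate, so the agent does not step in prematurely.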
A Theory-Informed CA Architecture
We are studying CAs in open-ended STEM+C environments (such as C2STEM) using a multi-agent framework organized around three tightly connected components: a Learner Model, a Dialogue Manager, and a Domain Knowledge Base, coordinated through LLMs.
The Domain Knowledge Base grounds agent responses in vetted curricular content. Using retrieval-augmented generation (RAG), it stores and retrieves domain concepts and adapts scaffolding as students progress. This grounding helps reduce hallucination, preserve transparency, and ensure that agent feedback remains instructionally meaningful.
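The retrieval-then-compose pattern can be sketched as follows. This toy version uses bag-of-words similarity over a hypothetical two-entry curriculum; a production RAG system would use embedding search over a vetted corpus:

```python
import math
from collections import Counter

CURRICULUM = {  # hypothetical vetted curricular snippets, keyed by concept
    "velocity": "velocity is the rate of change of position with respect to time",
    "acceleration": "acceleration is the rate of change of velocity over time",
}

def _bow(text: str) -> Counter:
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list:
    """Return the k snippets most similar to the query (toy retrieval step)."""
    q = _bow(query)
    ranked = sorted(CURRICULUM.values(),
                    key=lambda s: _cosine(q, _bow(s)), reverse=True)
    return ranked[:k]

def grounded_prompt(student_question: str) -> str:
    """Compose an LLM prompt constrained to the retrieved vetted content."""
    context = "\n".join(retrieve(student_question))
    return ("Using ONLY the curricular context below, help the student.\n"
            f"Context:\n{context}\nStudent: {student_question}")
```

Constraining the generation prompt to retrieved, vetted snippets is what makes the grounding auditable: a teacher can inspect exactly which curricular content shaped a given response.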
The Learner Model represents students’ evolving understanding, strategies, and collaboration patterns. Rather than focusing only on conceptual understanding, it integrates multimodal learning analytics, such as system logs and student dialogue, to detect inflection points. These moments indicate potential difficulty, disengagement, or breakdowns in regulation. Importantly, the learner model also helps distinguish productive persistence from stagnation, preventing over-scaffolding.
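The distinction between productive persistence and stagnation might be heuristically approximated from a window of log events, as in this sketch. The event names and the diversity heuristic are assumptions for illustration only:

```python
def classify_window(events: list) -> str:
    """Classify a window of hypothetical log events.
    Varied actions suggest productive persistence; repeating the same
    action with no milestone suggests stagnation."""
    if not events:
        return "disengaged"          # no activity at all
    if "milestone_reached" in events:
        return "progressing"         # observable progress in the window
    diversity = len(set(events)) / len(events)
    # Many retries of one action with no milestone reads as being stuck
    return "persisting" if diversity > 0.5 else "stagnating"
```

In practice this signal would be fused with dialogue features (e.g., whether the group is still reasoning aloud), but even a coarse classifier like this shows how the learner model can prevent over-scaffolding by recognizing persistence.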
The Dialogue Manager orchestrates interaction between the learner model, the LLM agents, and the knowledge base. It maintains conversational context, applies pedagogical policies, and respects teacher preferences. When intervention is appropriate, it composes scaffolds that are actionable, developmentally appropriate, and aligned with student needs.
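The gating step of such an orchestrator might look like the sketch below, where the learner-state labels, preference fields, and template keys are all hypothetical stand-ins:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TeacherPrefs:
    """Hypothetical teacher-configurable policy settings."""
    allow_direct_hints: bool = False   # teacher may disable directive feedback
    max_interventions_per_task: int = 3

def plan_turn(learner_state: str, interventions_so_far: int,
              prefs: TeacherPrefs) -> Optional[str]:
    """Decide whether to intervene and at what level, deferring to teacher
    preferences. Returns a scaffold-template key, or None to stay silent
    (e.g., during productive struggle)."""
    if interventions_so_far >= prefs.max_interventions_per_task:
        return None  # respect the teacher's intervention budget
    if learner_state in ("progressing", "persisting"):
        return None  # step back and allow productive struggle
    if learner_state == "disengaged":
        return "metacognitive_reengage"  # e.g., prompt goal setting
    if learner_state == "stagnating":
        return "direct_hint" if prefs.allow_direct_hints else "reflective_question"
    return None
```

Returning a template key rather than free text is the point: the LLM fills in wording, but the *kind* of scaffold is fixed by pedagogical policy and teacher preference, not left to the model.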
Across interactions, the agent supports learners at both cognitive and metacognitive levels, while also recognizing when to step back and allow productive struggle. Support is gradually faded as students demonstrate increased competence and confidence.
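Fading could be modeled as movement along an ordered ladder of scaffold levels, as in this small sketch (the level names and one-step update rule are illustrative assumptions):

```python
# Ordered from most to least directive; "none" means full independence.
LEVELS = ["worked_example", "direct_hint", "guiding_question", "none"]

def fade(level_idx: int, succeeded_independently: bool) -> int:
    """Move one step toward 'none' after independent success,
    one step back toward more support after failure.
    Returns the new index into LEVELS."""
    if succeeded_independently:
        return min(level_idx + 1, len(LEVELS) - 1)
    return max(level_idx - 1, 0)
```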
Toward Pedagogically Sound Classroom Agents
Our goal is not to replace teachers or automate instruction. Instead, we aim to ensure that conversational agents behave in ways educators would recognize as instructionally appropriate, supportive, and aligned with how learning unfolds in real classrooms. By aligning agent interactions with established learning theories, we aim to support adaptive, trustworthy, and classroom-ready AI that complements teachers’ expertise.