ViBES: A Conversational Agent with Behaviorally-Intelligent 3D Virtual Body
arxiv.org·2d

Abstract: Human communication is inherently multimodal and social: words, prosody, and body language jointly carry intent. Yet most prior systems model human behavior as a translation task (co-speech gesture or text-to-motion) that maps a fixed utterance to motion clips, without requiring agentic decision-making about when to move, what to do, or how to adapt across multi-turn dialogue. This leads to brittle timing, weak social grounding, and fragmented stacks where speech, text, and motion are trained or inferred in isolation. We introduce ViBES (Voice in Behavioral Expression and Synchrony), a conversational 3D agent that jointly plans language and movement and executes dialogue-…
