When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering
arxiv.org·3d

Abstract: Embodied Question Answering (EQA) requires an agent to interpret language, perceive its environment, and navigate within 3D scenes to produce responses. Existing EQA benchmarks assume that every question must be answered, but embodied agents should know when they do not have sufficient information to answer. In this work, we focus on a minimal requirement for EQA agents, abstention: knowing when to withhold an answer. From an initial study of 500 human queries, we find that 32.4% contain missing or underspecified context. Drawing on this initial study and cognitive theories of human communication errors, we derive five representative categories requiring abstention: …
