DoVer: Intervention-Driven Auto Debugging for LLM Multi-Agent Systems
arxiv.org·2d

Abstract: Large language model (LLM)-based multi-agent systems are challenging to debug because failures often arise from long, branching interaction traces. The prevailing practice is to leverage LLMs for log-based failure localization, attributing errors to a specific agent and step. However, this paradigm has two key limitations: (i) log-only debugging lacks validation, producing untested hypotheses, and (ii) single-step or single-agent attribution is often ill-posed, as we find that multiple distinct interventions can independently repair the failed task. To address the first limitation, we introduce DoVer, an intervention-driven debugging framework, which augments hypot…