MedConsultBench: A Full-Cycle, Fine-Grained, Process-Aware Benchmark for Medical Consultation Agents
arxiv.org·1d
🏆LLM Benchmarking
Preview
Report Post

arXiv:2601.12661v1 Announce Type: new Abstract: Current evaluations of medical consultation agents often prioritize outcome-oriented tasks, frequently overlooking the end-to-end process integrity and clinical safety essential for real-world practice. While recent interactive benchmarks have introduced dynamic scenarios, they often remain fragmented and coarse-grained, failing to capture the structured inquiry logic and diagnostic rigor required in professional consultations. To bridge this gap, we propose MedConsultBench, a comprehensive framework designed to evaluate the complete online consultation cycle by covering the entire clinical workflow from history taking and diagnosis to treatment planning and follow-up Q&A. Our methodology introduces Atomic Information Units (AIUs) to track c…

Similar Posts

Loading similar posts...

Keyboard Shortcuts

Navigation
Next / previous item
j/k
Open post
oorEnter
Preview post
v
Post Actions
Love post
a
Like post
l
Dislike post
d
Undo reaction
u
Recommendations
Add interest / feed
Enter
Not interested
x
Go to
Home
gh
Interests
gi
Feeds
gf
Likes
gl
History
gy
Changelog
gc
Settings
gs
Browse
gb
Search
/
General
Show this help
?
Submit feedback
!
Close modal / unfocus
Esc

Press ? anytime to show this help