vector search
HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets via Decision Pathways
arxiv.org · 2d
MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams
arxiv.org · 2d
Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
arxiv.org · 3d
MIND: A Noise-Adaptive Denoising Framework for Medical Images Integrating Multi-Scale Transformer
arxiv.org · 2d
Democratizing Diplomacy: A Harness for Evaluating Any Large Language Model on Full-Press Diplomacy
arxiv.org · 2d
G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation
arxiv.org · 3d