RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents
arxiv.org·3d
Dimensions of Vulnerability in Visual Working Memory: An AI-Driven Approach to Perceptual Comparison
arxiv.org·3d
RecUserSim: A Realistic and Diverse User Simulator for Evaluating Conversational Recommender Systems
arxiv.org·2d
The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models
arxiv.org·2d
Multi-Stage Verification-Centric Framework for Mitigating Hallucination in Multi-Modal RAG
arxiv.org·5d
LLM-Crowdsourced: A Benchmark-Free Paradigm for Mutual Evaluation of Large Language Models
arxiv.org·3d
Explainability Through Systematicity: The Hard Systematicity Challenge for Artificial Intelligence
arxiv.org·3d