PilotRL: Training Language Model Agents via Global Planning-Guided Progressive Reinforcement Learning
arxiv.org·19h
Court of LLMs: Evidence-Augmented Generation via Multi-LLM Collaboration for Text-Attributed Graph Anomaly Detection
arxiv.org·19h
MetaExplainer: A Framework to Generate Multi-Type User-Centered Explanations for AI Systems
arxiv.org·19h
Rethinking Evidence Hierarchies in Medical Language Benchmarks: A Critical Evaluation of HealthBench
arxiv.org·19h
Trustworthy Reasoning: Evaluating and Enhancing Factual Accuracy in LLM Intermediate Thought Processes
arxiv.org·3d
From EMR Data to Clinical Insight: An LLM-Driven Framework for Automated Pre-Consultation Questionnaire Generation
arxiv.org·19h
Co-Reward: Self-supervised Reinforcement Learning for Large Language Model Reasoning via Contrastive Agreement
arxiv.org·19h
HateBuffer: Safeguarding Content Moderators' Mental Well-Being through Hate Speech Content Modification
arxiv.org·19h
Model Misalignment and Language Change: Traces of AI-Associated Language in Unscripted Spoken English
arxiv.org·19h