LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring
arxiv.org·18h
A Formal Framework for the Definition of 'State': Hierarchical Representation and Meta-Universe Interpretation
arxiv.org·18h
Lightweight Backbone Networks Only Require Adaptive Lightweight Self-Attention Mechanisms
arxiv.org·18h
Co-Reward: Self-supervised Reinforcement Learning for Large Language Model Reasoning via Contrastive Agreement
arxiv.org·1d
AI-Educational Development Loop (AI-EDL): A Conceptual Framework to Bridge AI Capabilities with Classical Educational Theories
arxiv.org·18h
Patho-AgenticRAG: Towards Multimodal Agentic Retrieval-Augmented Generation for Pathology VLMs via Reinforcement Learning
arxiv.org·18h
MARS: A Meta-Adaptive Reinforcement Learning Framework for Risk-Aware Multi-Agent Portfolio Management
arxiv.org·18h
EHSAN: Leveraging ChatGPT in a Hybrid Framework for Arabic Aspect-Based Sentiment Analysis in Healthcare
arxiv.org·18h
Quantum-RAG and PunGPT2: Advancing Low-Resource Language Generation and Retrieval for the Punjabi Language
arxiv.org·18h