Filtering with Self-Attention and Storing with MLP: One-Layer Transformers Can Provably Acquire and Extract Knowledge
arxiv.org·18h
SynAdapt: Learning Adaptive Reasoning in Large Language Models via Synthetic Continuous Chain-of-Thought
arxiv.org·1d
Rein++: Efficient Generalization and Adaptation for Semantic Segmentation with Vision Foundation Models
arxiv.org·18h
A Formal Framework for the Definition of 'State': Hierarchical Representation and Meta-Universe Interpretation
arxiv.org·18h
MAP: Mitigating Hallucinations in Large Vision-Language Models with Map-Level Attention Processing
arxiv.org·18h
When Truth Is Overridden: Uncovering the Internal Origins of Sycophancy in Large Language Models
arxiv.org·18h