Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought
arxiv.org·23h
The Debate on RLVR Reasoning Capability Boundary: Shrinkage, Expansion, or Both? A Two-Stage Dynamic View
arxiv.org·23h
Feasibility-Aware Decision-Focused Learning for Predicting Parameters in the Constraints
arxiv.org·23h
PsycholexTherapy: Simulating Reasoning in Psychotherapy with Small Language Models in Persian
arxiv.org·23h