ECLipsE-Gen-Local: Efficient Compositional Local Lipschitz Estimates for Deep Neural Networks
arxiv.org·1d
Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought
arxiv.org·2d
From Neural Activity to Computation: Biological Reservoirs for Pattern Recognition in Digit Classification
arxiv.org·1d
Think Natively: Unlocking Multilingual Reasoning with Consistency-Enhanced Reinforcement Learning
arxiv.org·11h
SliceMoE: Routing Embedding Slices Instead of Tokens for Fine-Grained and Balanced Transformer Scaling
arxiv.org·2d
Loading...Loading more...