SliceMoE: Routing Embedding Slices Instead of Tokens for Fine-Grained and Balanced Transformer Scaling
arxiv.org·5d
Mitigating Premature Exploitation in Particle-based Monte Carlo for Inference-Time Scaling
arxiv.org·4d
GPT-5-Codex is a better AI researcher than me
seangoedecke.com·5d
Pathology-CoT: Learning Visual Chain-of-Thought Agent from Expert Whole Slide Image Diagnosis Behavior
arxiv.org·5d