Machine Learning


Gram Newton-Schulz: A Fast, Hardware-Aware Newton-Schulz Algorithm for Muon

📐 Linear Algebra | Content type: Blog
tridao.me | Hacker News

Exploring the Design Space of Reward Backpropagation for Flow Matching

AI | Content type: Academic
arxiv.org
Less-relevant results

Agentic RL: Token-In, Token-Out Done Right

🎮 Reinforcement Learning

Optimal Rates for Generalization of Gradient Descent Methods with Deep Neural Networks

📐 Optimization Theory | Content type: Academic
arxiv.org

Designing Loops That Prompt Coding Agents: The Six I Actually Run

✍️ Prompt Engineering

Overcoming Rank Collapse in Feedback Alignment

AI | Content type: Academic
arxiv.org

See, Act, Correct: three levers for working with a code agent

🎮 Reinforcement Learning | Content type: Blog

Growing Pains of Starting a Secret Society

📐 Optimization Theory | Content type: Blog

Flatland: The Adventures of Gradient Descent with Large Step Sizes

📐 Optimization Theory | Content type: Academic
arxiv.org

Second-Order Path Kernel Interpolation Formulas in Machine Learning

📐 Optimization Theory | Content type: Academic
arxiv.org

Stein Kernelized Molecular Dynamics for Active Learning of Interatomic Potentials

📐 Optimization Theory | Content type: Academic
arxiv.org

Fourier fractal dimension to predict the generalization of deep neural networks

📐 Optimization Theory | Content type: Academic
arxiv.org

Structured Adaptive Tensor Prediction for Streaming Data

📶 Communications | Content type: Academic
arxiv.org

A Mean-Field Analysis of Multi-Head Self-Attention under Cross-Entropy Training

Transformers | Content type: Academic
arxiv.org

Phantom transitions in language model fine-tuning

💬 LLMs | Content type: Academic
arxiv.org

mingusb/transformer-golf: The Fully Unrolled Transformer: An experimental repository for architecture simplification and compilation. [2026]

Transformers | Content type: Code
github.com | Hacker News

Uniform Stability and Generalization Error of GD and SGD on Fixed-Point Parameters

📐 Optimization Theory | Content type: Academic
arxiv.org

Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

📐 Optimization Theory | Content type: Academic
arxiv.org

Reinforcement Learning for Flow-Matching Policies with Density Transport

AI | Content type: Academic
arxiv.org

An Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization

📐 Optimization Theory | Content type: Academic
arxiv.org
