Machine Learning

Agentic RL: Token-In, Token-Out Done Right

🎮 Reinforcement Learning

Designing Loops That Prompt Coding Agents: The Six I Actually Run

✍️ Prompt Engineering

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

AI · Content type: Code
github.com · Hacker News

Phantom transitions in language model fine-tuning

💬 LLMs · Content type: Academic
arxiv.org

Optimal Rates for Generalization of Gradient Descent Methods with Deep Neural Networks

📐 Optimization Theory · Content type: Academic
arxiv.org

Growing Pains of Starting a Secret Society

📐 Optimization Theory · Content type: Blog

See, Act, Correct: three levers for working with a code agent

🎮 Reinforcement Learning · Content type: Blog

Reinforcement Learning for Flow-Matching Policies with Density Transport

AI · Content type: Academic
arxiv.org

Flatland: The Adventures of Gradient Descent with Large Step Sizes

📐 Optimization Theory · Content type: Academic
arxiv.org

Variational Proximal Policy Optimization

🎮 Reinforcement Learning · Content type: Academic
arxiv.org

Second-Order Path Kernel Interpolation Formulas in Machine Learning

📐 Optimization Theory · Content type: Academic
arxiv.org

Learning Dynamics Reveal a Hierarchy of Weight-Induced Layerwise Gram Metrics

📐 Optimization Theory · Content type: Academic
arxiv.org

Predictive Coding with Bayesian Priors via Proximal Gradients

📐 Optimization Theory · Content type: Academic
arxiv.org

Stein Kernelized Molecular Dynamics for Active Learning of Interatomic Potentials

📐 Optimization Theory · Content type: Academic
arxiv.org

Understanding Quantization-Aware Training: Gradients at Quantized Weights Bias to the Low-Loss Basin

📉 Loss Landscapes · Content type: Academic
arxiv.org

princezuda/-RequiemGPT-: Fully open source and open weights built and trained by fable five with one prompt. An experience in how AI actually works

AI · Content type: Code
github.com · Hacker News

Duality for Optimal Multi-Item, Multi-Bidder Auction Design: Revenue Certificates through Deep Learning

📐 Optimization Theory · Content type: Academic
arxiv.org

Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

📐 Optimization Theory · Content type: Academic
arxiv.org

Hybridizing Equilibrium Propagation with Ising Machines for Efficient Energy-Based Learning

AI · Content type: Academic
arxiv.org

An Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization

📐 Optimization Theory · Content type: Academic
arxiv.org
