Neural Networks

Gradient descent at the Edge of Stability: free energy model and kinetic description of the two-layer network

 🧠Deep Learning  Content type: Academic
arxiv.org

Generalization in Deep Neural Networks: Minimax Rates for Gradient Methods

 🧠Deep Learning  Content type: Academic
arxiv.org

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

 Flash Attention  Content type: Code
github.com · Hacker News

Flatland: The Adventures of Gradient Descent with Large Step Sizes

 🧠Deep Learning  Content type: Academic
arxiv.org

Projected Inverse Iteration: An Eigenvalue Approach to Ground-State Computation with Neural Quantum States

 🧠Deep Learning  Content type: Academic
arxiv.org

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

 💬LLMs  Content type: Academic
arxiv.org

Synthetic Benchmarks Overstate Forward-Forward Scaling: Real-Data Limits of Layer-Local Training

 🤖AI  Content type: Academic
arxiv.org

Learning Dynamics Reveal a Hierarchy of Weight-Induced Layerwise Gram Metrics

 🤖AI  Content type: Academic
arxiv.org

An Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization

 🤖Machine Learning  Content type: Academic
arxiv.org

Multilevel Stochastic Gradient Descent for Risk-Averse PDE-Constrained Optimization

 📈Optimization  Content type: Academic
arxiv.org

Second-Order Path Kernel Interpolation Formulas in Machine Learning

 🤖Machine Learning  Content type: Academic
arxiv.org

DBHN-Net: Dual-Branch Hybrid Neural Network For Low-Complexity Monaural Speech Enhancement

 🤖AI  Content type: Academic
arxiv.org

Predictive Coding with Bayesian Priors via Proximal Gradients

 🎲Probability  Content type: Academic
arxiv.org

Quantifying Uncertainty In Wide Two-Layer Neural Networks: On The Law Of The Limiting Fluctuation Process

 🤖AI  Content type: Academic
arxiv.org

Fourier fractal dimension to predict the generalization of deep neural networks

 🤖AI  Content type: Academic
arxiv.org

Pretraining Recurrent Networks without Recurrence

 🤖AI  Content type: Academic
arxiv.org

Uniform Stability and Generalization Error of GD and SGD on Fixed-Point Parameters

 📈Optimization  Content type: Academic
arxiv.org

Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

 🤖Machine Learning  Content type: Academic
arxiv.org

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

 🤖AI  Content type: Academic
arxiv.org

AI from concrete to abstract: demystifying artificial intelligence to the general public

 🤖AI  Content type: Academic
arxiv.org
