Optimization

Convex Optimization, Loss Functions, Gradient Methods, Adam Optimizer

Multilevel Stochastic Gradient Descent for Risk-Averse PDE-Constrained Optimization

 🎲Probability  Content type: Academic
arxiv.org·

From SGD to Muon: An Incremental Tutorial (Fable-5)

 🧠Neural Networks  Content type: Blog

Adaptive Learning Rates with Surrogate Probability for Follow-the-Perturbed-Leader

 🎯RLHF  Content type: Academic
arxiv.org·

A Mean-Field Analysis of Multi-Head Self-Attention under Cross-Entropy Training

 Transformers  Content type: Academic
arxiv.org·

A Theory on Flow Matching with Neural Networks

 🤖AI  Content type: Academic
arxiv.org·

An Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization

 🤖Machine Learning  Content type: Academic
arxiv.org·

Uniform Stability and Generalization Error of GD and SGD on Fixed-Point Parameters

 🤖Machine Learning  Content type: Academic
arxiv.org·

Forward-Only Convolutional Neural Networks with Learnable Channel-Class Assignment

 🧠Deep Learning  Content type: Academic
arxiv.org·

Optimal Rates for Generalization of Gradient Descent Methods with Deep Neural Networks

 🧠Deep Learning  Content type: Academic
arxiv.org·

Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

 🤖Machine Learning  Content type: Academic
arxiv.org·

Flatland: The Adventures of Gradient Descent with Large Step Sizes

 🧠Deep Learning  Content type: Academic
arxiv.org·

Second-Order Path Kernel Interpolation Formulas in Machine Learning

 🤖Machine Learning  Content type: Academic
arxiv.org·

Revisiting Privacy Amplification by Subsampling in Selective Release DPSGD

 🤖Machine Learning  Content type: Academic
arxiv.org·

Predictive Coding with Bayesian Priors via Proximal Gradients

 🎲Probability  Content type: Academic
arxiv.org·

When Both Layers Learn: Training Dynamics of Representing Linear Models via ReLU Networks

 🧠Neural Networks  Content type: Academic
arxiv.org·

Projected Inverse Iteration: An Eigenvalue Approach to Ground-State Computation with Neural Quantum States

 🧠Deep Learning  Content type: Academic
arxiv.org·

Noise-Adaptive High-Probability Regret Bounds for Online Convex Optimization

 🎲Stochastic Processes  Content type: Academic
arxiv.org·

Gradient descent at the Edge of Stability: free energy model and kinetic description of the two-layer network

 🧠Neural Networks  Content type: Academic
arxiv.org·

Characterizing Learning Dynamics under Relative Reparameterization of Singular Models

 🤖AI  Content type: Academic
arxiv.org·

Fourier fractal dimension to predict the generalization of deep neural networks

 🤖AI  Content type: Academic
arxiv.org·
