Optimization


Multilevel Stochastic Gradient Descent for Risk-Averse PDE-Constrained Optimization

Probability | Content type: Academic
arxiv.org

From SGD to Muon: An Incremental Tutorial (Fable-5)

Neural Networks | Content type: Blog
sankalp.bearblog.dev

Adaptive Learning Rates with Surrogate Probability for Follow-the-Perturbed-Leader

RLHF | Content type: Academic
arxiv.org

A Theory on Flow Matching with Neural Networks

AI | Content type: Academic
arxiv.org

Forward-Only Convolutional Neural Networks with Learnable Channel-Class Assignment

Deep Learning | Content type: Academic
arxiv.org

An Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization

Machine Learning | Content type: Academic
arxiv.org

Uniform Stability and Generalization Error of GD and SGD on Fixed-Point Parameters

Machine Learning | Content type: Academic
arxiv.org

Optimal Rates for Generalization of Gradient Descent Methods with Deep Neural Networks

Deep Learning | Content type: Academic
arxiv.org

Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

Machine Learning | Content type: Academic
arxiv.org

Flatland: The Adventures of Gradient Descent with Large Step Sizes

Deep Learning | Content type: Academic
arxiv.org

Second-Order Path Kernel Interpolation Formulas in Machine Learning

Machine Learning | Content type: Academic
arxiv.org

Revisiting Privacy Amplification by Subsampling in Selective Release DPSGD

Machine Learning | Content type: Academic
arxiv.org

Predictive Coding with Bayesian Priors via Proximal Gradients

Probability | Content type: Academic
arxiv.org

When Both Layers Learn: Training Dynamics of Representing Linear Models via ReLU Networks

Neural Networks | Content type: Academic
arxiv.org

Projected Inverse Iteration: An Eigenvalue Approach to Ground-State Computation with Neural Quantum States

Deep Learning | Content type: Academic
arxiv.org

Noise-Adaptive High-Probability Regret Bounds for Online Convex Optimization

Stochastic Processes | Content type: Academic
arxiv.org

Gradient descent at the Edge of Stability: free energy model and kinetic description of the two-layer network

Neural Networks | Content type: Academic
arxiv.org

Characterizing Learning Dynamics under Relative Reparameterization of Singular Models

AI | Content type: Academic
arxiv.org

Fourier fractal dimension to predict the generalization of deep neural networks

AI | Content type: Academic
arxiv.org

When Do Fewer Coordinates Suffice in DP-SGD?

Machine Learning | Content type: Academic
arxiv.org
