⛰️ Gradient Descent - pfh · Scour

Gradient Compression May Hurt Generalization: A Remedy by Synthetic Data Guided Sharpness Aware Minimization

arxiv.org·21h

📐Matrix Factorization

Unifying Stable Optimization and Reference Regularization in RLHF

arxiv.org·21h

🎪Convex Optimization

Possible identification of the Luna 9 Moon landing site using a novel machine learning algorithm

nature.com·5h

Optimization of interpretable hydropower reservoir operation rules by denoising diffusion probabilistic model, parallel chaotic cooperation search algorithm and...

sciencedirect.com·8h

🔗Markov Chains

AI’s Inner Workings Revealed By Model Trained On One Billion Data Points

quantumzeitgeist.com·1d

🗺️Manifold Learning

THUDM/slime: slime is an LLM post-training framework for RL Scaling.

github.com·1h

🦠Whole cell model

LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling

huggingface.co·1d

How Andrej Karpathy Built a Working Transformer in 243 Lines of Code

analyticsvidhya.com·1d

BalatroBench Benchmarks Large Language Models Playing Balatro

balatrobench.com·15h

Building a Robust Classifier with Stacked Generalization

dev.to·3d

Diffusion Models for ARC-AGI: A Retrospective

christopherhwood.com·2d

EyesOff: Why Some Models Quantize Better Than Others

ym2132.github.io·2d

Addendum: Data splitting against information leakage with DataSAIL

nature.com·13h

New Generative Paradigm: Drifting Model

mail.bycloud.ai·3d

📊Empirical Bayes

Wahba’s Problem and SO(3) Optimization: Rotation Learning in Geometric ML

hackernoon.com·2d

🎪Convex Optimization

Space Alignment Matters: The Missing Piece for Inducing Neural Collapse in Long-Tailed Learning

sonomarpa.sonoma.lib.ca.us·1d

🗺️Manifold Learning

GPU-Serving Two-Tower Models for Lightweight Ads Engagement Prediction

medium.com·2h

Olmix: A framework for data mixing throughout LM development

allenai.org·10h

📐Computational Geometry

Recursive Language Models: Stop Stuffing the Context Window

nlp.elvissaravia.com·1d

📊Empirical Bayes

Active learning Kriging with functional dimension reduction for reliability analysis of stochastic dynamical systems

sciencedirect.com·1d

🔗Markov Chains
