🧠 LLM Training - spearous · Scour

RL Excursions during Pre-Training: Re-examining Policy Optimization for LLM training

🤖AI Academic

Tracing Eval-Awareness Emergence Through Training of OLMo 3

lesswrong.com·

brunokeymolen/lora: LoRa (Long Range) communication related projects

📡SETI Code

github.com··Hacker News

Machine learning from scratch, what to build before using scikit-learn

🤖Transformers Tutorial

iwtlp.com··DEV

SecLoRA: Secure Aggregation of Low-Rank Matrix Products via Functional Encryption

eprint.iacr.org·

Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond

turingpost.com·

Arcane style - Ideogram 4.0 LORA - Experimental

🤖Transformers

huggingface.co··r/StableDiffusion

Less-relevant results

Generalizable self-supervised learning for imaging flow cytometry on multi-dataset leukocyte differential

🤖AI Academic

Backpropagation Without the Magic: A First-Principles Derivation

🤖AI Blog

Fine tuning classification in Elixir

elixirstatus.com·

A new chapter of efficient foundation models for medical imaging

techcommunity.microsoft.com·

The Era of System 2 AI

🤖AI Blog

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

🧩LeetCode News Blog

developer.nvidia.com·

GPT-2: Too Dangerous To Release (2019)

🤖Transformers Blog

naokishibuya.github.io··Hacker News

If Claude Fable stops helping you, you'll never know

🤖AI Blog

jonready.com··Lobsters, Hacker News

Some Interesting Papers on RLVR

lesswrong.com·

Evolution of crystal field and intra-ionic interactions in ilmenite AIrO₃ (A = Mg, Zn, Cd) and hyperhoneycomb β-ZnIrO...

New comment by perturbation in "Ask HN: Who wants to be hired? (June 2026)"

drive.google.com··Hacker News

When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff

🤖AI Academic

SuperBase Adds GPS, Compass, and OLED Display to Meshtastic

linuxgizmos.com·
