🎯 Post-training - samveed · Scour

Emergence of Context Characteristics Sensitivity in Large Language Models

🌐World Models Academic

Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond

turingpost.com

[NEW MODEL] SupraLabs just released Supra1.5-50M Base (Experimental)!

🏋️Pretraining

huggingface.co · r/LocalLLaMA

Tracing Eval-Awareness Emergence Through Training of OLMo 3

🏋️Pretraining

lesswrong.com

The week AI infrastructure crossed from a technology story to a financial one

💬LLMs News

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

📊ML Code

github.com · Hacker News

Why LLMs (still) lack taste

beyondtheprior.com · Hacker News

Less-relevant results

Don't let the LLM speak, just probe it (8 minute read)

🧠AI Blog

Vibe Diaries: Training Nanochat

vibediary.dev · Hacker News

SFT & the Locus Awards

sfintranslation.com

Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information

🌐World Models

venturebeat.com · Hacker News

Compatibility-Aware Dynamic Fine-Tuning for Large Language Models

🎮RL Academic

DiffusionGemma: The Developer Guide - Google Developers Blog

💬LLMs Blog

developers.googleblog.com · r/LocalLLaMA

I built a machine that turns AI papers into interactive explainers

🎮RL Blog

GPT-2: Too Dangerous To Release (2019)

💬LLMs Blog

naokishibuya.github.io · Hacker News

MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better

💬LLMs News Blog

kaitchup.substack.com · r/LocalLLaMA

How to reduce capability degradation from off-model SFT

lesswrong.com

SLUUG Talk: Demystifying Large Language Models on Linux

🧠AI Code

github.com · DEV

Introducing North Mini Code: Cohere’s First Model For Developers

🌐World Models Blog

huggingface.co · Hacker News

Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning

🏋️Pretraining Academic
