🎯 Post-Training - touyou · Scour

Sequential Data Poisoning in LLM Post-Training

🤖LLM Inference Academic

Why LLMs (still) lack taste

🤖LLM Inference

beyondtheprior.com · Hacker News

Tracing Eval-Awareness Emergence Through Training of OLMo 3

👁️Multimodal LLMs

lesswrong.com

GPT-2: Too Dangerous To Release (2019)

👁️Multimodal LLMs Blog

naokishibuya.github.io · Hacker News

Nvidia Nemotron 3 Ultra

🤖LLM Inference

research.nvidia.com · Hacker News

Less-relevant results

DiffusionGemma: The Developer Guide - Google Developers Blog

⚙️AI Infrastructure Blog

developers.googleblog.com · r/LocalLLaMA

Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond

🔄Agentic Systems

turingpost.com

Vibe Diaries: Training Nanochat

👁️Multimodal LLMs

vibediary.dev · Hacker News

magenta/magenta-realtime: Magenta RealTime 2: An Open-Weights Live Music Model

👁️Multimodal LLMs Code

Deep Learning Weekly: Issue 458

🔄Agentic Systems

deeplearningweekly.com

Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information

🔍Retrieval-Augmented Generation

venturebeat.com · Hacker News

When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff

🤖LLM Inference Academic

Show HN: The Deterministic Core Architecture for AI-Augmented Applications

🤖LLM Inference

brandonbellsystems.com · Hacker News

How to reduce capability degradation from off-model SFT

🤖LLM Inference

lesswrong.com

SFT & the Locus Awards

🔍Retrieval-Augmented Generation

sfintranslation.com

Introducing the Third Generation of Apple’s Foundation Models

👁️Multimodal LLMs

machinelearning.apple.com · Hacker News, r/apple

Why Claude Produces High-Quality Output: A Developer’s Guide to Token Efficiency and Hallucination…

🔍Retrieval-Augmented Generation Blog

A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design

🤖LLM Inference Academic

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents

🔄Agentic Systems Blog

developer.nvidia.com · Hacker News

Posting for authoring

🔄Agentic Systems

turingpost.com
