Scour
🎯 Post-training
Specific: fine-tuning, RLHF, instruction tuning, alignment
Scoured 154 posts in 9.4 ms
Emergence of Context Characteristics Sensitivity in Large Language Models
🌐 World Models · Content type: Academic · arxiv.org · 2 days ago
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🎮 RL · turingpost.com · 4 days ago
[NEW MODEL] SupraLabs just released Supra1.5-50M Base (Experimental)!
🏋️ Pretraining · huggingface.co · 3 hours ago · r/LocalLLaMA
Tracing Eval-Awareness Emergence Through Training of OLMo 3
🏋️ Pretraining · lesswrong.com · 1 day ago
The week AI infrastructure crossed from a technology story to a financial one
💬 LLMs · Content type: News · mlwhiz.com · 16 hours ago
KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++
📊 ML · Content type: Code · github.com · 4 days ago · Hacker News
Why LLMs (still) lack taste
💬 LLMs · beyondtheprior.com · 2 days ago · Hacker News
Less-relevant results
Don't let the LLM speak, just probe it (8 minute read)
🧠 AI · Content type: Blog · blog.j11y.io · 16 hours ago
Vibe Diaries: Training Nanochat
📊 ML · vibediary.dev · 2 days ago · Hacker News
SFT & the Locus Awards
💬 LLMs · sfintranslation.com · 6 days ago
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
🌐 World Models · venturebeat.com · 2 days ago · Hacker News
Compatibility-Aware Dynamic Fine-Tuning for Large Language Models
🎮 RL · Content type: Academic · arxiv.org · 12 hours ago
DiffusionGemma: The Developer Guide- Google Developers Blog
💬 LLMs · Content type: Blog · developers.googleblog.com · 1 day ago · r/LocalLLaMA
I built a machine that turns AI papers into interactive explainers
🎮 RL · Content type: Blog · blog.skz.dev · 6 days ago
GPT-2: Too Dangerous To Release (2019)
💬 LLMs · Content type: Blog · naokishibuya.github.io · 1 day ago · Hacker News
MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better
💬 LLMs · Content type: News, Blog · kaitchup.substack.com · 5 days ago · r/LocalLLaMA
How to reduce capability degradation from off-model SFT
💬 LLMs · lesswrong.com · 2 days ago
SLUUG Talk: Demystifying Large Language Models on Linux
🧠 AI · Content type: Code · github.com · 4 days ago · DEV
Introducing North Mini Code: Cohere’s First Model For Developers
🌐 World Models · Content type: Blog · huggingface.co · 2 days ago · Hacker News
Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning
🏋️ Pretraining · Content type: Academic · arxiv.org · 12 hours ago