Reinforcement Learning from Human Feedback

Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond

 🎮RL
turingpost.com·

Variational Proximal Policy Optimization

 🎮RL  Content type: Academic
arxiv.org·

Less-relevant results

APOSM: Pairwise preference learning improves generative small-molecule design

 🧠Deep Learning  Content type: Academic
biorxiv.org·

Fine tuning classification in Elixir

 🧠Deep Learning
elixirstatus.com·

(Mis)generalization of Helpful-Only Fine-tuning

 🎮RL
lesswrong.com·

A new chapter of efficient foundation models for medical imaging

 ⚙️ML Systems

Evolution of crystal field and intra-ionic interactions in ilmenite $A\mathrm{IrO}_3$ ($A$ = Mg, Zn, Cd) and hyperhoneycomb $\beta$-ZnIrO...

 🌐Distributed Systems
link.aps.org·

magenta/magenta-realtime: Magenta RealTime 2: An Open-Weights Live Music Model

 🤖ML  Content type: Code
github.com·

The Enormous Potential For Microsoft Frontier Fine Tuning

 Performance
joshbersin.com·

Why LLMs (still) lack taste

 🎮RL

Mult-DPO: Multinomial Direct Preference Optimization for Recommender Systems

 🎮RL  Content type: Academic
arxiv.org·

Nvidia Nemotron 3 Ultra

 🎮RL

Med Tech Gurus: Why Most Radiology AI Fails

 🧠Deep Learning  Content type: Audio
med-tech-gurus.libsyn.com·

Vibe Diaries: Training Nanochat

 🤖ML
vibediary.dev · Hacker News

Domain-Specific Small Language Models (Manning)

 🤖ML
i-programmer.info·

Ideogram 4.0 launches with 2K resolution and top open-weight ranking

 ⚙️ML Systems
alternativeto.net·

SecLoRA: Secure Aggregation of Low-Rank Matrix Products via Functional Encryption

 🌐Distributed Systems
eprint.iacr.org·

Adaptive Spo11 RNA editing gate optimizes meiosis I pace and mitotic proliferation while preserving ascospore formation

 🛠️Systems Programming
science.org·

New comment by perturbation in "Ask HN: Who wants to be hired? (June 2026)"

 🤖ML

GPT-2: Too Dangerous To Release (2019)

 🧠Deep Learning  Content type: Blog
