Scour
🎮 Reinforcement Learning
RLHF, Policy Gradient, Reward Models, Agent Training
Scoured 171082 posts in 39.2 ms
Introduction to Reinforcement Learning Agents with the Unity Game Engine
🎯 RLHF · towardsdatascience.com · 3d
Catching up with RL: TUD Lecture on RL #1
🎯 RLHF · sebiwette.de · 1d
A problem with perfectly rational agents and decision theory
♟️ Game Theory · alexanderpruss.blogspot.com · 13h
How AI Actually Learns to Be Helpful: The Math Behind RLHF and DPO That Nobody Shows You
🤖 AI Tools · pub.towardsai.net · 2d
Rethinking Robotics Reinforcement Learning: A Practical Humanoid Training Workflow
🎯 RLHF · semiengineering.com · 5d
The Advisor Strategy: Making Cost-Quality Tradeoff in AI Agents
🎯 Agentic AI Red Teaming · medium.com · 2d
Artificial Intelligence Paradigms — From Symbolic AI to Deep Learning and Agent-Based Intelligence
🕵️ AI Agents · medium.com · 2d
Multi-agent RL tooling
🕵️ AI Agents · danmackinlay.name · 5d
How to build effective reward functions with AWS Lambda for Amazon Nova model customization
⚡ Serverless · aws.amazon.com · 1d
ALTK‑Evolve: On‑the‑Job Learning for AI Agents
💾 Agent Memory · huggingface.co · 6d · Hacker News
When Should AI Step Aside?: Teaching Agents When Humans Want to Intervene
🎼 Agent Orchestration · blog.ml.cmu.edu · 1d
Show HN: HyperFlow – A self-improving agent framework built on LangGraph
🕸️ LangGraph · news.ycombinator.com · 4d · Hacker News
Trustworthy agents in practice
📋 AGENTS.md · anthropic.com · 5d
Reinforcement Learning / Q Learning Basics with Tic Tac Toe
♟️ Game Theory · github.com · 3d · DEV
Show HN: We were wrong about AI capability floors (and why smart triggers matter)
✍️ Prompt Engineering · zenodo.org · 5d · Hacker News
How Cursor Trains Agentic Models with RL
🧠 Context Engineering · pub.towardsai.net · 2d
anakin87/llm-rl-environments-lil-course: 🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models
🧠 Context Engineering · github.com · 3d · Hacker News
Model organisms researchers should check whether high LRs defeat their model organisms
✍️ Prompt Engineering · lesswrong.com · 5d
The Beginner’s Guide to AI Agents: What They Are, How They Work, and Where to Start
🕵️ AI Agents · medium.com · 4d
Better Harness: A Recipe for Harness Hill-Climbing with Evals
🛡️ AI Safety Evals · blog.langchain.com · 6d