🎮 Reinforcement Learning
Specific
RL, reward functions, policy gradient, RLHF
Target Policy Optimization
♟️
Game Theory
arxiv.org
·
2d
Continual learning for AI agents
🤖
AI
bestblogs.dev
·
4d
Trustworthy agents in practice
♟️
Game Theory
anthropic.com
·
23h
Model organisms researchers should check whether high LRs defeat their model organisms
🤖
AI
lesswrong.com
·
16h
Rethinking Robotics Reinforcement Learning: A Practical Humanoid Training Workflow
🤖
AI
semiengineering.com
·
1d
Markov Decision Processes: The Language of Reinforcement Learning
♟️
Game Theory
medium.com
·
5d
How HN: We were wrong about AI capability floors (and why smart triggers matter)
🤖
AI
zenodo.org
·
14h
·
Hacker News
Formalizing the "generative crash" via inverse reinforcement learning
♟️
Game Theory
news.ycombinator.com
·
2d
·
Hacker News
Show HN: Agent Tuning, using recursion to achieve predictable agent output
💻
Computer Science
github.com
·
18h
·
Hacker News
The Dark Factory Harness: Turning Autonomous Hill-Climbing into Autonomous Research
🤖
AI
sotaverified.org
·
2d
·
Hacker News
Reinforcement Learning From Human Feedback (RLHF) in Large Language Models (LLMs)
♟️
Game Theory
pub.towardsai.net
·
6d
Hyperparameter optimization impact and tuning guidelines for decentralized multi-agent reinforcement learning in multi-energy neighborhoods
♟️
Game Theory
sciencedirect.com
·
2d
Three Ways Machines Learn
🤖
AI
medium.com
·
3d
Neural circuits encode prior knowledge of temporal statistics
⚛️
Quantum Computing
nature.com
·
2d
Autonomous Rocket Landing with Reinforcement Learning (YouTube)
🤖
AI
youtube.com
·
1d
·
Hacker News
Google DeepMind's Research Lets an LLM Rewrite Its Own Game Theory Algorithms — And It Outperformed the Experts
♟️
Game Theory
marktechpost.com
·
6d
·
r/singularity
Better Harness: A Recipe for Harness Hill-Climbing with Evals
🤖
AI
blog.langchain.com
·
1d
ALTK‑Evolve: On‑the‑Job Learning for AI Agents
🤖
AI
huggingface.co
·
2d
·
Hacker News
Tavily
🤖
AI
tavily.com
·
5d
My AI Learning Journey – Part 4
🤖
AI
blog.wirelessmoves.com
·
2d