🎮 Reinforcement Learning
Specific
RL, reward function, policy gradient, Q-learning, OpenAI Gym
Scoured 186592 posts in 19.0 ms
Here's my AI time management system — copy and paste this into Claude
🤖 AI · chrisbailey.com · 13h
Autonomous payments between Agents using L402? [video]
🤖 AI · youtube.com · 53m · Hacker News
Your Old-School Process Skills Are a Superpower for Building AI Agents
🤖 AI · asianefficiency.com · 4h
Three principles for AI Agent Configuration
🤖 AI · ministryoftesting.com · 2d
RL, in pictures and videos
🤖 AI · suriya.cc · 6d
Inside Claude Code, OpenAI Codex, and HuggingFace's ML Engineer Agent
🧠 LLMs · newsletter.artofsaience.com · 11h
New Content From Current Directions in Psychological Science
🧠 LLMs · psychologicalscience.org · 10h
Effective Personalized AI Tutors via LLM-Guided Reinforcement Learning by Angel Tsai-Hsuan Chung, Botong Zhang, Ling-Chieh Kung, Hamsa Bastani, Osbert Bastani :...
🧠 LLMs · papers.ssrn.com · 1d
How long is your loop?
🤖 AI · webdirections.org · 52m
caiovicentino1/qwen36-27b-sae-papergrade
🧠 LLMs · huggingface.co · 4h · Hacker News
End of black box AI? Scientists develop blueprint for transparent system that reveals how it learns and makes decisions
🤖 AI · techxplore.com · 7h
Unlocking human ambition to drive business growth with AI
🤖 AI · blogs.microsoft.com · 2d
Reddit as a Reinforcement Learning Gym for Persuasion
🧠 LLMs · thediff.co · 6d
Every Model Learned by Gradient Descent Is Approximately a Kernel Machine
🧠 LLMs · news.ycombinator.com · 49m · Hacker News
Alibaba's Metis agent cuts redundant AI tool calls from 98% to 2% — and gets more accurate doing it
🤖 AI · venturebeat.com · 4h
Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning
🧠 LLMs · arxiv.org · 21h
context-labs/HALO: Hierarchal Agent Loop Optimizer
🧠 LLMs · github.com · 1d · Hacker News
Adaptive home energy management to self-motivated user preferences via iterative LLM-augmented reinforcement learning
🧠 LLMs · sciencedirect.com · 5d
Long-running Agents
🤖 AI · addyo.substack.com · 10h · Substack
AI Dementia—Why Your Agent Gets Progressively Dumber As You Talk To It
🤖 AI · weightythoughts.com · 1d