🎮 Reinforcement Learning
RL, AI Agents, Game Playing, Policy Optimization
What is MBPO? A Beginner’s Guide to Efficient Reinforcement Learning
🔗 MCP · Content type: Blog · ujangriswanto08.medium.com · 5 days ago
Why LLMs (still) lack taste
🤖 LLM · beyondtheprior.com · 2 days ago · Hacker News
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
🎛️ Fine-tuning · Content type: Academic · arxiv.org · 22 hours ago
Nvidia Nemotron 3 Ultra
🎛️ Fine-tuning · research.nvidia.com · 6 days ago · Hacker News
Value representation in youth psychopathology: evidence of a transdiagnostic risk mechanism for psychosis
📖 Narratology · Content type: Academic · nature.com · 1 day ago
Google DeepMind's Susan Zhang argues abundant AI content shifts the premium from raw intelligence to human relationships and social dynamics
🤖 LLM · Content type: News · digg.com · 3 days ago
Microsoft just shared the frontier data engineering secrets
🎛️ Fine-tuning · mail.bycloud.ai · 1 day ago
Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap
🤝 AI Agents · zenodo.org · 4 days ago · Hacker News
Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
🎛️ Fine-tuning · Content type: Academic · arxiv.org · 22 hours ago
How AI chatbots become better learning coaches
🤖 LLM · techxplore.com · 1 hour ago
A wild idea: Abstract reality using ontology
✍️ Prompt Engineering · Content type: Discussion · news.ycombinator.com · 4 days ago · Hacker News
Social intelligence Arises Between Minds
🤖 Agentic AI · psychologytoday.com · 3 days ago
Are Classical Machine Learning Jobs Dying?
🧠 Machine Learning · Content type: Blog · medium.com · 2 days ago
SocraticPO: Policy Optimization via Interactive Guidance
🤖 Agentic AI · Content type: Academic · arxiv.org · 22 hours ago
AI Innovations: The New Frontier of Decision-Making and Security
🧠 Machine Learning · Content type: Blog · medium.com · 2 days ago
See, Act, Correct: three levers for working with a code agent
🤝 AI Agents · Content type: Blog · blog.owulveryck.info · 6 days ago · Hacker News
Breaking free of a single datacenter: Practical geo-distributed AI operations with the k0smos platforms
🧠 Deep Learning · Content type: Blog · cncf.io · 2 days ago
Bridging Multi-Vector and Learned-Sparse Retrieval, A Diagnostic Framework for Robust Semantic IDs, and More!
🔥 PyTorch · Content type: News, Blog · recsys.substack.com · 5 days ago · Substack
Model predictive task sampling for efficient and robust adaptation
🎛️ Fine-tuning · Content type: Academic · nature.com · 2 days ago
Experts weigh in on Anthropic’s Fable 5, Mythos 5 releases
🤝 AI Agents · sdtimes.com · 1 day ago