🎮 Reinforcement Learning
Specific
RL, reward functions, policy gradient, RLHF
Scoured 388 posts in 7.3 ms
Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap
🧠
AI Agents
zenodo.org
·
4d
4 days ago
·
Hacker News
Less-relevant results
Propel: Breaking the Solver Bottleneck in Task-Generator RL
💬
LLMs
vmax.ai
·
16h
16 hours ago
·
Hacker News
Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems
⚡
Concurrency
Content type:
Academic
arxiv.org
·
10h
10 hours ago
Beyond the Buzzwords: The Definitive Guide to Navigating the AI vs. Machine Learning Divide
🤖
Machine Learning
Content type:
News
Content type:
Blog
aiacademy01.blogspot.com
·
5d
5 days ago
Reinforcement-learning signals support dynamic adaptive control during language switching
🏗️
AI Infrastructure
Content type:
Academic
nature.com
·
1d
1 day ago
Social Intelligence Arises Between Minds
🧠
AI Agents
psychologytoday.com
·
3d
3 days ago
Cohere open-sources a coding agent that runs on a single H100
🤖
AI
venturebeat.com
·
1d
1 day ago
Robots are closing in on human-like judgments, addressing a key challenge in physical AI
🧠
AI Agents
techxplore.com
·
20h
20 hours ago
Microsoft just shared the frontier data engineering secrets
🤖
AI
mail.bycloud.ai
·
1d
1 day ago
How to Train Your Goblin
🤖
AI
goblins.mchen.workers.dev
·
4d
4 days ago
·
Hacker News
Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria
🧠
AI Agents
Content type:
Academic
arxiv.org
·
10h
10 hours ago
Some Interesting Papers on RLVR
💬
LLMs
lesswrong.com
·
1d
1 day ago
Breaking free of a single datacenter: Practical geo-distributed AI operations with the k0smos platforms
🏗️
AI Infrastructure
Content type:
Blog
cncf.io
·
3d
3 days ago
[NEW MODEL] SupraLabs just released Supra1.5-50M Base (Experimental)!
🤖
AI
huggingface.co
·
2h
2 hours ago
·
r/LocalLLaMA
Core Automation co-founder Jerry Tworek jokes that Nvidia's CUDA translates to miracles in Polish
💻
Tech Industry
digg.com
·
6d
6 days ago
Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL
🧠
AI Agents
Content type:
Academic
arxiv.org
·
10h
10 hours ago
Major Types of Machine Learning
🤖
Machine Learning
Content type:
Blog
medium.com
·
6d
6 days ago
Test Your Skills Against an AI Air Hockey Robot
🧠
AI Agents
Content type:
News
hackster.io
·
6d
6 days ago
Weekly Research Recap
🏗️
AI Infrastructure
Content type:
News
quantseeker.com
·
1d
1 day ago
Verifiable Environments Are LEGO Bricks: Recursive Composition for Reasoning Generalization
🏗️
AI Infrastructure
Content type:
Academic
arxiv.org
·
10h
10 hours ago