Scour
🎮 RL
Specific
reinforcement learning, reward modeling, policy gradient
Scoured 178 posts in 9.7 ms
Geometrically Averaged Hard Target Updates for Linear Q-Learning
🌐 World Models · Content type: Academic · arxiv.org · 1 day ago
Less-relevant results
Don't let the LLM speak, just probe it (8 minute read)
🧠 AI · Content type: Blog · blog.j11y.io · 17 hours ago
Test Your Skills Against an AI Air Hockey Robot
🌐 World Models · Content type: News · hackster.io · 6 days ago
BMW M’s Mystery Le Mans Reveal May Preview The Electric M3
📊 ML · Content type: Blog · autoblog.com · 2 days ago
TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution
🌐 World Models · Content type: Academic · arxiv.org · 2 days ago
I built a machine that turns AI papers into interactive explainers
🎯 Post-training · Content type: Blog · blog.skz.dev · 6 days ago
Understanding your paycheck in Workday
🧠 AI · Content type: Academic · news.clemson.edu · 1 day ago
‘I don’t want my children to grow up in a broken family’: Abused husbands in S’pore who are unseen
🧩 Behavioral Economics · straitstimes.com · 4 days ago · r/singapore
Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL
🌐 World Models · Content type: Academic · arxiv.org · 13 hours ago
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
🌐 World Models · Content type: Academic · arxiv.org · 1 day ago
Hyundai’s Budget i20 Hatch Has The Lamborghini Revuelto’s Lights
💬 LLMs · Content type: News · carscoops.com · 6 days ago
Europe’s 2027 Mazda CX-30 Has A Manual And Better Headlights Than Yours
🎯 Post-training · carscoops.com · 1 day ago
SLUUG Talk: Demystifying Large Language Models on Linux
🧠 AI · Content type: Code · github.com · 4 days ago · DEV
The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model
🧠 AI · Content type: Academic · arxiv.org · 2 days ago
Inside star footballer Cristiano Ronaldo’s ₹1,000+ crore real estate empire: Mansions, penthouses, and luxury villas across the globe
🧩 Behavioral Economics · Content type: News · timesofindia.indiatimes.com · 2 days ago
My father went to war with a demon bird
💬 LLMs · Content type: News · unherd.com · 18 hours ago
Neglected Basics of AI Alignment
🧠 AI · lesswrong.com · 4 days ago
Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning
🌐 World Models · Content type: Academic · arxiv.org · 2 days ago
Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing
🌐 World Models · jack-clark.net · 3 days ago
A Group of Students Peered Into a Locked Room—and Discovered an Ancient Roman Home
🤖 AI Agents · Content type: News · popularmechanics.com · 1 day ago