Scour
🎮 RL
Specific
reinforcement learning, RLHF, reward model, policy gradient
Scoured 452 posts in 9.8 ms
Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria
🕵️
AI Agents
Content type:
Academic
arxiv.org
·
13 hours ago
Less-relevant results
Major Types of Machine Learning
👁️
VLMs
Content type:
Blog
medium.com
·
6 days ago
Microsoft Research's Lens proves detailed captions matter more than raw scale for training efficient image generators
🔓
Open-source Models
Content type:
News
the-decoder.com
·
2 days ago
Lodge School teams advance to volleyball quarter-finals
🎭
Multimodal AI
cbc.bb
·
5 days ago
Siri AI is powered by Gemini models, but is not Gemini – what does that mean?
🔓
Open-source Models
9to5mac.com
·
5 hours ago
Geometrically Averaged Hard Target Updates for Linear Q-Learning
⚡
Quantization
Content type:
Academic
arxiv.org
·
1 day ago
Comp.compilers: Paper: MileStone: A Multi-Objective Compiler Phase Ordering Framework for Graph-based IR-Level Optimization
🖥️
Inference Compute
compilers.iecc.com
·
5 days ago
Are Classical Machine Learning Jobs Dying?
💹
AI in Finance
Content type:
Blog
medium.com
·
2 days ago
Why Claude Produces High-Quality Output: A Developer’s Guide to Token Efficiency and Hallucination…
🧠
LLMs
Content type:
Blog
medium.com
·
6 days ago
Model predictive task sampling for efficient and robust adaptation
🖥️
Inference Compute
Content type:
Academic
nature.com
·
2 days ago
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
🧠
LLMs
Content type:
Academic
arxiv.org
·
1 day ago
Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap
🕵️
AI Agents
zenodo.org
·
4 days ago
·
Hacker News
Robots are closing in on human-like judgments, addressing a key challenge in physical AI
🤖
Embodied AI
techxplore.com
·
22 hours ago
Beyond Dexterity: Why Contact May Define the Next Era of Robotics
🦾
Robotics
Content type:
Video
Content type:
News
spectrum.ieee.org
·
2 days ago
·
Hacker News
Hey-Meadow/meadow-mind: Zero training, second-level reactions (~400ms). A language-rule decision mind on a local 7B diffusion LM.
🔧
Tool Use
Content type:
Code
github.com
·
23 hours ago
·
Hacker News
Mult-DPO: Multinomial Direct Preference Optimization for Recommender Systems
👁️
VLMs
Content type:
Academic
arxiv.org
·
1 day ago
Google DeepMind's Susan Zhang argues abundant AI content shifts the premium from raw intelligence to human relationships and social dynamics
🔓
Open-source Models
Content type:
News
digg.com
·
3 days ago
Weekly Research Recap
💹
AI in Finance
Content type:
News
quantseeker.com
·
1 day ago
local AI agents for Cursor with pre-tuned marketplace/commu
🕵️
AI Agents
locaible.com
·
1 day ago
·
Hacker News
I built a machine that turns AI papers into interactive explainers
🧠
LLMs
Content type:
Blog
blog.skz.dev
·
6 days ago