reinforcement learning
On Computation and Reinforcement Learning
arxiv.org · 6d · 🧩 operations research
Recursive self-improvement from AI models
marginalrevolution.com · 1d · Discuss: Hacker News · 📊 linear programming
Show HN: A minimal online decision maker
decisionmaker.online · 18h · Discuss: Hacker News · 🧩 operations research
A Policy-Aware Agent Loop with Cedar and OpenClaw
windley.com · 14h · 📊 linear programming
ashworks1706/rlhf-from-scratch: A theoretical and practical deep dive into Reinforcement Learning with Human Feedback and its applications in Large Language Models from scratch.
github.com · 1d · Discuss: Hacker News · 🧩 operations research
Distributional Reinforcement Learning with Diffusion Bridge Critics
arxiv.org · 6d · 📊 linear programming
EyesOff: Why Some Models Quantize Better Than Others
ym2132.github.io · 8h · Discuss: Hacker News · 📊 linear programming
Adaptive Neuro-Symbolic Planning for smart agriculture microgrid orchestration in hybrid quantum-classical pipelines
dev.to · 3d · Discuss: DEV · 📊 linear programming
The "Are You Sure?" Problem: Why Your AI Keeps Changing Its Mind
randalolson.com · 12h · Discuss: Hacker News · 🧩 operations research
Benchmark & Compare the Best AI Models
arena.ai · 16h · 📊 linear programming
Show HN: Vibe Coded Math Games
eruci.com · 13h · Discuss: Hacker News · 📊 linear programming
Learning Models with Uniform Performance via Distributionally Robust Optimization
dev.to · 4d · Discuss: DEV · 📊 linear programming
Overview of end-to-end encrypted AI inference for Confer
news.ycombinator.com · 13h · Discuss: Hacker News · 📊 linear programming
Building Chess in about 350 lines of Clojure
sammystraus.com · 5h · Discuss: Hacker News · Rust
On Economics of A(S)I Agents
lesswrong.com · 4d · 🧩 operations research
Digitizing the "Shokunin": How we encoded a Master's hammer strike into AI
yusukekaizen.substack.com · 39m · Discuss: Substack · 📊 linear programming
Show HN: Implementing an AI Portfolio Manager. With Learning
quantape.substack.com · 6h · Discuss: Substack · 📊 linear programming
Schedules of Reinforcement in Psychology (Examples)
simplypsychology.org · 1d · Discuss: Hacker News · 🧩 operations research
Don't give away to the gradient descent
carteakey.dev · 7h · Discuss: Hacker News · 🧩 operations research
Tuning to Experiential Learning
sounding.com · 2d · Discuss: Hacker News · 🧩 operations research