🎯 Reinforcement Learning
Q-learning, Policy Gradient, Reward Functions, TD Learning
Scoured 113136 posts in 319.4 ms
check out this article on Reinforcement Learning with R: Origins, Real-Life Applications, and Practical Implementation
dev.to · 2d · Discuss: DEV
💬 Prompt Engineering
Optimistic Training and Convergence of Q-Learning -- Extended Version
arxiv.org · 3d
💬 Prompt Engineering
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
arxiv.org · 1d
💬 Prompt Engineering
Optimizing post-disaster road restoration with reinforcement learning: A traveler-behavior-aware approach
sciencedirect.com · 10h
💬 Prompt Engineering
Show HN: Fighting the War Against Expensive Reinforcement Learning
cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app · 19h · Discuss: Hacker News
💬 Prompt Engineering
BetaZero V2: A Diffusion Model for Setting Boulder Problems
evmojo37.substack.com · 3h · Discuss: Substack
🗣️ LLMs
A Conceptual Framework for Exploration Hacking
lesswrong.com · 10h
💬 Prompt Engineering
Feedback Control for Computer Systems
janert.org · 19h
🐚 Shell Scripting
How to Leverage Explainable AI for Better Business Decisions
towardsdatascience.com · 12h
💬 Prompt Engineering
A training principle for drifting models
breno.bearblog.dev · 15h
🧠 Machine Learning
A multi-agent reinforcement learning approach to autonomous aircraft taxiing with taxiing time, fuel consumption, and emission optimization
sciencedirect.com · 1d
💬 Prompt Engineering
Show HN: A minimal online decision maker
decisionmaker.online · 1d · Discuss: Hacker News
🧠 Cognitive Science
Optimal timing for superintelligence
feeds.feedblitz.com · 2h
💬 Prompt Engineering
Researchers propose a self-distillation fix for ‘catastrophic forgetting’ in LLMs
infoworld.com · 16h
💬 Prompt Engineering
The 4 Mixture of Experts Architectures: How to Train 100B Models at 10B Cost
pub.towardsai.net · 14h
💬 Prompt Engineering
v6 (Code 2 here) — Most complete architecture. This version is faster than my old v5, statistically correct, has all the advanced psychology/network features, and produces stunning visualizations
gist.github.com · 8h · Discuss: r/C_Programming
🧠 Cognitive Science
Gibbs Measures from Deep Shaped Multilayer Perceptrons
link.aps.org · 14h
🗣️ LLMs
Hybrid neural–cognitive models reveal how memory shapes human reward learning
nature.com · 5d
🧠 Cognitive Science
In defense of wasting time
fastcompany.com · 7h
📵 Digital Minimalism
A “Toolbox” Pipeline for Robots That See, Read, and Act
hackernoon.com · 2h
💬 Prompt Engineering