Scour
Reinforcement learning, Post training
Scoured 102 posts in 6.3 ms
Raize Orion Multi-framework GRC with anchored NIS2 reporting clocks
🤖 AI, LLM · raizehq.dev · 5 days ago · Hacker News
EDPB meets with EU Commissioner McGrath and adopts common data breach notification template
🤖 AI, LLM · edpb.europa.eu · 1 day ago
Representation-Aware Advantage Estimation: Your Reward Model Provides More Than A Scalar Output
🤖 AI, LLM · Content type: Academic · arxiv.org · 1 day ago
umair-tareen/philosopher-council: An eleven-philosopher LLM council - ask it questions or point it at AI-research trends. Claude-powered deliberation through the four classical branches of philosophy. Methodology, not metaphysics.
🤖 AI, LLM · Content type: Code · github.com · 6 days ago · r/SideProject
Mult-DPO: Multinomial Direct Preference Optimization for Recommender Systems
🤖 AI, LLM · Content type: Academic · arxiv.org · 1 day ago
How to Train Your Goblin
🤖 AI, LLM · goblins.mchen.workers.dev · 4 days ago · Hacker News
Neglected Basics of AI Alignment
🤖 AI, LLM · lesswrong.com · 4 days ago
Zelenskyy meets with leaders | Arkansas Democrat Gazette
🏋 Training · Content type: News · arkansasonline.com · 1 day ago
DriveReward: A Comprehensive Dataset and Generative Vision-Language Reward Model for Autonomous Driving
🤖 AI, LLM · Content type: Academic · arxiv.org · 2 days ago
Room360: Video-to-3D Spatial Reconstruction Platform
🏋 Training · Content type: Blog · huggingface.co · 4 days ago
As Trump turns 80, who are the oldest – and youngest – current world leaders?
🤖 AI, LLM · pewresearch.org · 2 days ago
X-VPN proves its privacy credentials with new independent no-logs audit
🏋 Training · Content type: News · techradar.com · 3 days ago
Multilingual Sentiment Aware Text Summarization A Reinforcement Learning Approach for Consistency Maintenance
🤖 AI, LLM · Content type: Academic · arxiv.org · 2 days ago
At Netroots Nation, Progressives Divided on AI
🤖 AI, LLM · techpolicy.press · 1 day ago
Would a prepaid pass for a coding agent solve a real need or is it just my itch?
🤖 AI, LLM · codehamr.com · 6 days ago · r/SideProject
DynaCF: Mitigating Shortcut Learning in Reward Models via Dynamic Counterfactual Sensitivity
🏋 Training · Content type: Academic · arxiv.org · 2 days ago
Mankirat47/Dao-Heart-v3.14: Dao Heart v3.14: a bounded symbolic AI value governance research scaffold for studying value drift, oversight, warmth preservation, and identity stability under pressure.
🤖 AI, LLM · Content type: Code · github.com · 1 day ago · Hacker News
From pew to pitch, Cameroonian priest builds bridges in French banlieue
🏋 Training · Content type: News · rfi.fr · 4 days ago
The EU Cloud Sovereignty Framework Sets a New Benchmark - for Everyone
🤖 AI, LLM · Content type: Blog · cirran.eu · 2 days ago · r/devops
Microsoft Research's Lens proves detailed captions matter more than raw scale for training efficient image generators
🏋 Training · Content type: News · the-decoder.com · 3 days ago