Reinforcement learning, Post training

Raize Orion Multi-framework GRC with anchored NIS2 reporting clocks

 🤖AI, LLM,
raizehq.dev · Hacker News

EDPB meets with EU Commissioner McGrath and adopts common data breach notification template

 🤖AI, LLM,
edpb.europa.eu·

Representation-Aware Advantage Estimation: Your Reward Model Provides More Than A Scalar Output

 🤖AI, LLM,  Content type: Academic
arxiv.org·

umair-tareen/philosopher-council: An eleven-philosopher LLM council - ask it questions or point it at AI-research trends. Claude-powered deliberation through the four classical branches of philosophy. Methodology, not metaphysics.

 🤖AI, LLM,  Content type: Code
github.com · r/SideProject

Mult-DPO: Multinomial Direct Preference Optimization for Recommender Systems

 🤖AI, LLM,  Content type: Academic
arxiv.org·

How to Train Your Goblin

 🤖AI, LLM,

Neglected Basics of AI Alignment

 🤖AI, LLM,
lesswrong.com·

Zelenskyy meets with leaders | Arkansas Democrat Gazette

 🏋Training  Content type: News
arkansasonline.com·

DriveReward: A Comprehensive Dataset and Generative Vision-Language Reward Model for Autonomous Driving

 🤖AI, LLM,  Content type: Academic
arxiv.org·

Room360: Video-to-3D Spatial Reconstruction Platform

 🏋Training  Content type: Blog
huggingface.co·

As Trump turns 80, who are the oldest – and youngest – current world leaders?

 🤖AI, LLM,
pewresearch.org·

X-VPN proves its privacy credentials with new independent no-logs audit

 🏋Training  Content type: News
techradar.com·

Multilingual Sentiment Aware Text Summarization A Reinforcement Learning Approach for Consistency Maintenance

 🤖AI, LLM,  Content type: Academic
arxiv.org·

At Netroots Nation, Progressives Divided on AI

 🤖AI, LLM,
techpolicy.press·

Would a prepaid pass for a coding agent solve a real need or is it just my itch?

 🤖AI, LLM,

DynaCF: Mitigating Shortcut Learning in Reward Models via Dynamic Counterfactual Sensitivity

 🏋Training  Content type: Academic
arxiv.org·

Mankirat47/Dao-Heart-v3.14: Dao Heart v3.14 : a bounded symbolic AI value governance research scaffold for studying value drift, oversight, warmth preservation, and identity stability under pressure.

 🤖AI, LLM,  Content type: Code
github.com · Hacker News

From pew to pitch, Cameroonian priest builds bridges in French banlieue

 🏋Training  Content type: News
rfi.fr·

The EU Cloud Sovereignty Framework Sets a New Benchmark - for Everyone

 🤖AI, LLM,  Content type: Blog
cirran.eu · r/devops

Microsoft Research's Lens proves detailed captions matter more than raw scale for training efficient image generators

 🏋Training  Content type: News
the-decoder.com·
