RLHF

Reinforcement Learning from Human Feedback, Reward Modeling, Preference Learning, Alignment


SLUUG Talk: Demystifying Large Language Models on Linux

 🔄Transformers  Content type: Code
github.com · DEV
Less-relevant results

EDPB meets with EU Commissioner McGrath and adopts common data breach notification template

 post training infra
edpb.europa.eu·

Beyond the Golden Teacher: Enhancing Graph Learning through LLM-GNN Co-teaching

 post training infra  Content type: Academic
arxiv.org·

Cisco AI Defense Policy Studio: Turning Unwritten Policy into Adaptive AI Guardrails

 🤖agentic system  Content type: Blog
blogs.cisco.com·

I built a machine that turns AI papers into interactive explainers

 🎛️Fine-Tuning  Content type: Blog
blog.skz.dev·

Training LLMs to Enforce Multi-Level Instruction Hierarchies via Gravity-Weighted Direct Preference Optimization

 post training infra  Content type: Academic
arxiv.org·

Neglected Basics of AI Alignment

 post training infra
lesswrong.com·

PAWS: Preference Learning with Advantage-Weighted Segments

 🔄Transformers  Content type: Academic
arxiv.org·

A free diagnostic for the Claude Certified Architect exam

 post training infra  Content type: Discussion  Content type: Tutorial

Sequent: scale and automation for higher confidence in alignment

 post training infra
lesswrong.com·

A Unifying Lens on Reward Uncertainty in RLHF

 post training infra  Content type: Academic
arxiv.org·

Raize Orion Multi-framework GRC with anchored NIS2 reporting clocks

 post training infra
raizehq.dev · Hacker News

The EU Cloud Sovereignty Framework Sets a New Benchmark - for Everyone

 post training infra  Content type: Blog
cirran.eu · r/devops

Compatibility-Aware Dynamic Fine-Tuning for Large Language Models

 🎛️Fine-Tuning  Content type: Academic
arxiv.org·

My research agenda and work

 🔄Transformers
lesswrong.com·

Multilingual Sentiment Aware Text Summarization A Reinforcement Learning Approach for Consistency Maintenance

 post training infra  Content type: Academic
arxiv.org·

Bounding-box composition control in Ideogram 4 — what works, what breaks

 post training infra  Content type: Code

AWS Destroyed the Value Proposition for Bedrock

 post training infra  Content type: Blog
securosis.com·

The Periodic Table of LLM Reasoning: A Structured Survey of Reasoning Paradigms, Methods, and Failure Modes

 📊LLM Evaluation  Content type: Academic
arxiv.org·

Emergence of Context Characteristics Sensitivity in Large Language Models

 post training infra  Content type: Academic
arxiv.org·
