Alignment Is Capability
off-policy.com

Here’s a claim that might actually be true: alignment is not a constraint on capable AI systems. Alignment is what capability is at sufficient depth.

A model that aces benchmarks but doesn’t understand human intent is simply less capable. Virtually every task we give an LLM is steeped in human values, culture, and assumptions. Miss those, and the model isn’t maximally useful. And a system that isn’t maximally useful is, by definition, not AGI.

OpenAI and Anthropic have been running this experiment for two years. The results are coming in.


The Experiment

Anthropic and OpenAI have taken different approaches to the relationship between alignment and capability work.

Anthropic’s approach: Alignment researchers are embedded in capability work. There’s no clear split.

From Jan L…
