Could LLM alignment research reduce x-risk if the first takeover-capable AI is not an LLM?
lesswrong.com

Published on January 19, 2026 6:09 PM GMT

Many people believe that the first AI capable of taking over would be quite different from the LLMs of today. Suppose this is true: does prosaic alignment research on LLMs still reduce x-risk? I believe advances in LLM alignment research reduce x-risk even if future AIs are different; I'll call such systems "non-LLM AIs." In this post, I explore two mechanisms by which LLM alignment research can reduce x-risk:

  • Direct transfer: We can directly apply the research to non-LLM AIs—for example, reusing behavioral evaluations or retraining model organisms. As I wrote this post, I was surprised by how much research may transfer directly.
  • Indirect transfer: The LLM could be involved in training, control, an…
