Intelligence is not just about task completion

Andrew Marble marble.onl andrew@willows.ai January 1, 2026

I wrote recently about “Red Teaming Eliza” [1]. The premise is that if you run common AI safety evaluations on a “dumb” system that effectively just repeats your words back to you, it still gets flagged as dangerous, because the evals assume a base level of agency that isn’t there. I’m going to argue that a similar mistaken premise is at work when we try to measure the intelligence of AI systems.

Here is an example question from the ARC-AGI 2 benchmark [2], which claims that “ARC-AGI is the only AI benchmark that measures our progress towards general intelligence.”

The task is to extrapolate from the examples on top to figure out what the miss…
