AI Agent Benchmarks Are Broken
ddkang.substack.com · 4h
Seeing Like an LLM
strangeloopcanon.com · 1d
Why, Why, Why, Eliza?
learningfromexamples.com · 22h