thezvi.substack.com

GLM-5.2 Is the New Best Open Model

Discussed on Substack

Covers 5 related stories

Google's latest Gemini-exp-1206 seems to be great, near the top of livebench

Discussed on Hacker News and r/LocalLLaMA

How is it that Google's Gemini Pro 2.0 Experimental 02-05 Tops the LLM Arena Charts, but seems to perform badly in real world testing?

Discussed on r/LocalLLaMA and r/programming

posttrainbench.com

PostTrainBench: Measuring how well AI agents can post-train language models

Discussed on Hacker News

Public Enterprise LLM Benchmarks

Llama-4 fails at long context writing

Discussed on Hacker News and r/LocalLLaMA