RT by @AravSrinivas:
GLM is pretty solid: it gets about an 80% pass rate on our internal financial benchmark. By contrast, DeepSeek v4, Kimi and MiniMax are below 5%. (Opus 4.8 was used as the baseline and the judge.)