programbench.com

Show HN: New Benchmark from SWE-bench team is 0% solved

Covered by 7 sources including The Decoder, huggingface.co

Discussed on Hacker News, Lobsters, and r/singularity

Covered in 8 articles

Moonshot's open model Kimi K2.7 Code undercuts GPT-5.5 and Claude by up to 12x on price per token

huggingface.co·

Unsloth Kimi-K2.7-Code-GGUF

Discussed on r/LocalLLaMA

huggingface.co·

Kimi K2.7-Code: open-source coding model with better token efficiency

Discussed on Hacker News and r/LocalLLaMA

newsletter.techworld-with-milan.com·

The Trends #11: AI agents still can't build software from scratch

Latest open artifacts (#21): Open model bonanza! Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, GLM-5.1 & others. On CAISI's V4 assessment.

techstackups.com·

GLM 5.2 vs. Opus

Discussed on Hacker News

In other languages

Kimi K2.7-Code: open-source coding model with improved token efficiency (Korean)

New LLM coding benchmark ProgramBench: 9 top models, 200 tasks, 248 thousand tests. Fully solved tasks (Russian)