Show HN: New Benchmark from SWE-bench team is 0% solved

programbench.com · Lobsters, Hacker News, r/singularity · Covered by the-decoder.com + 4 more

ProgramBench evaluates whether language models can rebuild programs from scratch.

Covered in 6 articles

Moonshot's open model Kimi K2.7 Code undercuts GPT-5.5 and Claude by up to 12x on price per token

the-decoder.com

Unsloth Kimi-K2.7-Code-GGUF

huggingface.co · r/LocalLLaMA

Kimi K2.7-Code: open-source coding model with better token efficiency

huggingface.co · Hacker News, r/LocalLLaMA