
Show HN: Mini-swe-agent achieves 65% on SWE-bench in 100 lines of python

Covered by 6 sources including DEV Community, Hugging Face. Discussed on Hacker News and r/LocalLLaMA

Covered in 7 articles

DEV Community

I Built an AI Issue Triage Bot in 500 Lines of TypeScript

Discussed on DEV

Introducing North Mini Code: Cohere’s First Model For Developers

Discussed on Hacker News

Auditing DeepSWE

Discussed on Hacker News

ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet)

deepswe.datacurve.ai

DeepSWE: A contamination-free benchmark for long-horizon coding agents

Discussed on Hacker News, r/ClaudeAI, r/singularity, and r/vibecoding

deepswe.datacurve.ai

DeepSWE Benchmark

Discussed on Hacker News and r/ChatGPT

In other languages

New DeepSWE benchmark: GPT-5.5 at 70%, Opus 4.7