Auditing DeepSWE

Covers 13 related stories

ArXiv Is Down. Another DDoS? Related to Internet Archive?

Discussed on Hacker News

OpenAI Codex CLI: Lightweight coding agent that runs in your terminal

Discussed on Hacker News, Lobsters, and r/LocalLLaMA

deepswe.datacurve.ai

DeepSWE Benchmark

Discussed on Hacker News and r/ChatGPT

Enterprise AI: Private, Secure, Customizable

deepswe.datacurve.ai

DeepSWE: A contamination-free benchmark for long-horizon coding agents

Discussed on Hacker News, r/ClaudeAI, r/singularity, and r/vibecoding

Why SWE-bench Verified no longer measures frontier coding capabilities

Discussed on Hacker News and r/LocalLLaMA

scholar.google.com

Google Scholar

Show HN: Mini-swe-agent achieves 65% on SWE-bench in 100 lines of python

Discussed on Hacker News and r/LocalLLaMA

semanticscholar.org

Semantic Scholar - A free, AI-powered research tool for scientific literature

Discussed on Hacker News

datacurve-ai/deep-swe: Measuring frontier coding agents on original, long-horizon engineering tasks