datacurve-ai/deep-swe: Measuring frontier coding agents on original, long-horizon engineering tasks