Simplicity always wins: SOTA on swe-pro, tb2, -verif on 21 models with simple-agent
Strands-based agents and harnesses for agentic benchmarks. - strands-labs/benchmark-harnesses