shrink ray
Tiny, fast model hits coding scores similar to GPT-5 and Sonnet 4.
On Wednesday, Anthropic released Claude Haiku 4.5, a small AI language model that reportedly delivers performance similar to what its frontier model Claude Sonnet 4 achieved five months ago but at one-third the cost and more than twice the speed. The new model is available now to all Claude app, web, and API users.
If the benchmarks for Haiku 4.5 reported by Anthropic hold up to independent testing, the fact that the company can match some capabilities of its cutting-edge coding model from only five months ago (and GPT-5 in coding) while providing a dramatic speed increase and cost cut is notable.
As a recap, Anthropic ships the Claude family in three model sizes: Haiku (small), Sonnet (medium), and Opus (large). The larger models are based on larger neural networks and typically include deeper contextual knowledge but are slower and more expensive to run. Thanks to a technique called distillation, companies like Anthropic have been able to craft smaller AI models that match the capabilities of larger, older models at functional tasks like coding, although that typically comes at the cost of omitting stored knowledge.
Claude Haiku 4.5 benchmark results from Anthropic.
That means if you wanted to converse with an AI model that might craft a deeper and more meaningful analysis of, say, foreign policy or world history, you might be better served talking to Sonnet or Opus (being aware that they can also be wrong and make things up). But if you just need quick coding assistance that’s more about translation of concepts than general knowledge, Haiku might be the better pick due to its speed and lower cost.
And speaking of cost, Haiku 4.5 is included for subscribers of Claude web and app plans. Through the API (for developers), the small model is priced at $1 per million input tokens and $5 per million output tokens. That compares to Sonnet 4.5 at $3 per million input tokens and $15 per million output tokens, and Opus 4.1 at $15 per million input tokens and a whopping $75 per million output tokens.
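To make those per-token prices concrete, here is a minimal Python sketch that estimates what the same job would cost on each model. The prices are the ones quoted above; the workload (2 million input tokens and 500,000 output tokens) and the `request_cost` helper are illustrative assumptions, not Anthropic tooling.

```python
# API prices in USD per million tokens, as quoted in the article:
# (input price, output price)
PRICES = {
    "Haiku 4.5": (1.00, 5.00),
    "Sonnet 4.5": (3.00, 15.00),
    "Opus 4.1": (15.00, 75.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload chosen only for illustration.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

for model in PRICES:
    cost = request_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.2f}")
```

Because both the input and output prices scale by the same factor, the ratio holds for any mix of tokens: at these list prices, Sonnet 4.5 costs three times as much as Haiku 4.5, and Opus 4.1 costs 15 times as much.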
The model serves as a cheaper drop-in replacement for two older models, Haiku 3.5 and Sonnet 4. “Users who rely on AI for real-time, low-latency tasks like chat assistants, customer service agents, or pair programming will appreciate Haiku 4.5’s combination of high intelligence and remarkable speed,” Anthropic writes.
Claude Haiku 4.5 answers the classic Ars Technica AI question, “Would the color be called ‘magenta’ if the town of Magenta didn’t exist?”
On SWE-bench Verified, a test that measures performance on coding tasks, Haiku 4.5 scored 73.3 percent compared to Sonnet 4’s similar performance level (72.7 percent). The model also reportedly surpasses Sonnet 4 at certain tasks like using computers, according to Anthropic’s benchmarks. Claude Sonnet 4.5, released in late September, remains Anthropic’s frontier model and what the company calls “the best coding model available.”
Haiku 4.5 also comes surprisingly close to what OpenAI’s GPT-5 can achieve in this particular set of benchmarks (as seen in the chart above), although since the results are self-reported and potentially cherry-picked to match a model’s strengths, one should always take them with a grain of salt.
Still, making a small, capable coding model may have unexpected advantages for agentic coding setups like Claude Code. Anthropic designed Haiku 4.5 to work alongside Sonnet 4.5 in multi-model workflows. In such a configuration, Anthropic says, Sonnet 4.5 could break down complex problems into multi-step plans, then coordinate multiple Haiku 4.5 instances to complete subtasks in parallel, like spinning off workers to get things done faster.
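As a rough illustration of that planner-and-workers pattern, the Python sketch below uses `asyncio` to fan subtasks out in parallel. It makes no real API calls: `plan_with_large_model` and `run_subtask_with_small_model` are hypothetical stand-ins for calls to Sonnet 4.5 and Haiku 4.5, and the structure is an assumption about how such a workflow could be wired up, not a description of how Claude Code works internally.

```python
import asyncio


async def plan_with_large_model(problem: str) -> list[str]:
    """Hypothetical stand-in for a planning call to the larger model."""
    await asyncio.sleep(0)  # placeholder for network latency
    return [f"{problem}: step {i}" for i in range(1, 4)]


async def run_subtask_with_small_model(subtask: str) -> str:
    """Hypothetical stand-in for a worker call to the smaller, faster model."""
    await asyncio.sleep(0)  # placeholder for network latency
    return f"done({subtask})"


async def orchestrate(problem: str) -> list[str]:
    # 1. The larger model breaks the problem into a multi-step plan.
    subtasks = await plan_with_large_model(problem)
    # 2. Each subtask goes to its own small-model worker; gather() runs
    #    the workers concurrently and returns results in subtask order.
    workers = (run_subtask_with_small_model(s) for s in subtasks)
    return await asyncio.gather(*workers)


results = asyncio.run(orchestrate("refactor module"))
for line in results:
    print(line)
```

In a real setup, the two stub functions would wrap actual model requests, and the appeal of a cheap, fast worker model is that the fan-out in step 2 is where most of the tokens get spent.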
For more details on the new model, Anthropic released a system card and documentation for developers.
Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.