Your AI Bill Isn't a Model Problem. It's an Architecture Problem. (opens in new tab)
If your LLM costs are climbing, the instinct is almost always the same: swap to a cheaper model. GPT-4 to GPT-4-mini. Claude Opus to Claude Haiku. Sometimes that helps a little. It rarely fixes the actual problem. The actual problem, in most workflows I've looked at, is that every step gets routed through the LLM, even the steps that don't need language reasoning at all. This post breaks down a simple mental model for deciding what should and shouldn't touch an LLM, with a working example you...
Read the original article