Prompt Cache Break: Hit-Rate Fell 100% to 40% in 40 Lines (opens in new tab)

Discussed on DEV

In short: a prompt cache-break is when one change atop your prompt prefix — a fresh timestamp, a reordered tool block — makes the cache miss from there down, so cached reads silently re-bill as fresh input. cache_break.py hashes each prefix segment, localizes the break, and fails CI. On my fixture, one timestamp dropped the estimated cache-hit-rate 100% to 40%. AI disclosure: I drafted this with an AI writing assistant. The tool, the two fixtures, and every number below come from a real local...

Read the original article