Tool output compression for agents - 60-70% token reduction on tool-heavy workloads (open source, works with local models)
The Context Optimization Layer for LLM Applications - chopratejas/headroom