The Model Context Protocol (MCP) has quickly become a well-adopted standard for connecting large language model (LLM)-based AI agents with external data and tools.
As a result, many developers are stacking their environments with MCP tools. Seventy percent of MCP consumers have two to seven MCP servers configured, according to a late-2025 study by Zuplo.
But every MCP server added consumes tokens from the context window, which is already stretched thin by chat histories and configuration files, and by agents chaining multiple MCPs together into workflows.
In an isolated experiment, Mihály Kávási, founder of One Day BI, a Microsoft Power BI consultancy, calculated that this “hidden tax” accounted for 51% of a 200,000-token context window in Claude Code before any commands were run, with MCP tool definitions alone consuming over 16%.
Response sizes from MCP servers can also be unpredictable, often flooding context windows.
“There could be thousands or hundreds of thousands of tokens in a response, and there’s no way to filter that out ahead of time,” Patrick Kelly, co-founder and CEO of Sideko, an API documentation and tools maintenance platform, tells The New Stack.
To address this pain point, Anthropic, creator of MCP, recommends a code execution style in which agents write and execute code to call tools, rather than making direct tool calls. Anthropic claims this approach can deliver a 98.7% reduction in token usage.
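The savings come from where the data lives. In a minimal sketch (the tool name and the call_tool() helper below are illustrative, not any real MCP client API), a direct tool call would dump the entire payload into the model's context, while the code-execution style lets the agent's script filter the payload in a sandbox and return only a short summary:

```python
def call_tool(server: str, tool: str, **kwargs) -> list[dict]:
    """Stand-in for an MCP client call; returns a large raw payload."""
    # Simulate a response with thousands of records the model never needs to see.
    return [{"id": i, "status": "open" if i % 50 == 0 else "closed"}
            for i in range(10_000)]

# Direct tool calling: this entire payload would land in the context window.
raw = call_tool("tracker", "list_tickets")

# Code-execution style: filter inside the execution environment instead,
# so only a compact summary string re-enters the model's context.
open_tickets = [t for t in raw if t["status"] == "open"]
summary = f"{len(open_tickets)} open tickets of {len(raw)} total"
print(summary)
```

The model sees one short line instead of 10,000 records, which is the mechanism behind the large token reductions Anthropic reports.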
Cloudflare has debuted an architecturally similar feature, Code Mode, one example of the broader “code mode” approach, which the company describes as “the better way to use MCP.”
“Code mode is the best way to use tools and MCP servers when you’re building AI agents that are performing a complicated task,” adds Kelly.
To date, code mode has largely remained vendor-constrained. Port of Context, a project open sourced in December, aims to change that by offering a vendor-agnostic and LLM-agnostic implementation.
What is code mode?
Code mode represents an evolution beyond direct MCP tool calling, which can lead to unpredictable outcomes when LLMs are left to chain tool calls on their own.