MCP-CLI is an experimental approach to MCP tool calling that dramatically reduces token consumption in Claude Code

Summary

Claude Code loads all tool definitions upfront at session start, which consumes significant context tokens, especially for users with multiple MCP servers, plugins, and agents configured. Anthropic has released beta features specifically designed to address this: Tool Search Tool and Programmatic Tool Calling.

These are documented at: https://www.anthropic.com/engineering/advanced-tool-use

Feature Request

Add support for the following API betas in Claude Code:

1. Tool Search Tool (`tool-search-2025-04-15`)

Allow tools to be marked with `defer_loading: true` so they remain discoverable without consuming context tokens at session start. Claude would discover relevant tools on-demand via a search mechanism.

Reported benefits:

- 85% reduction in token usage while maintaining full tool access
- Significant accuracy improvements (Opus 4: 49% → 74%, Opus 4.5: 79.5% → 88.1%)
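
To make the request concrete, here is a rough sketch of what deferred discovery could look like. It is a toy model, not how Claude Code or the API implements tool search: `ToolDef`, `ToolRegistry`, the keyword scoring, and the sample tool names are all hypothetical. Only the `defer_loading` flag name comes from the feature description above.

```python
from dataclasses import dataclass, field


@dataclass
class ToolDef:
    name: str
    description: str
    input_schema: dict = field(default_factory=dict)
    defer_loading: bool = False  # flag name from the request; semantics here are assumed


class ToolRegistry:
    """Hypothetical registry: eager tools start in context, deferred ones load on a search hit."""

    def __init__(self, tools):
        self._tools = {t.name: t for t in tools}
        # Only tools without defer_loading count against context at session start.
        self._loaded = {t.name for t in tools if not t.defer_loading}

    def in_context(self):
        return sorted(self._loaded)

    def search(self, query, limit=3):
        """Naive keyword match over name and description; matching tools get loaded."""
        terms = query.lower().split()
        scored = []
        for tool in self._tools.values():
            haystack = f"{tool.name} {tool.description}".lower()
            score = sum(term in haystack for term in terms)
            if score:
                scored.append((score, tool.name))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        hits = [name for _, name in scored[:limit]]
        self._loaded.update(hits)
        return hits


registry = ToolRegistry([
    ToolDef("read_file", "Read a file from the local filesystem"),
    ToolDef("github_create_issue", "Create an issue in a GitHub repository", defer_loading=True),
    ToolDef("browser_screenshot", "Take a screenshot of a web page", defer_loading=True),
])

print(registry.in_context())             # only the eager tool
print(registry.search("github issue"))   # deferred tool found on demand
print(registry.in_context())             # the hit is now loaded as well
```

The point of the sketch is only that the two deferred definitions cost nothing until a search surfaces them; a real implementation would presumably use a better ranking than substring matching.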

2. Programmatic Tool Calling (`programmatic-tool-use-2025-04-15`)

Allow Claude to orchestrate multiple tools through code execution rather than individual API round-trips, with only final results entering context.

Reported benefits:

- 37% token reduction on complex multi-tool tasks
- Eliminates inference overhead from multiple round-trips
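
As an illustration of the idea, the sketch below runs several tool calls inside one piece of orchestration code and hands back only the final result. The tools (`get_team_members`, `get_expenses`), their data, and the `call_log` counter are hypothetical stubs I made up for the example; nothing here reflects the actual beta API.

```python
call_log = []  # records every stubbed tool call, to show how many happen behind the scenes


def get_team_members(team):
    # Stub standing in for a real tool call (hypothetical data).
    call_log.append(("get_team_members", team))
    return ["ana", "ben", "chen"]


def get_expenses(member):
    # Stub standing in for a second tool (hypothetical data).
    call_log.append(("get_expenses", member))
    data = {"ana": [120.0, 80.0], "ben": [400.0, 250.0], "chen": [30.0]}
    return data[member]


def find_over_budget(team, limit):
    """Orchestration code: intermediate tool results stay in local variables."""
    over = {}
    for member in get_team_members(team):
        total = sum(get_expenses(member))
        if total > limit:
            over[member] = total
    return over


result = find_over_budget("engineering", limit=500.0)
print(result)         # the only value that would need to enter context
print(len(call_log))  # number of tool calls made to produce it
```

Without programmatic calling, each of those calls would be its own round-trip with its raw output added to context; here only the small `result` dict would be.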

Use Case

Users with extensive setups (multiple MCP servers like filesystem, github, puppeteer, brave-search, plus plugins with agents/skills/commands) are paying a substantial token cost on every session. These betas would allow:

- MCP server tools to defer loading until actually needed
- Plugin-defined tools/agents to use deferred discovery
- Complex multi-tool workflows to execute more efficiently

Proposed Implementation

- Add configuration options (perhaps in `settings.json` or `.claude/settings.json`) to enable these betas for users who want them
- Support the `defer_loading` flag in MCP server tool configurations
- Support `allowed_callers` for programmatic tool execution
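
One possible shape for this, purely as a strawman: the settings layout below (the `betas` block, its `toolSearch` and `programmaticToolCalling` keys, the per-server entries, and the `"code_execution"` caller value) is invented for illustration and is not an existing Claude Code setting. Only the `defer_loading` and `allowed_callers` names come from this request.

```python
import json

# Hypothetical settings fragment; none of these keys are confirmed Claude Code options.
HYPOTHETICAL_SETTINGS = """
{
  "betas": {"toolSearch": true, "programmaticToolCalling": false},
  "mcpServers": {
    "github": {"defer_loading": true, "allowed_callers": ["code_execution"]},
    "filesystem": {"defer_loading": false}
  }
}
"""


def resolve_tool_options(settings, server):
    """Per-server flags only take effect when the matching beta is opted into."""
    betas = settings.get("betas", {})
    server_cfg = settings.get("mcpServers", {}).get(server, {})
    defer = bool(betas.get("toolSearch")) and bool(server_cfg.get("defer_loading"))
    callers = server_cfg.get("allowed_callers", []) if betas.get("programmaticToolCalling") else []
    return {"defer_loading": defer, "allowed_callers": callers}


settings = json.loads(HYPOTHETICAL_SETTINGS)
print(resolve_tool_options(settings, "github"))
print(resolve_tool_options(settings, "filesystem"))
```

Gating the per-server flags behind a top-level opt-in would keep current behavior unchanged for anyone who does not enable the betas.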

Additional Context

Users with API/developer platform accounts already have access to these betas when using the API directly; this would bring that capability to Claude Code.
