Summary
Claude Code loads all tool definitions upfront at session start, which consumes significant context tokens - especially for users with multiple MCP servers, plugins, and agents configured. Anthropic has released beta features specifically designed to address this: Tool Search Tool and Programmatic Tool Calling.
These are documented at: https://www.anthropic.com/engineering/advanced-tool-use
Feature Request
Add support for the following API betas in Claude Code:
1. Tool Search Tool (beta header: advanced-tool-use-2025-11-20)
Allow tools to be marked with defer_loading: true so they remain discoverable without consuming context tokens at session start. Claude would discover relevant tools on-demand via a search mechanism.
Reported benefits:
- 85% reduction in token usage while maintaining full tool access
- Significant accuracy improvements (Opus 4: 49% → 74%, Opus 4.5: 79.5% → 88.1%)
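To make the deferral idea concrete, here is a minimal local sketch of on-demand discovery. It is not Claude Code's or the API's implementation: the ToolDef and ToolRegistry classes, the example tool names, and the regex-based search are all assumptions made for illustration. Only the defer_loading flag comes from the request above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToolDef:
    name: str
    description: str
    defer_loading: bool = False  # mirrors the flag named in this request


@dataclass
class ToolRegistry:
    """Hypothetical registry; not part of any real SDK."""
    tools: list = field(default_factory=list)

    def initial_context(self):
        """Definitions placed in context at session start: non-deferred only."""
        return [t for t in self.tools if not t.defer_loading]

    def search(self, pattern):
        """Return deferred tools whose name or description matches the regex."""
        rx = re.compile(pattern, re.IGNORECASE)
        return [
            t for t in self.tools
            if t.defer_loading and (rx.search(t.name) or rx.search(t.description))
        ]


registry = ToolRegistry([
    ToolDef("read_file", "Read a file from the local workspace"),
    ToolDef("github_create_pull_request", "Open a pull request on GitHub",
            defer_loading=True),
    ToolDef("puppeteer_screenshot", "Capture a screenshot of a web page",
            defer_loading=True),
])

# Only the always-loaded tool costs context tokens up front.
loaded_at_start = [t.name for t in registry.initial_context()]
# A deferred tool is pulled in only when a search matches it.
found = [t.name for t in registry.search(r"pull request")]
```

In this sketch the two deferred definitions never enter the initial context; they surface only when a query such as "pull request" matches them.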
2. Programmatic Tool Calling (beta header: advanced-tool-use-2025-11-20)
Allow Claude to orchestrate multiple tools through code execution rather than individual API round-trips, with only final results entering context.
Reported benefits:
- 37% token reduction on complex multi-tool tasks
- Eliminates inference overhead from multiple round-trips
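The orchestration pattern can be sketched as plain Python. Everything here is hypothetical: list_team_members and get_expenses stand in for real tools, the data is made up, and orchestrate represents code that would run in a code-execution sandbox. The point is that four tool calls happen inside the script, while only the short final list would be returned to the model's context.

```python
# Log of tool invocations, kept only to show how many calls the script makes.
calls = []


def list_team_members(team):
    """Hypothetical tool: returns member ids for a team."""
    calls.append(("list_team_members", team))
    return ["ana", "bo", "cy"]


def get_expenses(member):
    """Hypothetical tool: returns a member's expense line items."""
    calls.append(("get_expenses", member))
    data = {"ana": [120.0, 900.0], "bo": [40.0], "cy": [700.0, 650.0]}
    return data[member]


def orchestrate(budget):
    """Chains the tools in code; intermediate results never leave this function."""
    over_budget = []
    for member in list_team_members("engineering"):
        if sum(get_expenses(member)) > budget:
            over_budget.append(member)
    return over_budget


# Only this final value would enter the model's context.
result = orchestrate(budget=1000.0)
```

Without programmatic calling, each of those four tool calls would be a separate round-trip with its full result added to context.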
Use Case
Users with extensive setups (multiple MCP servers such as filesystem, github, puppeteer, and brave-search, plus plugins with agents/skills/commands) pay a substantial token cost in every session. These betas would allow:
- MCP server tools to defer loading until actually needed
- Plugin-defined tools/agents to use deferred discovery
- Complex multi-tool workflows to execute more efficiently
Proposed Implementation
- Add configuration options (perhaps in settings.json or .claude/settings.json) to enable these betas for users who want them
- Support defer_loading flag in MCP server tool configurations
- Support allowed_callers for programmatic tool execution
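One way the configuration above might be applied is sketched below. This is purely illustrative: the deferMcpTools and programmaticCallers keys, the "code_execution" caller value, and the apply_settings helper are invented for this example and do not exist in Claude Code today. Only the defer_loading and allowed_callers field names come from the proposal.

```python
import copy

# Hypothetical settings shape; these keys are assumptions, not real options.
settings = {
    "deferMcpTools": {"github": True, "filesystem": False},
    "programmaticCallers": ["code_execution"],  # placeholder caller name
}


def apply_settings(server, tools, settings):
    """Return copies of a server's tool definitions with the beta fields set."""
    out = copy.deepcopy(tools)  # leave the caller's definitions untouched
    defer = settings.get("deferMcpTools", {}).get(server, False)
    callers = settings.get("programmaticCallers", [])
    for tool in out:
        if defer:
            tool["defer_loading"] = True
        if callers:
            tool["allowed_callers"] = list(callers)
    return out


github_tools = apply_settings("github", [{"name": "create_issue"}], settings)
filesystem_tools = apply_settings("filesystem", [{"name": "read_file"}], settings)
```

Keeping deferral per server would let users leave frequently used tools (here, filesystem) loaded up front while deferring the rest.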
Additional Context
Users with API/developer platform accounts already have access to these betas when using the API directly - this would bring that capability to Claude Code.