MCP had been on my radar for months. I got the concept — a protocol that lets LLMs call external tools through a standardized interface. But I’d only watched from the sidelines. I’ve professionally built the “traditional” version of this: Text2SQL pipelines, semantic search, explicit routing logic. So I had a real question: does MCP actually add value, or is it just a different way to do the same thing?

To find out, I picked a domain I love watching — Test cricket — and built a system with two backends: SQL for historical stats, vector search for news. Here’s what I learned.

The Mental Shift

The traditional approach to building LLM-powered pipelines involves structured orchestration. Even with an LLM in the loop, you design explicit branches and control the flow. Something like this in LangGraph:

```python
# Node: LLM classifies the query type
def classify_query(state):
    response = llm.invoke(
        f"Classify this query as 'stats' or 'news': {state['query']}"
    )
    return {"query_type": response.content}

# Conditional edge: Route based on classification
def route_by_type(state):
    if state["query_type"] == "stats":
        return "sql_node"
    else:
        return "vector_search_node"

# Graph structure (llm and graph are assumed to be defined elsewhere)
graph.add_node("classify", classify_query)
graph.add_node("sql_node", execute_sql)
graph.add_node("vector_search_node", search_vector_db)
graph.add_conditional_edges("classify", route_by_type)
```

This works, and it gives you control. But it comes with its own challenges. You need to anticipate query types upfront. Adding a new data source means modifying the graph — new nodes, new branches, updated classification prompts. Mixed queries that need both stats and news? That’s another conditional path to handle. All in all, the orchestration logic is yours to design and maintain.

Enter MCP…

Instead of designing the graph, you define tools — atomic functions with clear descriptions. The LLM reads these descriptions and decides which to invoke, in what order, based on the query. No conditional edges. No explicit routing.

The shift: I describe capabilities. The orchestrator/LLM decides.

In my setup, I defined four tools:

- get_database_schema — returns table structures before generating SQL
- execute_sql — runs queries against the cricket statistics database
- search_chromadb — semantic search over news articles
- query_cricket_articles — fetches fresh news from an external API

Each tool does one thing. The descriptions tell the LLM when each is appropriate. I didn’t write a single if-else for routing — the orchestration emerges from the tool definitions themselves.

Here’s what the resulting flow looks like:

[Figure: Agent Flowchart]

Notice the news query path — if search_chromadb returns no results, the agent calls query_cricket_articles to fetch fresh data, ingests it, and searches again. I didn’t code this fallback logic explicitly. It emerged from how I described the tools: “Use when search_chromadb returns no results.” The LLM figured out the chaining.
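To make that concrete, here is a sketch of the idea. The exact wording lives in mcp_server.py in the repo, so treat these strings as illustrative rather than the actual definitions:

```python
# Illustrative sketch only; the real descriptions are in mcp_server.py.
# The fallback chain is never written as code. It emerges from how these
# two descriptions read together when the LLM is choosing its next tool.
SEARCH_CHROMADB_DESCRIPTION = (
    "Semantic search over ingested cricket news articles. "
    "Try this first for any news or opinion question."
)

QUERY_CRICKET_ARTICLES_DESCRIPTION = (
    "Fetch fresh cricket news from the external news API. "
    "Use when search_chromadb returns no results; ingest the fetched "
    "articles and search again."
)
```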
Here’s what defining these tools looks like in the MCP server:

```python
import json

from mcp.server import Server
from mcp.types import Tool, TextContent

app = Server("cricket-query-server")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="execute_sql",
            description=(
                "Execute SQL query on cricket database (Test cricket 1877-2024). "
                "Use get_database_schema first to see available tables."
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {
                        "type": "string",
                        "description": "SQL query to execute (SELECT only)"
                    }
                },
                "required": ["sql"]
            }
        ),
        # … other tools defined similarly
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "execute_sql":
        result = execute_sql(sql=arguments["sql"])
    elif name == "search_chromadb":
        result = search_chromadb(query=arguments["query"])
    # … other tools
    return [TextContent(type="text", text=json.dumps(result))]
```

Two decorators: that’s the core of it. list_tools tells the LLM what’s available and when to use each tool. call_tool handles the actual execution when the LLM makes a choice. The description is doing the heavy lifting — it’s not just documentation, it’s the routing instruction.

Full implementation: mcp_server.py on GitHub

Scaling the System

Building the initial version is the easy part. The real question is how painfully it evolves. With structured orchestration, every new capability means touching the graph — new nodes, new classification categories, updated routing logic. The complexity compounds. With MCP, scaling looks different.

Vertical Scaling: Adding Depth

This is about going deeper within the same domain. Right now, my cricket system has SQL for statistics and vector search for news. What if I want to add Knowledge Graphs for relationship queries — player rivalries, team histories, head-to-head records?

With MCP, this is straightforward:

1. Create kg_tools.py with functions like query_player_relationships()
2. Register it in the existing MCP server
3. Update the system prompt to guide when Knowledge Graphs are appropriate

The existing SQL and vector search tools don’t change. The new tool sits alongside them, and the LLM decides when to use it based on the description. No refactoring, no rewiring — just add and describe.
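As a sketch of what step 2 could look like, the Knowledge Graph tool would simply be one more entry in the list returned by list_tools, described the same way as the existing tools. The name query_player_relationships and its parameters are assumptions for illustration, not code from the repo:

```python
from mcp.types import Tool

# Hypothetical sketch: a Knowledge Graph tool registered next to the existing
# ones. Neither this tool nor kg_tools.py exists in the current repo; the
# point is that adding depth means one more Tool entry plus a description.
Tool(
    name="query_player_relationships",
    description=(
        "Query the cricket Knowledge Graph for relationships: player "
        "rivalries, team histories, head-to-head records. Use for questions "
        "about how players or teams relate to each other, not for raw "
        "statistics (use execute_sql for those)."
    ),
    inputSchema={
        "type": "object",
        "properties": {
            "player": {
                "type": "string",
                "description": "Player or team name to start from"
            },
            "relationship": {
                "type": "string",
                "description": "Relationship type, e.g. 'rivalry' or 'head_to_head'"
            }
        },
        "required": ["player"]
    }
)
```

The SQL and vector search entries stay untouched; the LLM starts routing relationship questions to the new tool purely because the description says so.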
Horizontal Scaling: Adding Breadth

This is about going wider — evolving from cricket intelligence to sports intelligence. Say I want to expand coverage to football, tennis, Formula 1. With MCP, the cleaner approach is a separate server for each domain:

- cricket-mcp-server
- football-mcp-server
- tennis-mcp-server
- f1-mcp-server

Each server is self-contained, with its own data sources and tools. Adding a new sport doesn’t touch existing ones — no interference, no risk of breaking what already works. The agent connects to multiple servers and sees all available tools, routing based on descriptions regardless of which server hosts the tool. A query about Messi goes to football. A query about Kohli goes to cricket. A query about Verstappen goes to F1. Same orchestration layer, same architecture — just broader intelligence.

From MCP Server to Agent

Defining tools on the server side was straightforward — decorators, descriptions, handlers. The next step is connecting that server to an LLM so it can actually use those tools.

MCP uses STDIO as its default transport for local execution. The client spawns the server as a subprocess and communicates through standard input and output. This means working with async streams, context managers, and session handling.

Here’s the connection sequence:

```python
import os
import sys

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# (This runs inside an async method of the agent class, hence the self.)

# Define how to spawn the server
server_params = StdioServerParameters(
    command=sys.executable,
    args=["path/to/mcp_server.py"],
    env={**os.environ}
)

# Establish the connection
self.client_context = stdio_client(server_params)
self.read_stream, self.write_stream = await self.client_context.__aenter__()

# Create and initialize the session
self.session = ClientSession(self.read_stream, self.write_stream)
await self.session.__aenter__()
await self.session.initialize()

# Fetch available tools from the server
tools_list = await self.session.list_tools()
```

The __aenter__ calls enter async context managers that set up resources. The initialize() call performs the protocol handshake between client and server. None of this is unique to MCP — it’s standard async Python. But if you’ve spent most of your time in ML frameworks where concurrency is abstracted away, this is the layer where you’ll need to get comfortable.

P.S. — I’m working on getting more comfortable with this myself. 😬

Once the connection is established, the tools need to be converted into a format the LLM can work with. In my case, I’m using LangChain, so each MCP tool becomes a StructuredTool:

```python
from langchain_core.tools import StructuredTool

# Convert MCP tools to LangChain tools
langchain_tools = []
for mcp_tool in tools_list.tools:
    # Default argument pins the tool name for this iteration
    # (avoids the late-binding closure problem in loops)
    async def tool_func(_tool_name=mcp_tool.name, **kwargs):
        result = await self.session.call_tool(_tool_name, arguments=kwargs)
        return result.content[0].text

    lc_tool = StructuredTool.from_function(
        coroutine=tool_func,
        name=mcp_tool.name,
        description=mcp_tool.description
    )
    langchain_tools.append(lc_tool)

# Bind tools to the LLM
llm_with_tools = llm.bind_tools(langchain_tools)
```

That’s the loop closed. The MCP server defines what tools exist and what they do. The client connects and fetches them. The tools get bound to the LLM. Now when the LLM decides to call execute_sql or search_chromadb, the request flows through the MCP session back to the server, executes, and returns.

Full implementation: mcp_client on GitHub

So, Does MCP Add Value?

If it isn’t clear from the blog so far — yes. MCP is a genuinely useful addition for anyone building LLM-powered products. The pattern of atomic tools, decoupled servers, and LLM-driven orchestration scales cleanly in ways that structured orchestration doesn’t.

That said, it’s not a silver bullet. When you need deterministic behavior — guaranteed execution paths, predictable latency, auditable decisions — structured orchestration still wins. As engineers, it’s our job to weigh the tradeoffs, not adopt frameworks wholesale.

So, does MCP offer any real advantage? For the right use cases — yes. It’s not magic, but it’s a cleaner way to scale when flexibility matters more than determinism.

The code is on GitHub: cricket-intelligence-system