Large language models can generate impressive language, but they still struggle to operate effectively within enterprise systems. To make these models useful, you need more than good prompts. You need a reliable way for models to find the right context, call the right tools, follow enterprise policies, and leave an auditable record of their actions.
The Model Context Protocol (MCP) offers that foundation. MCP standardizes how models discover, select, and call tools (Figure 1). This helps developers move from simple chatbots to reliable, active applications without reinventing every integration from scratch.
Figure 1: Model Context Protocol provides a standard for calling tools, services, and more from an AI application.
The evolution of interacting with LLMs
Early LLM applications were simple. You asked a question and the model responded based on the data it was trained on (Figure 2). If you typed "What is Red Hat?", the model searched its internal parameters for an answer drawn from the public internet. These responses reflected the model provider's training set, not your organization's unique knowledge.
Figure 2: Typical interactions with an LLM answering solely based on its training data.
That limitation quickly became obvious. Enterprise data—from design documents and Jira tickets to meeting transcripts and product wikis—lived outside the model's reach. Without that context, responses were generic and often incomplete. Developers began exploring ways to bring private knowledge into the model's reasoning process and context window (the amount of information it can "see" at once).
Chatting with your documents: RAG as a first step
The first major breakthrough was retrieval-augmented generation (RAG). RAG connects models to vector or graph databases so they can perform semantic search and retrieve information relevant to a query.
Imagine asking, "How do I create a data science project in Red Hat OpenShift AI?" A RAG pipeline searches internal documentation, retrieves the top results, and passes them into the model's context window. The model then generates a precise, grounded answer.
Figure 3 shows this pipeline in action. By supplementing what the model already knows with external knowledge, RAG makes answers much more accurate and up-to-date. Still, it requires constant retrieval, even for simple requests that do not need additional context. As models gained reasoning ability, it became clear that they should decide when to fetch data and even when to act.
Figure 3: Enterprise-specific data typically isn't known to LLMs. That's where a RAG pipeline can fetch data relevant to a user's question for a more grounded, accurate result.
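To make this concrete, here's a minimal sketch of the retrieve-then-generate shape in Python. The keyword-overlap retrieval and the sample documents are toy stand-ins so the snippet runs on its own; a real pipeline would use an embedding model, a vector or graph database, and an actual LLM call.

```python
# Toy RAG pipeline: retrieval here is naive keyword overlap so the example runs
# stand-alone. A real pipeline would embed the query, search a vector or graph
# database, and send the final prompt to an LLM.

DOCUMENTS = [
    "To create a data science project in Red Hat OpenShift AI, open the "
    "dashboard, select Data Science Projects, and click Create project.",
    "OpenShift AI workbenches provide Jupyter environments for experimentation.",
    "Model serving in OpenShift AI is handled through serving runtimes.",
]

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by shared words with the query (stand-in for semantic search)."""
    query_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(query_words & set(d.lower().split())), reverse=True)
    return scored[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Stuff the retrieved chunks into the model's context window ahead of the question."""
    context_block = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"

if __name__ == "__main__":
    question = "How do I create a data science project in Red Hat OpenShift AI?"
    prompt = build_prompt(question, retrieve(question, DOCUMENTS))
    print(prompt)  # In a real pipeline, this prompt is sent to the LLM for a grounded answer.
```

The key point is the shape: the application retrieves on every request and prepends whatever it finds, whether or not the model actually needed the extra context.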
Beyond RAG: The rise of agentic AI
This shift gave rise to tool calling, where an LLM does not just read information but performs actions through APIs. Instead of just searching a document store, a model can now access external services to retrieve or modify data. For example, when a user asks "What is the weather today?", the model can call the AccuWeather API to fetch the current forecast, interpret the response, and return a clear summary. In the same way, it might open a pull request in GitHub or check the health of a Kubernetes cluster and post a summary of any issues to Slack.
Figure 4 illustrates this evolution. It shows how the model identifies available tools, selects the right one for the task, and performs the action. The LLM becomes context-aware, understanding both the question and the systems it can interact with. This marks the transition from simple text generation to agentic AI, systems that can reason, plan, and act on behalf of the user.
Figure 4: Agentic AI simply means AI systems that can take action. Models reason about which decisions need to be made, then use custom instructions to call an API or service.
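A stripped-down version of that loop looks like the sketch below. The `call_llm` and `get_weather` functions are stubs so the example runs on its own, and the tool schema follows the JSON-style convention many chat APIs use, though exact field names vary by provider.

```python
# Minimal tool-calling loop, with the LLM and the weather API stubbed out so the
# sketch runs stand-alone. A real application would send TOOLS to a chat model
# and call a real forecast service such as AccuWeather.
import json

TOOLS = [{
    "name": "get_weather",
    "description": "Fetch the current forecast for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def call_llm(messages: list[dict], tools: list[dict]) -> dict:
    """Stub: a real model decides whether the request needs a tool or a direct answer."""
    if messages[-1]["role"] == "user":
        return {"tool_call": {"name": "get_weather", "arguments": {"city": "Boston"}}}
    return {"content": "It's 18°C and sunny in Boston today."}

def get_weather(city: str) -> dict:
    """Stub for an external weather API."""
    return {"city": city, "temperature_c": 18, "conditions": "sunny"}

def run(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    response = call_llm(messages, TOOLS)
    while "tool_call" in response:                 # The model chose to act, not just answer.
        call = response["tool_call"]
        result = get_weather(**call["arguments"])  # Execute the selected tool.
        messages.append({"role": "tool", "name": call["name"], "content": json.dumps(result)})
        response = call_llm(messages, TOOLS)       # Let the model interpret the result.
    return response["content"]

print(run("What is the weather today?"))
```

Every team writing this loop by hand, with its own tool schemas and dispatch code, is exactly the fragmentation described next.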
However, this new flexibility introduced new challenges. Every organization implemented tool calling differently, writing custom code for each integration. This resulted in duplication, fragmentation, and a lack of shared standards. MCP addresses these issues by defining one consistent way for models to communicate with tools.
Introducing the Model Context Protocol (MCP)
MCP defines a clear way for models to call external tools. Instead of hard-coding logic for every service, you register an MCP server that exposes an interface the model can understand.
A client, typically an LLM or an application using one, sends a standardized request. The server executes the action against a system like GitHub, Slack, or Kubernetes (all of which already have publicly available MCP servers). The MCP layer coordinates communication between them. See Figure 5.
Figure 5: Instead of each team building their own unique communication protocol between the LLM and various APIs, Model Context Protocol is a standard for clear, defined integration.
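For a sense of how little code a server needs, here is a minimal sketch using the FastMCP helper from the MCP Python SDK (`pip install mcp`). The cluster-health tool is a hypothetical stub; a real server would query the Kubernetes API.

```python
# Minimal MCP server sketch using the MCP Python SDK's FastMCP helper.
# The cluster-health check is a hypothetical stub; a real server would
# call the Kubernetes API here.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("cluster-tools")

@mcp.tool()
def check_cluster_health(namespace: str = "default") -> str:
    """Report the health of workloads in a Kubernetes namespace."""
    # Stand-in result; swap in real Kubernetes client calls here.
    return f"All deployments in '{namespace}' are healthy."

if __name__ == "__main__":
    mcp.run()  # Serves the tool over stdio so any MCP client can discover and call it.
```

Because the tool's name, description, and parameters are exposed through the protocol itself, any MCP-aware client can discover and call it without custom glue code.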
Think of MCP as a common language for model-to-tool interaction, much like TCP standardized network communication. Anthropic introduced the protocol in late 2024, and the ecosystem has expanded rapidly since then. There are now tens of thousands of community-built MCP servers that connect to data storage systems, communication platforms, productivity apps, and infrastructure tools. Clients such as Claude Desktop, ChatGPT, and many open source applications can register these servers and call them directly.
Making MCP enterprise-ready
The promise of MCP is clear, but for large companies to use it, they need more than just compatibility. They need trust, governance, and control.
These needs shape Red Hat's approach to MCP within Red Hat OpenShift AI. Red Hat is extending the platform to include built-in identity and access management through role-based access control (RBAC) and OAuth, lifecycle and metadata management for every MCP server, and enhanced observability as a core capability. With these controls in place, MCP becomes more useful, safe, and ready for production environments.
MCP on Red Hat AI
OpenShift AI 3 adds MCP support directly to the platform, giving AI engineers a fast way to connect models to tools. A guided creation flow walks users through spinning up an MCP server, containerizing it, and deploying it safely.
Once deployed, the Playground (Figure 6) allows teams to experiment safely. They can run a server alongside a model, observe its behavior, and fine-tune it before moving to production. Approved MCP servers then appear in the AI Assets listing, a centralized catalog where engineers can discover, reuse, and extend trusted integrations.
Figure 6: OpenShift AI's Playground allows teams to select, stage, and deploy MCP servers.
Because this capability integrates with Llama Stack, models can invoke MCP tools easily during inference. The result is an environment designed for instant experimentation today and certified, governed usage tomorrow.
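Under the hood, a tool invocation is an ordinary MCP client session: initialize, list tools, call one. The sketch below uses the MCP Python SDK's client interface over stdio to show that exchange; on OpenShift AI, Llama Stack handles this wiring for you, and `server.py` plus the tool name are placeholders matching the server sketch above.

```python
# Illustrative MCP client session using the MCP Python SDK (pip install "mcp").
# On OpenShift AI, Llama Stack performs this exchange on the model's behalf;
# this sketch only shows the protocol steps underneath.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(command="python", args=["server.py"])  # Placeholder server.

async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()                  # Handshake: exchange capabilities.
            tools = await session.list_tools()          # Discover what the model can call.
            print([tool.name for tool in tools.tools])
            result = await session.call_tool(           # Invoke a tool during inference.
                "check_cluster_health", arguments={"namespace": "default"}
            )
            print(result.content)

if __name__ == "__main__":
    asyncio.run(main())
```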
Red Hat's end-to-end MCP lifecycle
Red Hat's vision for MCP in OpenShift AI extends beyond experimentation. It includes a complete lifecycle for developing, certifying, and running MCP servers.
Everything begins with the registry, a secure staging area where new servers are scanned, quarantined if needed, and enriched with metadata. After they are validated, servers move into the catalog, a curated set of certified integrations available to all AI engineers within the organization.
Lifecycle management ensures that each server is tracked and versioned. Updates are scanned and signed automatically, while deprecated versions are retired safely. At runtime, the MCP gateway ties it all together. It enforces policy and RBAC rules, applies rate limits and logging, and provides consistent observability for every request.
Together, these components make MCP predictable and governable at scale.
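Red Hat hasn't published the gateway's internals, but the kind of enforcement described above is straightforward to picture. The following purely illustrative sketch (all names, roles, and limits are invented) wraps a tool call with an RBAC allow-list, a per-user rate limit, and an audit log line.

```python
# Purely illustrative gateway-style checks around a tool call: an RBAC
# allow-list, a per-user rate limit, and an audit log entry. Every name,
# role, and limit here is invented for illustration.
import time
from collections import defaultdict
from typing import Callable

ROLE_ALLOWED_TOOLS = {
    "sre": {"check_cluster_health", "post_to_slack"},
    "analyst": {"search_docs"},
}
RATE_LIMIT_PER_MINUTE = 30
_recent_calls: dict[str, list[float]] = defaultdict(list)

def gateway_call(user: str, role: str, tool: str, execute: Callable[[], str]) -> str:
    """Check RBAC, apply a per-user rate limit, log the request, then run the tool."""
    if tool not in ROLE_ALLOWED_TOOLS.get(role, set()):
        raise PermissionError(f"role '{role}' may not call '{tool}'")
    now = time.time()
    _recent_calls[user] = [t for t in _recent_calls[user] if now - t < 60]
    if len(_recent_calls[user]) >= RATE_LIMIT_PER_MINUTE:
        raise RuntimeError(f"rate limit exceeded for '{user}'")
    _recent_calls[user].append(now)
    print(f"audit: user={user} role={role} tool={tool}")  # Observability/audit hook.
    return execute()

# An allowed call goes through; a disallowed one would raise PermissionError.
print(gateway_call("alice", "sre", "check_cluster_health", lambda: "all deployments healthy"))
```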
A quick demo of Model Context Protocol
To see MCP in action, check out the following video demonstration. Through a chat interface (Goose in this example), the model connects to both OpenShift and Slack MCP servers. It retrieves system information, detects anomalies, and posts updates to Slack.
See how inference and tool calling work together to power real-time agentic applications with various enterprise systems.
The road ahead for MCP in OpenShift AI
In the near term, OpenShift AI 3.0 and subsequent releases will expand MCP support. MCP will appear in the AI Assets catalog and Playground to simplify experimentation (Figure 7). Red Hat will also onboard partner and ISV MCP servers for early access while previewing the full MCP registry, catalog, and gateway stack. Each release will add richer metadata for certification status, security scans, and maturity levels.
Figure 7: The Playground, where teams can experiment with MCP servers in a sandboxed environment.
Beyond that, Red Hat is building a longer-term vision for MCP-as-a-Service (MCPaaS), a managed layer for hosting, observing, and auditing MCP servers centrally. The AI Hub will unify models, agents, guardrails, and MCP servers as composable AI assets. The OpenShift DevHub plug-in will show developers which servers they can use directly within their workspace, and MCP for AI Coders will provide access in Red Hat OpenShift Dev Spaces or local environments.
As this brief history of LLM application design shows, MCP represents a major shift in how we connect AI models to existing systems. It lets developers build applications that act with context, enforce policy, and integrate safely into enterprise environments.
By integrating the Model Context Protocol into OpenShift AI, Red Hat is giving teams a complete foundation for agentic AI, one where experimentation, governance, and scalability coexist. Developers can test an MCP server in the Playground, publish it to the catalog, and deploy it with confidence.
As the ecosystem grows, expect MCP to become as fundamental to AI development as containers are to cloud infrastructure, a standard layer that makes intelligent automation predictable, secure, and reusable.