How MCP Works: A Look Under the Hood (Client-Server, Discovery & Tools)

In our previous post, we introduced the Model Context Protocol (MCP) as a universal standard designed to bridge AI agents and external tools or data sources. MCP promises interoperability, modularity, and scalability. This helps solve the long-standing issue of integrating AI systems with complex infrastructures in a standardized way. But how does MCP actually work?

Now, let's peek under the hood to understand its technical foundations. This article examines the architecture, communication mechanisms, discovery model, and tool-execution flow that make MCP a powerful enabler for modern AI systems. Whether you're building agent-based systems or integrating AI into enterprise tools, understanding MCP's internals will help you leverage it more effectively.

TL;DR: How MCP Works

MCP follows a client-server model that enables AI systems to use external tools and data. Here's a step-by-step overview of how it works:

1. Initialization
When the Host application starts (for example, a developer assistant or data analysis tool), it launches one or more MCP Clients. Each Client connects to its Server, and they exchange information about supported features and protocol versions through a handshake.

2. Discovery
The Clients ask the Servers what they can do. Servers respond with a list of available capabilities, which may include tools (like fetch_calendar_events), resources (like user profiles), or prompts (like report templates).

3. Context Provision
The Host application processes the discovered tools and resources. It can present prompts directly to the user or convert tools into a format the language model can understand, such as JSON function calls.

4. Invocation
When the language model decides a tool is needed, based on a user query like “What meetings do I have tomorrow?”, the Host directs the relevant Client to send a request to the Server.

5. Execution
The Server receives the request (for example, get_upcoming_meetings), performs the necessary operations (such as calling a calendar API), and gathers the results.

6. Response
The Server sends the results back to the Client.

7. Completion
The Client passes the result to the Host. The Host integrates the new information into the language model’s context, allowing it to respond to the user with accurate, real-time data.

MCP’s Client-Server Architecture 

At the heart of MCP is a client-server architecture, a design choice that offers clear separation of concerns, scalability, and flexibility. MCP provides a structured, bi-directional protocol for communication between AI agents (clients) and capability providers (servers). This architecture lets users integrate AI capabilities across applications while maintaining clear security boundaries between components.

MCP Hosts

These are applications (like Claude Desktop or AI-driven IDEs) needing access to external data or tools. The host application:

  • Creates and manages multiple client instances
  • Handles connection permissions and consent management
  • Coordinates session lifecycle and context aggregation
  • Acts as a gatekeeper, enforcing security policies

For example, in Claude Desktop, the host might manage several clients simultaneously, each connecting to a different MCP server, such as a document retriever, a local database, or a project management tool.

MCP Clients

MCP Clients are AI agents or applications seeking to use external tools or retrieve contextually relevant data. Each client:

  • Connects 1:1 with an MCP server
  • Maintains an isolated, stateful session
  • Negotiates capabilities and protocol versions
  • Routes requests and responses
  • Subscribes to notifications and updates

An MCP client is built using the protocol’s standardized interfaces, making it plug-and-play across a variety of servers. Once connected, it can invoke tools, access shared resources, and use contextual prompts without custom code or hardwired integrations.

MCP Servers

MCP Servers expose functionality to clients via standardized interfaces. They act as intermediaries to local or remote systems, offering structured access to tools, resources, and prompts. Each MCP server:

  • Exposes tools, resources, and prompts as primitives
  • Runs independently, either as a local subprocess or a remote HTTP service
  • Processes tool invocations securely and returns structured results
  • Respects all client-defined security constraints and policies

Servers can wrap local file systems, cloud APIs, databases, or enterprise apps like Salesforce or Git. Once developed, an MCP server is reusable across clients, dramatically reducing the need for custom integrations (solving the “N × M” problem).
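As a sketch of what the server side of a tool invocation might look like, here is a handler wrapping a stubbed backend. The tool name, index contents, and result shape are hypothetical stand-ins, not a real API:

```python
# Hypothetical server-side tool handler; the tool name, index contents, and
# result shape below are illustrative stand-ins, not a real API.
FAKE_INDEX = ["latest sales figures Q3", "hiring plan 2025"]

def handle_call_tool(params: dict) -> dict:
    if params["name"] != "search_knowledge_base":
        raise ValueError(f"unknown tool: {params['name']}")
    query = params["arguments"]["query"].lower()
    # A real server would call a search API or database here; this scans a stub.
    hits = [doc for doc in FAKE_INDEX if any(w in doc.lower() for w in query.split())]
    return {"content": [{"type": "text", "text": doc} for doc in hits]}
```

Because the handler only sees structured parameters and returns structured results, the same wrapper pattern applies whether the backend is a file system, a cloud API, or an enterprise app.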

Local Data Sources: Files, databases, or services securely accessed by MCP servers

Remote Services: External internet-based APIs or services accessed by MCP servers

Communication Protocol: JSON-RPC 2.0

MCP uses JSON-RPC 2.0, a stateless, lightweight remote procedure call protocol over JSON. Inspired by its use in the Language Server Protocol (LSP), JSON-RPC provides:

  • Minimal overhead for real-time communication
  • Human-readable, JSON-based message formats
  • Easy-to-debug, versioned interactions between systems

Message Types

  • Request: Sent by clients to invoke a tool or query available resources.
  • Response: Sent by servers to return results or confirmations.
  • Notification: Sent either way to indicate state changes without requiring a response.
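Concretely, the three message types might look like this. The method names follow MCP's JSON-RPC conventions, and the payloads are illustrative:

```python
# One example of each JSON-RPC 2.0 message type; payloads are illustrative.
request = {
    "jsonrpc": "2.0", "id": 7, "method": "tools/call",
    "params": {"name": "fetch_calendar_events", "arguments": {"day": "tomorrow"}},
}
response = {
    "jsonrpc": "2.0", "id": 7,          # echoes the request id
    "result": {"events": ["09:00 stand-up"]},
}
# Notifications carry no "id", so the receiver sends nothing back.
notification = {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"}
```

The `id` field is what pairs each response with its request; a message without an `id` is, by definition, a notification.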

The MCP protocol acts as the communication layer between these two components, standardizing how requests and responses are structured and exchanged. This separation offers several benefits:

  • Seamless Integration: Clients can connect to a wide range of servers without needing to know the specifics of each underlying system.
  • Reusability: Server developers can build integrations once and have them accessible to many different client applications.
  • Separation of Concerns: Different teams can focus on building client applications or server integrations independently. For example, an infrastructure team can manage an MCP server for a vector database, which can then be easily used by various AI application development teams.

Request Format

When an AI agent decides to use an external capability, it constructs a structured request:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "search_knowledge_base",
    "arguments": {"query": "latest sales figures"}
  },
  "id": 1
}

Server Response

The server validates the request, executes the tool, and sends back a structured result, which may include output data or an error message if something goes wrong.
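For example, a successful reply and an error reply might look like this. The result payload is made up, and the error code is the standard JSON-RPC "invalid params" code:

```python
import json

# A successful reply and an error reply; the result payload is invented, and
# -32602 is the standard JSON-RPC 2.0 "Invalid params" error code.
success = json.loads("""
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {"content": [{"type": "text", "text": "3 matching documents found"}]}
}
""")
failure = json.loads("""
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {"code": -32602, "message": "Invalid params: 'query' is required"}
}
""")
```

A reply carries either a `result` or an `error`, never both, which keeps client-side handling unambiguous.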

This communication model is inspired by the Language Server Protocol (LSP) used in IDEs, which also connects clients to analysis tools.

Dynamic Discovery: How AI Learns What It Can Do

A key innovation in MCP is dynamic discovery. When a client connects to a server, it doesn't rely on hardcoded tool definitions; instead, it queries the server at runtime to learn what capabilities are available. This enables:

Initial Handshake: When a client connects to an MCP server, it performs a handshake and then queries the server’s exposed capabilities. Rather than relying on pre-defined knowledge of what a server can do, the client dynamically discovers the tools, resources, and prompts the server makes available. In effect, it asks the server: “What tools, resources, or prompts do you offer?”

{
  "jsonrpc": "2.0",
  "method": "tools/list",
  "id": 2
}

Server Response: Capability Catalog

The server replies with a structured list of available primitives:

  • Tools
    These are executable functions that the AI model can invoke. Examples include search_database, send_email, or generate_report. Each tool is described using metadata that defines input parameters, expected output types, and operational constraints. This enables models to reason about how to use each tool correctly.

  • Resources
    Resources represent contextual data the AI might need to access—such as database schemas, file contents, or user configurations. Each resource is uniquely identified via a URI and can be fetched or subscribed to. This allows models to build awareness of their operational context.

  • Prompts
    These are predefined interaction templates that can be reused or parameterized. Prompts help standardize interactions with users or other systems, allowing AI models to retrieve and customize structured messaging flows for various tasks.

This discovery process allows AI agents to learn what they can do on the fly, enabling plug-and-play integration.
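A returned capability catalog might be shaped roughly like this. Every name, URI, and description below is invented for illustration:

```python
# A capability catalog as a plain Python structure; every name, URI, and
# description here is invented for illustration.
server_capabilities = {
    "tools": [{
        "name": "search_database",
        "description": "Run a read-only query against the sales database",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }],
    "resources": [{
        "uri": "file:///schemas/sales.sql",
        "name": "Sales database schema",
    }],
    "prompts": [{
        "name": "weekly_report",
        "description": "Parameterized template for the weekly sales summary",
    }],
}
```

The input schema attached to each tool is what lets the model (and the client) reason about correct usage before ever invoking it.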

This approach to capability discovery provides several significant advantages:

  • Zero Manual Setup: Clients don’t need to be pre-configured with knowledge of server tools.
  • Simplified Development: Developers don’t need to engineer complex prompt scaffolding for each tool.
  • Future-Proofing: Servers can evolve, adding new tools or modifying existing ones, without requiring updates to client applications.
  • Runtime Adaptability: AI agents can adapt their behavior based on the capabilities of each connected server, making them more intelligent and autonomous.

Structured Tool Execution: How AI Invokes and Uses Capabilities

Once the AI client has discovered the server’s available capabilities, the next step is execution. This involves using those tools securely, reliably, and interpretably. The lifecycle of tool execution in MCP follows a well-defined, structured flow:

  1. Decision Point
    The AI model, during its reasoning process, identifies the need to use an external capability (e.g., “I need to query a sales database”).
  2. Request Construction
    The MCP client constructs a structured JSON-RPC request to invoke the desired tool, including the tool name and any necessary input arguments.
  3. Routing and Validation
    The request is routed to the appropriate MCP server. The server validates the input, applies any relevant access control policies, and ensures the requested tool is available and safe to execute.
  4. Execution
    The server executes the tool logic, whether that means querying a database, making an API call, or performing a computation.
  5. Response Handling
    The server returns a structured result, which could be data, a confirmation message, or an error report. The client then passes this response back to the AI model for further reasoning or user-facing output.
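Step 3, validation, can be sketched as a hand-rolled check of arguments against a tool's declared schema. A real server would more likely use a JSON Schema library; this is illustration only:

```python
# Hand-rolled validation of tool arguments against a declared schema; a real
# server would more likely use a JSON Schema library.
def validate_arguments(schema: dict, arguments: dict) -> list[str]:
    errors = []
    for field in schema.get("required", []):
        if field not in arguments:
            errors.append(f"missing required argument: {field}")
    type_map = {"string": str, "integer": int, "boolean": bool}
    for field, spec in schema.get("properties", {}).items():
        expected = type_map.get(spec.get("type"))
        if field in arguments and expected and not isinstance(arguments[field], expected):
            errors.append(f"{field}: expected {spec['type']}")
    return errors

schema = {"properties": {"query": {"type": "string"}}, "required": ["query"]}
```

Rejecting malformed input before execution is what turns a raw API wrapper into a safe, predictable tool.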

This flow ensures execution is secure, auditable, and interpretable, unlike ad-hoc integrations where tools are invoked via custom scripts or middleware. MCP’s structured approach provides:

  • Security: Tool usage is sandboxed and constrained by the client-server boundary and policy enforcement.
  • Auditability: Every tool call is traceable, making it easy to debug, monitor, and govern AI behavior.
  • Reliability: Clear schema definitions reduce the chance of malformed inputs or unexpected failures.
  • Model-to-Model Coordination: Structured messages can be interpreted and passed between AI agents, enabling collaborative workflows.

Server Modes: Local (stdio) vs. Remote (HTTP/SSE)

MCP Servers are the bridge between the MCP world and the specific functionality of an external system (an API, a database, local files, etc.). Servers communicate with clients primarily via two transport methods:

Local (stdio) Mode

  • The server is launched as a local subprocess
  • Communication happens over stdin/stdout
  • Ideal for local tools like:
    • File systems
    • Local databases
    • Scripted automation tasks

Remote (HTTP/SSE) Mode

  • The server runs as a remote web service
  • Communicates using Server-Sent Events (SSE) and HTTP
  • Best suited for:
    • Cloud-based APIs
    • Shared enterprise systems
    • Scalable backend services

Regardless of the mode, the client’s logic remains unchanged. This abstraction allows developers to build and deploy tools with ease, choosing the right mode for their operational needs.
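That abstraction can be illustrated with a toy transport: the client code only ever calls send and receive, so swapping stdio for HTTP/SSE doesn't touch it. The in-memory streams below stand in for a real subprocess's pipes:

```python
import io
import json

# The client only ever calls send()/receive(); what sits behind them could be
# a subprocess's stdin/stdout or an HTTP connection. In-memory streams stand
# in for a real local server's pipes here.
class StdioTransport:
    """Newline-delimited JSON over a pair of text streams."""

    def __init__(self, reader, writer):
        self.reader, self.writer = reader, writer

    def send(self, message: dict) -> None:
        self.writer.write(json.dumps(message) + "\n")

    def receive(self) -> dict:
        return json.loads(self.reader.readline())

# Pretend the server process already wrote one response on its stdout.
server_stdout = io.StringIO('{"jsonrpc": "2.0", "id": 1, "result": {"ok": true}}\n')
client_to_server = io.StringIO()

transport = StdioTransport(reader=server_stdout, writer=client_to_server)
transport.send({"jsonrpc": "2.0", "method": "tools/list", "id": 1})
reply = transport.receive()
```

An HTTP/SSE transport would implement the same two methods over a network connection, leaving everything above the transport layer untouched.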

Decoupling Intent from Implementation

One of the most elegant design principles behind MCP is decoupling AI intent from implementation. In traditional architectures, an AI agent needed custom logic or prompts to interact with every external tool. MCP breaks this paradigm:

  • Client expresses intent: “I want to use this tool with these inputs.”
  • Server handles implementation: Executes the action securely and returns the result.

This separation unlocks huge benefits:

  • Portability: The same AI agent can work with any compliant server
  • Security: Tool execution is sandboxed and auditable
  • Maintainability: Backend systems can evolve without affecting AI agents
  • Scalability: New tools can be added rapidly without client-side changes

Conclusion

The Model Context Protocol is more than a technical standard; it's a new way of thinking about how AI interacts with the world. By defining a structured, extensible, and secure protocol for connecting AI agents to external tools and data, MCP lays the foundation for building modular, interoperable, and scalable AI systems.

Key takeaways:

  • MCP uses a client-server architecture inspired by LSP
  • JSON-RPC 2.0 enables structured, reliable communication
  • Dynamic discovery makes tools plug-and-play
  • Tool invocations are secure and verifiable
  • Servers can run locally or remotely with no protocol changes
  • Intent and implementation are cleanly decoupled

As the ecosystem around AI agents continues to grow, protocols like MCP will be essential to manage complexity, ensure security, and unlock new capabilities. Whether you're building AI-enhanced developer tools, enterprise assistants, or creative AI applications, understanding how MCP works under the hood is your first step toward building robust, future-ready systems.

FAQs

1. What’s the difference between a host, client, and server in MCP? 

  • A host runs and manages multiple AI agents (clients), handling permissions and context.
  • A client is the AI entity that requests capabilities.
  • A server provides access to tools, resources, and prompts.

2. Can one AI client connect to multiple servers?
Yes, a single MCP client can connect to multiple servers, each offering different tools or services. This allows AI agents to function more effectively across domains. For example, a project manager agent could simultaneously use one server to access project management tools (like Jira or Trello) and another server to query internal documentation or databases.

3. Why does MCP use JSON-RPC instead of REST or GraphQL?
JSON-RPC was chosen because it supports lightweight, bi-directional communication with minimal overhead. Unlike REST or GraphQL, which are designed around request-response paradigms, JSON-RPC allows both sides (client and server) to send notifications or make calls, which fits better with the way LLMs invoke tools dynamically and asynchronously. It also makes serialization of function calls cleaner, especially when handling structured input/output.

4. How does dynamic discovery improve developer experience?
With MCP’s dynamic discovery model, clients don’t need pre-coded knowledge of tools or prompts. At runtime, clients query servers to fetch a list of available capabilities along with their metadata. This removes boilerplate setup and enables developers to plug in new tools or update functionality without changing client-side logic. It also encourages a more modular and composable system architecture.

5. How is tool execution kept secure and reliable in MCP?
Tool invocations in MCP are gated by multiple layers of control:

  • Boundaries: Clients and servers are separate processes or services, allowing strict boundary enforcement.
  • Validation: Each request is validated for correct parameters and permissions before execution.
  • Access policies: The Host can define which clients have access to which tools, ensuring misuse is prevented.
  • Auditing: Every tool call is logged, enabling traceability and accountability—important for enterprise use cases.

6. How is versioning handled in MCP?
Versioning is built into the handshake process. When a client connects to a server, both sides exchange metadata that includes supported protocol versions, capability versions, and other compatibility information. This ensures that even as tools evolve, clients can gracefully degrade or adapt, allowing continuous deployment without breaking compatibility.
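A minimal sketch of that negotiation follows; the fallback rule and version strings here are assumptions for illustration, not the normative spec:

```python
# Toy version negotiation during the handshake; the fallback rule and the
# version strings are assumptions for illustration.
SERVER_SUPPORTED = ["2024-11-05", "2025-03-26"]  # oldest to newest

def negotiate(client_requested: str, server_supported: list[str]) -> str:
    # Use the client's version if the server knows it; otherwise fall back to
    # the newest version the server supports, and let the client decide
    # whether it can work with that.
    if client_requested in server_supported:
        return client_requested
    return server_supported[-1]
```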

7. Can MCP be used across different AI models or agents?
Yes. MCP is designed to be model-agnostic. Any AI model—whether it’s a proprietary LLM, open-source foundation model, or a fine-tuned transformer—can act as a client if it can construct and interpret JSON-RPC messages. This makes MCP a flexible framework for building hybrid agents or systems that integrate multiple AI backends.

8. How does error handling work in MCP?
Errors are communicated through structured JSON-RPC error responses. These include a standard error code, a message, and optional data for debugging. The Host or client can log, retry, or escalate errors depending on the severity and the use case—helping maintain robustness in production systems.
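On the client side, handling such a structured error might look like this. The code -32601 ("Method not found") is a standard JSON-RPC code, while the retryable set is an invented example policy:

```python
import json

# Client-side handling of a structured JSON-RPC error; -32601 is the standard
# "Method not found" code, and the retryable set is an invented example policy.
RETRYABLE = {-32000}  # e.g. a server-defined "transient failure" code

raw = '{"jsonrpc": "2.0", "id": 3, "error": {"code": -32601, "message": "Method not found"}}'
reply = json.loads(raw)
if "error" in reply:
    err = reply["error"]
    action = "retry" if err["code"] in RETRYABLE else "escalate"
```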
