Local MCP Server
At Stainless, we generate client SDKs in various languages for our customers' APIs. Recently, those same customers started asking us for something new: a Model Context Protocol (MCP) server that wraps their API and exposes it as a set of tools for LLMs.
This guide explains what a local MCP server is, how it differs from remote servers, and how to build one from scratch.
What is a local MCP server?
A local MCP (Model Context Protocol) server is a program that runs on your computer and connects AI models to tools and data on your machine. Unlike remote servers that operate in the cloud, local MCP servers run on the same device as your AI application.
The Model Context Protocol defines how AI models can request actions from external tools. When you run an MCP server locally, it gives AI models a way to interact with files, applications, and services on your computer without sending data to external servers.
Local MCP servers typically communicate through standard input/output (stdio) rather than HTTP requests. This direct communication method is faster and more secure for local operations.
Here's why people use local MCP servers:
Privacy: Your data stays on your computer
Development: Easier to test and debug
Control: You decide which tools the AI can access
Offline use: Works without internet connection
Local MCP server vs remote MCP server
The main difference between local and remote MCP servers is where they run and how they communicate.
| Feature | Local MCP Server | Remote MCP Server |
|---|---|---|
| Location | Your computer | Cloud or external server |
| Communication | stdio (standard input/output) | HTTP/JSON-RPC |
| Setup | Manual installation | Service registration |
| Access | Only from your device | Any authorized client |
| Best for | Development, privacy, testing | Production, sharing, scaling |
Local servers are ideal when you need to work with sensitive data or want complete control over the server environment. Remote servers make more sense when you need to share tools across multiple users or integrate with cloud services.
Claude Desktop, for example, can connect to both local and remote MCP servers. It uses a configuration file to locate and launch local servers, while connecting to remote servers through HTTP endpoints.
How to set up a local MCP server
Setting up a local MCP server involves a few key steps:
Choose a programming language (Python, TypeScript, etc.)
Install the MCP SDK for that language
Define your tools and their schemas
Configure the server transport (stdio for local servers)
Register the server with your MCP client
Here's a simple example using Python:
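This sketch assumes the official MCP Python SDK (installed with `pip install "mcp[cli]"`) and its `FastMCP` helper; the weather data is a hardcoded placeholder rather than a real API call.

```python
from mcp.server.fastmcp import FastMCP

# Create a named server; the name is what MCP clients display.
mcp = FastMCP("weather")

@mcp.tool()
def get_weather(city: str) -> str:
    """Get a short weather description for a city."""
    # Placeholder response; a real server would query a weather API here.
    return f"It is sunny and 22°C in {city}."

if __name__ == "__main__":
    # stdio transport: the client launches this process and exchanges
    # messages over standard input/output.
    mcp.run(transport="stdio")
```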
This simple server exposes a single tool called `get_weather` that takes a city name and returns a weather description.
Configuring your MCP client
After creating your server, you need to tell your MCP client (like Claude Desktop) how to find and launch it. This is done through a configuration file.
For Claude Desktop, this file is located at:
Mac: `~/Library/Application Support/Claude/claude_desktop_config.json`
Windows: `%AppData%\Claude\claude_desktop_config.json`
Here's an example configuration:
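A minimal sketch, assuming the server script above is saved at `/path/to/weather_server.py` (the path and the `weather` server name are placeholders; adjust them for your machine):

```json
{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["/path/to/weather_server.py"]
    }
  }
}
```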
This tells Claude Desktop to run your server using Python when it needs to access weather information. The server will appear in Claude's interface as a tool that can be used during conversations.
Converting OpenAPI specs to MCP tools
While both MCP tools and OpenAPI specs use JSON Schema, you can't just copy them over as-is. You have to combine the request body, path parameters, query parameters, and header parameters all into one schema, and handle any naming collisions automatically.
For example, a hypothetical OpenAPI endpoint like this one, with a path parameter, a query parameter, and a request body:
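```yaml
# Hypothetical endpoint, invented so all three parameter locations appear
paths:
  /users/{user_id}/posts:
    post:
      operationId: create_post
      parameters:
        - name: user_id
          in: path
          required: true
          schema:
            type: string
        - name: notify
          in: query
          required: false
          schema:
            type: boolean
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                title:
                  type: string
              required:
                - title
```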
Would need to be converted to an MCP tool schema like:
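```json
{
  "name": "create_post",
  "description": "Create a post for a user",
  "inputSchema": {
    "type": "object",
    "properties": {
      "user_id": { "type": "string" },
      "notify": { "type": "boolean" },
      "title": { "type": "string" }
    },
    "required": ["user_id", "title"]
  }
}
```

Here `user_id` (from the path), `notify` (from the query string), and `title` (from the request body) are merged into a single input schema, with the required constraints from each location preserved.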
This transformation process needs to handle several challenges:
Combining parameters from different locations
Resolving naming conflicts
Converting path templates to parameter names
Preserving required field constraints
Handling $refs and recursive references
OpenAPI schemas use `$ref` to point to chunks of reusable schema elsewhere in the file. However, MCP tool schemas must be completely self-contained, meaning they cannot reference anything outside themselves.
This means you need to resolve all references by replacing each `$ref` with the actual schema it points to. For example, assuming a hypothetical `User` component schema with a single `name` property, a schema like this:
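```json
{
  "type": "object",
  "properties": {
    "author": { "$ref": "#/components/schemas/User" }
  }
}
```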
Must become:
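```json
{
  "type": "object",
  "properties": {
    "author": {
      "type": "object",
      "properties": {
        "name": { "type": "string" }
      }
    }
  }
}
```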
Recursive references (like a User that contains an array of Users) need special handling to avoid infinite recursion during resolution.
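One way to handle this is to track which references are currently being expanded and cut the schema off when a cycle appears. Here is a minimal sketch, assuming only internal `#/`-style references (it also skips the JSON Pointer escape handling a production resolver would need):

```python
from typing import Any

def resolve_refs(schema: Any, root: dict, seen: frozenset = frozenset()) -> Any:
    """Inline internal $ref pointers to make a schema self-contained.

    `root` is the full document that `#/...` pointers resolve against.
    `seen` tracks refs already being expanded, so recursive schemas
    (e.g. a User containing an array of Users) terminate.
    """
    if isinstance(schema, dict):
        if "$ref" in schema:
            ref = schema["$ref"]
            if ref in seen:
                # Cycle detected: stop expanding with a generic object.
                return {"type": "object"}
            target: Any = root
            for part in ref.removeprefix("#/").split("/"):
                target = target[part]
            return resolve_refs(target, root, seen | {ref})
        return {key: resolve_refs(value, root, seen) for key, value in schema.items()}
    if isinstance(schema, list):
        return [resolve_refs(item, root, seen) for item in schema]
    return schema
```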
Managing large APIs
Many APIs have too many endpoints to expose all at once. MCP currently works by loading all of a server's tool schemas into the model's context at once; the LLM then chooses which tool to use based on your request.
For APIs with hundreds of endpoints, this creates challenges:
Context limits: LLMs have fixed context windows
Tool selection: Too many tools make selection difficult
Performance: Loading all tools slows down the system
To address this, you can:
Create multiple smaller MCP servers that each handle a subset of your API (see the config sketch after this list)
Build a dynamic tool selection system that only loads relevant tools
Use a proxy pattern where a lightweight tool helps select more specific tools
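For example, the first approach could split one large API across several entries in the client configuration. The server names and script paths below are hypothetical:

```json
{
  "mcpServers": {
    "acme-users": {
      "command": "python",
      "args": ["/path/to/users_server.py"]
    },
    "acme-billing": {
      "command": "python",
      "args": ["/path/to/billing_server.py"]
    }
  }
}
```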
At Stainless, we've experimented with all three approaches and found that a combination works best for most APIs.
Testing your local MCP server
Once your server is running, you can test it using an MCP client like Claude Desktop. Here's how:
Start a conversation with Claude
Ask a question that would require your tool
Watch for Claude to request permission to use your tool
Check if the tool returns the expected results
If something goes wrong, check:
Server logs for errors
Configuration file for correct paths
Tool schemas for validation issues
The most common issues are path configuration problems and schema validation errors.
Best practices for local MCP servers
From our experience building MCP servers at Stainless, we've learned several best practices:
Clear descriptions: Write detailed descriptions for each tool
Simple schemas: Keep input/output schemas as simple as possible
Error handling: Return helpful error messages when tools fail
Logging: Add logging to track tool usage and debug issues
Security: Validate inputs and limit access to sensitive operations
Remember that the AI model will use your tool based on its description and schema, so clarity is essential for correct usage.
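As a sketch, here is the earlier `get_weather` tool reworked to apply several of these practices; the validation rule and log format are illustrative choices, not requirements:

```python
import logging

from mcp.server.fastmcp import FastMCP

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("weather-server")

mcp = FastMCP("weather")

@mcp.tool()
def get_weather(city: str) -> str:
    """Return a short weather description for a city.

    A clear description and a simple schema: the model sees one
    required string parameter.
    """
    # Error handling: fail with a message the model can act on.
    if not city or not city.strip():
        raise ValueError("city must be a non-empty string, e.g. 'Berlin'")
    # Logging: track tool usage for debugging.
    logger.info("get_weather called with city=%r", city)
    # Placeholder response; a real server would call a weather API here.
    return f"It is sunny and 22°C in {city.strip()}."
```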
Building on your local MCP server
After building a basic local MCP server, you might want to:
Add more tools to expand functionality
Connect to external APIs for more capabilities
Share your server with others via GitHub
Convert existing OpenAPI specs to MCP tools
At Stainless, we're working on tools to automatically generate MCP servers from OpenAPI specifications, making it easier to expose existing APIs as tools for LLMs.
FAQs about local MCP servers
How do local MCP servers communicate with AI models?
Local MCP servers typically use standard input/output (stdio) to communicate with the host process. The host process acts as a bridge between the AI model and the server, passing requests and responses back and forth. This direct communication method keeps all data on your local machine.
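Concretely, each message is a single JSON-RPC object written over stdin/stdout. Here is a simplified sketch of a tool call request, reusing the hypothetical `get_weather` tool from earlier (the envelope fields follow the MCP specification):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "city": "Paris" }
  }
}
```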
Can I convert my existing API to an MCP server?
Yes, existing APIs with OpenAPI specifications can be converted to MCP servers. The process involves transforming endpoint definitions into tool schemas, resolving references, and setting up the server to handle tool requests. Several open-source libraries and tools like Stainless can help automate this conversion.
What are the security considerations for local MCP servers?
Local MCP servers run with the same permissions as the user who launched them, so they can access any files or services that user can access. Consider implementing additional validation, input sanitization, and access controls to prevent unintended actions, especially when the server exposes system operations or sensitive data.