Remote MCP server
At Stainless, we generate client SDKs in various languages for our customers' APIs. Recently, those same customers started asking us for something new: a Model Context Protocol (MCP) server that wraps their API and exposes it as a set of tools for LLMs.
This article explains what a remote MCP server is, how it works, and how to build one that can scale with your needs.
What is a remote MCP server?
Model Context Protocol (MCP) is a specification that defines how AI models can interact with external tools. Each tool is described by a JSON schema that states what parameters it accepts and what it returns, and a server exposes its tools to clients over a transport such as HTTP.
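For example, a minimal tool definition pairs a name and description with a JSON schema for its input. The tool below is illustrative, but the field names match what an MCP server returns from a tools/list request:

```json
{
  "name": "get_order",
  "description": "Look up an order by its ID",
  "inputSchema": {
    "type": "object",
    "properties": {
      "order_id": {
        "type": "string",
        "description": "The order to fetch"
      }
    },
    "required": ["order_id"]
  }
}
```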
A remote MCP server hosts one or more MCP-compatible tools over the internet. Unlike local MCP servers that run on a user's machine, remote servers are deployed to accessible URLs. This setup allows tools to be shared across multiple users and applications.
MCP architecture includes three main components:
The host: The application where the AI model runs (such as Claude Desktop or ChatGPT)
The client: The runtime that manages connections between hosts and servers
The server: The component that exposes tools the model can use
Companies like GitHub, Cloudflare, and Zapier have built remote MCP servers to make their services available to AI models. These servers centralize tool access and make it easier to apply authentication and monitoring in one place.
Why scalability matters for MCP servers
Remote MCP servers receive tool requests from AI clients, often in parallel. Unlike traditional APIs with predictable patterns, AI agents might call multiple tools in sequence or trigger batch operations unexpectedly.
When your MCP server handles complex operations—like file processing or multi-step workflows—each request uses more CPU, memory, and network resources. If multiple agents call the same tool at once, it can cause delays or timeouts.
For example, a customer support AI might need to check order status, look up shipping details, and update customer records all within seconds. Each of these operations requires a separate tool call, and each call needs to complete quickly to maintain a smooth conversation.
Adding more tools to your server increases its memory footprint. If an API has 200 endpoints exposed through MCP, the server needs to manage all these definitions unless they're dynamically filtered.
Key steps to deploy a remote MCP server
1. Set up your container environment
Containers package your MCP server with everything it needs to run consistently. Using Docker makes deployment simpler:
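A minimal Dockerfile for a Node.js-based server might look like this (the entry point and port are placeholders for your own setup):

```dockerfile
# Minimal image for a Node.js-based MCP server
FROM node:20-slim
WORKDIR /app

# Install production dependencies first to take advantage of layer caching
COPY package*.json ./
RUN npm ci --omit=dev

COPY . .

# The server is assumed to listen on port 8787
EXPOSE 8787
CMD ["node", "server.js"]
```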
Build and run the container locally to test:
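```bash
# Build the image (the tag name is arbitrary)
docker build -t my-mcp-server .

# Run it, mapping the container port to your machine
docker run -p 8787:8787 my-mcp-server
```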
This exposes your MCP server on port 8787, ready for local testing before deployment.
2. Configure OAuth and secrets
Remote MCP servers typically need authentication. OAuth 2.0 lets clients request access tokens to interact with tools securely.
Register your application with an identity provider like GitHub, then store the client ID and secret securely:
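For local development, environment variables work; on a platform like Cloudflare Workers, encrypted secrets are the equivalent. The variable names below are illustrative:

```bash
# Local development: keep credentials in the environment, never in code
export GITHUB_CLIENT_ID="your-client-id"
export GITHUB_CLIENT_SECRET="your-client-secret"

# On Cloudflare Workers, store the same values as encrypted secrets
npx wrangler secret put GITHUB_CLIENT_SECRET
```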
Never commit these values to version control. Instead, use environment variables or a secret management service.
3. Deploy to a hosting provider
Popular options for hosting remote MCP servers include:
| Provider | Pros | Cons |
|---|---|---|
| Cloudflare Workers | Low latency, auto-scaling | Limited runtime |
| AWS Lambda | Custom runtimes, integration with AWS services | Cold starts |
| Traditional VMs | Full control | Manual scaling |
Cloudflare offers templates specifically for MCP servers, making it one of the easiest options to get started with.
4. Test connectivity
After deployment, test your server using an MCP-compatible client. The MCP Inspector tool works well for this:
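You can launch the Inspector with npx and point it at your deployed URL:

```bash
npx @modelcontextprotocol/inspector
```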
Open the Inspector in your browser and connect to your server URL. If you can list tools and execute them, your server is working correctly.
Managing large OpenAPI specs
1. Resolve references for MCP compatibility
OpenAPI specs often use $ref to point to reusable components. However, MCP tool schemas must be self-contained with no external references.
Here's how a reference gets resolved, using a simplified, hypothetical Customer schema:
Original schema:
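```json
{
  "$ref": "#/components/schemas/Customer"
}
```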
Resolved schema:
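```json
{
  "type": "object",
  "properties": {
    "id": { "type": "string" },
    "email": { "type": "string" }
  }
}
```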
This process, called inlining, makes the schema compatible with MCP clients but can increase its size significantly.
2. Split endpoints by context
Large APIs often have hundreds of endpoints. Grouping them into logical sets makes them easier to manage as MCP tools:
Resource-based groups: customers, orders, products
Function-based groups: reporting, billing, authentication
Role-based groups: adminTools, userTools, supportTools
Each group can be exposed as a separate MCP tool or set of tools. This approach simplifies tool selection and reduces the size of each schema.
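As a sketch, a server can tag each tool with a group and filter at startup. The Tool shape and group names below are assumptions, not part of the MCP spec:

```typescript
// Sketch: group-based tool filtering. The Tool shape is an assumption.
interface Tool {
  name: string;
  group: string; // e.g. "customers", "orders", "products"
  inputSchema: object;
}

// Expose only the tools in the groups a deployment has enabled,
// keeping each server's tool list (and schema payload) small.
function selectTools(allTools: Tool[], enabledGroups: string[]): Tool[] {
  return allTools.filter((tool) => enabledGroups.includes(tool.group));
}
```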
3. Handle parameter collisions
When combining parameters from different parts of an API endpoint (path, query, headers, body), name conflicts can occur. For example, both the path and query might use a parameter called id.
To avoid confusion:
Prefix parameters based on their source (path_id, query_id)
Use fully qualified names for nested objects
Apply consistent naming conventions across all tools
This clarity helps both the AI model and developers understand what each parameter does.
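A minimal sketch of source-based prefixing, with the parameter shape as an assumption:

```typescript
// Sketch: flatten parameters from each source into one namespace,
// prefixing by source so path `id` and query `id` can't collide.
type Source = "path" | "query" | "header" | "body";

function flattenParams(
  params: Record<Source, Record<string, unknown>>
): Record<string, unknown> {
  const flat: Record<string, unknown> = {};
  for (const [source, values] of Object.entries(params)) {
    for (const [name, value] of Object.entries(values)) {
      flat[`${source}_${name}`] = value; // e.g. path_id, query_id
    }
  }
  return flat;
}
```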
Popular remote MCP server integrations
Many companies now offer remote MCP servers that expose their services as tools for AI models:
Anthropic and OpenAI
Both Anthropic and OpenAI support the MCP protocol, allowing their models (Claude and GPT) to interact with remote servers. Anthropic uses the MCP connector to establish these connections.
GitHub and Cloudflare
GitHub's remote MCP server lets AI clients access repository data like issues, pull requests, and code files. It supports both OAuth and personal access tokens for authentication.
Cloudflare provides templates and infrastructure for hosting your own MCP servers. Their Workers platform makes it easy to deploy and scale servers without managing traditional infrastructure.
E-commerce and communication tools
Several platforms offer specialized tools through MCP:
Stripe: Payment processing, subscription management
Shopify: Product catalog, order management
Twilio: Messaging, voice calls, programmable workflows
Square: Point-of-sale, inventory tracking
Automation and support platforms
Zapier: Connects to thousands of other services
Plaid: Financial account access and analysis
Intercom: Customer conversations and support tickets
Workato: Enterprise workflow automation
Each integration requires proper authentication and schema definition to work correctly with AI models.
Scaling your MCP server
As your MCP server usage grows, you'll need strategies to handle increased load:
Monitor performance metrics: Track response times, request volume, and error rates to identify bottlenecks.
Implement caching: Cache frequently used tool results to reduce processing time and API calls (see the sketch after this list).
Use horizontal scaling: Run multiple instances of your server behind a load balancer to distribute traffic.
Apply rate limiting: Prevent abuse by limiting how often clients can call your tools.
Optimize schemas: Keep tool definitions concise and well-structured to reduce parsing overhead.
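As an example of the caching strategy above, a minimal in-memory cache for idempotent tool calls might look like this (the TTL and key scheme are assumptions):

```typescript
// Sketch: cache results of idempotent tool calls for a short window.
const cache = new Map<string, { value: unknown; expires: number }>();
const TTL_MS = 30_000; // assumed 30-second freshness window

async function cachedCall(
  toolName: string,
  args: object,
  run: () => Promise<unknown>
): Promise<unknown> {
  const key = `${toolName}:${JSON.stringify(args)}`;
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) return hit.value;

  const value = await run();
  cache.set(key, { value, expires: Date.now() + TTL_MS });
  return value;
}
```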
The quality of the API layer beneath your MCP server directly affects how well it performs. At Stainless, we generate high-quality SDKs and MCP servers from OpenAPI specifications, which helps keep tool schemas consistent with the API they wrap.
Ready to build your own remote MCP server? Get started with Stainless to ensure your implementation meets enterprise standards.
FAQs about remote MCP servers
What's the difference between local and remote MCP servers?
A remote MCP server runs on separate infrastructure accessible via HTTP, while a local MCP server runs directly within the client application environment. Remote servers are better for sharing tools across multiple users and applications.
How do I handle authentication for my remote MCP server?
Most production MCP servers implement OAuth 2.0 for authentication, which allows secure delegation of access while protecting both server resources and client credentials.
Can I convert my existing API to a remote MCP server?
Yes, existing APIs can be wrapped as MCP servers by creating JSON schema definitions for each endpoint and implementing the required MCP protocol endpoints.
How many endpoints can a single MCP server support?
There's no hard technical limit, but practical considerations around context window size and performance typically limit servers to dozens or hundreds of well-organized endpoints.
Which cloud providers work best for hosting remote MCP servers?
Cloudflare, AWS, Azure, and Google Cloud all provide suitable hosting options. Cloudflare offers specific MCP server templates, while the others support container-based or serverless deployments.