MCP TypeScript SDK for Building Clients and Servers

The MCP TypeScript SDK simplifies building MCP clients and servers with full protocol support, automatic type generation, and seamless client-server integration.

This guide covers the architectural patterns, security considerations, and operational strategies you need to deploy MCP servers at enterprise scale. You'll learn how to structure TypeScript implementations, manage authentication flows, handle large API surfaces, and integrate MCP server deployment into your existing CI/CD workflows.

An MCP server, often built with a TypeScript SDK, acts as a universal adapter between your API and large language models. You can generate an MCP server from an OpenAPI spec to expose your enterprise API to AI agents with a single configuration and build process, letting you focus on core logic instead of boilerplate.

Why MCP matters for enterprise API teams

The Model Context Protocol, or MCP, provides a standardized way for applications like AI agents to discover and interact with your API's capabilities. Think of it like the Language Server Protocol (LSP) for APIs; instead of every editor implementing its own support for every language, LSP provides a single interface. Similarly, MCP allows any AI agent to connect to any API that exposes an MCP server, eliminating the need for bespoke, one-off integrations for every model and platform.

This approach solves a major headache for engineering teams. Without a standard, you're stuck building and maintaining a fragmented mess of plugins, custom ReAct chains, and vendor-specific API wrappers. MCP replaces that integration sprawl with a single, stable, and discoverable layer that you control.

Solve AI integration sprawl

Instead of building a separate plugin for ChatGPT, a custom tool for an internal agent, and another integration for Claude, you build one MCP server. This single implementation serves all current and future MCP-compatible clients, drastically reducing maintenance overhead and ensuring a consistent experience.

Apply LSP lessons to APIs

The Language Server Protocol succeeded because it allowed tools to be designed at the right altitude, separating the language's logic from the editor's UI. MCP applies the same principle to AI. It separates your API's core business logic from the AI agent's reasoning, allowing each to evolve independently while communicating through a stable, well-defined protocol.

Architect MCP servers with TypeScript

At its core, an MCP server built with TypeScript consists of a few key components. It needs a transport layer to communicate, a set of tool schemas to define its capabilities, and the logic to handle incoming requests. Creating an MCP server from an OpenAPI spec automates the tedious parts, like type generation, and gives you a clean structure to add custom business logic.

Choose transport layer

Your server needs a way to talk to the client. MCP supports two primary transport mechanisms for this.

  • Stdio: Standard input/output is perfect for local development. When you run the MCP server as a child process from a desktop client like Claude Desktop or Cursor, stdio provides a fast and simple communication channel.

  • Streamable HTTP: For production and remote scenarios, you'll use HTTP. This allows web-based clients like claude.ai to securely connect to your server over the internet, typically involving an OAuth flow for authentication.
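
For the stdio option above, wiring a server to its transport takes only a few lines. Here is a minimal sketch using the official @modelcontextprotocol/sdk package; the server name and version are placeholders, and exact import paths may shift between SDK versions:

// server.ts — minimal transport wiring sketch (names are placeholders)
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

const server = new McpServer({ name: "acme-api", version: "0.1.0" });

// Stdio: the desktop client (Claude Desktop, Cursor) spawns this process and
// exchanges JSON-RPC messages with it over stdin/stdout.
await server.connect(new StdioServerTransport());

// For remote deployments, you would instead connect the same server instance to the
// SDK's Streamable HTTP transport behind an OAuth-protected HTTP endpoint.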

Define tool schemas

Each function you expose to an LLM is a tool, and each tool needs a schema defining its inputs. These are typically defined using JSON Schema. A robust generator can create these schemas directly from your OpenAPI endpoint definitions, mapping path parameters, query parameters, and request bodies into a single, coherent input object for the LLM.
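
For instance, a hypothetical PATCH /users/{user_id} endpoint with a notify query parameter and a JSON body could be flattened into one input schema roughly like this (an illustration, not actual generated output):

// Hypothetical flattened input schema for PATCH /users/{user_id}?notify=...
const updateUserInputSchema = {
  type: "object",
  properties: {
    user_id: { type: "string", description: "Path parameter: the user to update" },
    notify: { type: "boolean", description: "Query parameter: send a notification email" },
    email: { type: "string", description: "Body field: new email address" },
    name: { type: "string", description: "Body field: new display name" },
  },
  required: ["user_id"],
} as const;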

Share code between SDK and server

A common pattern is to manage your TypeScript SDK and MCP server within a single monorepo. The generated MCP server can live in a sub-directory like packages/mcp-server. This structure makes it easy to share code, such as type definitions and helper functions, between your main SDK and the MCP server, keeping the two consistent and reducing duplication, while any custom code you add persists across regenerations.
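
A hypothetical layout might look like the following; directory names other than packages/mcp-server are illustrative:

# Monorepo layout (illustrative)
acme-node/
  src/                  # generated TypeScript SDK
  packages/
    mcp-server/         # generated MCP server, reuses the SDK's types and client
  examples/             # hand-written code that survives regeneration
  stainless.yml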

Expose endpoints with confidence

Exposing your entire API surface to an LLM isn't always the best strategy. It can overwhelm the model's context window and introduce security risks. A more deliberate approach involves selectively exposing endpoints and grouping them logically.

You can start by setting enable_all_resources: false in your configuration and then explicitly opting in specific resources or endpoints with mcp: true. This gives you fine-grained control over what the AI agent can see and do.
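
A rough sketch of how this opt-in pattern can look in your config, reusing only the keys mentioned above (exact key placement depends on your Stainless project configuration):

# stainless.yml (sketch; key placement may differ in your project)
enable_all_resources: false
resources:
  invoices:
    mcp: true        # expose every invoicing endpoint as a tool
  admin:
    methods:
      list_users:
        mcp: true    # expose a single endpoint, leave the rest hidden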

Group endpoints by risk

A powerful pattern is to tag your endpoints to create logical groupings that users can filter. This allows for a least-privilege approach where a user might only enable read-only or low-risk tools.

You can create these groups using filters based on:

  • Operation: Allow users to enable only read (GET/LIST) or write (POST/PUT/DELETE) operations.

  • Tag: Define custom tags in your config (e.g., invoicing, admin) that users can selectively enable with a --tag flag.

  • Scope: In a remote server, you can dynamically filter the tool list based on the user's authenticated OAuth scopes.

Override tool metadata

The default names and descriptions generated from your OpenAPI spec might not be ideal for an LLM. You can override the tool_name and description for any endpoint to provide clearer, more concise instructions, which significantly improves the model's ability to choose the right tool for a given task.

# stainless.yml
resources:
  payments:
    methods:
      create:
        mcp:
          tool_name: initialize_payment
          description: |
            Starts a new payment transaction. Does not complete it.

Secure MCP servers with enterprise auth

For local development, passing an API key via an environment variable is simple and effective. For production, especially with remote servers accessed by web clients, you must use a more secure and flexible authentication method like OAuth 2.0.

Implement OAuth flow

A remote MCP server needs to handle an OAuth redirect flow to authenticate users. A generated Cloudflare Worker template can provide this out-of-the-box, including a consent screen to collect the necessary API keys or tokens from the user. This worker can be customized to integrate with your existing identity providers like Google or GitHub, or to implement a fully custom authentication flow.
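
The generated worker handles the full flow, but the essential auth gate can be pictured with a minimal sketch like this; handleMcpRequest and the API_KEY_STORE binding are hypothetical placeholders, not the generated code:

// worker.ts — sketch of an auth gate in a Cloudflare Worker fetch handler
export default {
  async fetch(
    request: Request,
    env: { API_KEY_STORE: { get(key: string): Promise<string | null> } }
  ): Promise<Response> {
    const auth = request.headers.get("Authorization") ?? "";
    const token = auth.startsWith("Bearer ") ? auth.slice("Bearer ".length) : null;
    if (!token) {
      // No token yet: send the client into the OAuth redirect/consent flow.
      return Response.redirect(new URL("/authorize", request.url).toString(), 302);
    }

    // Look up the API credentials collected on the consent screen (placeholder storage).
    const apiKey = await env.API_KEY_STORE.get(token);
    if (!apiKey) return new Response("Unauthorized", { status: 401 });

    // Hand the request off to the MCP server with the user's credentials.
    return handleMcpRequest(request, { apiKey });
  },
};

// Placeholder for the generated server's HTTP entry point.
declare function handleMcpRequest(req: Request, ctx: { apiKey: string }): Promise<Response>;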

Enforce role based access

Security isn't just about authentication; it's also about authorization. Once a user is authenticated, your MCP server should enforce access controls.

  • Static filtering: Users can use command-line flags like --resource or --tag to limit the tools loaded into their local client.

  • Dynamic scopes: In a remote server, you can inspect the user's OAuth token and its associated scopes during the tools/list request.

  • Per-user tool lists: Based on the user's identity, you can return a different set of available tools, effectively enforcing role-based access control (RBAC).

  • Invocation logging: Every tools/call invocation should be logged with the user's identity for auditing and security monitoring.
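
As a rough illustration of dynamic scopes and per-user tool lists, a remote server's tools/list handler can filter the generated tool set against the scopes on the caller's token. The requiredScope annotation below is an assumption made for the sketch, not part of the protocol:

// Illustrative only: filter the advertised tools by the caller's OAuth scopes.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: object;
  requiredScope?: string; // e.g. "invoices:read" or "invoices:write" (hypothetical annotation)
}

function listToolsForUser(allTools: ToolDefinition[], userScopes: Set<string>): ToolDefinition[] {
  return allTools.filter((tool) => !tool.requiredScope || userScopes.has(tool.requiredScope));
}

// A read-only token only ever sees read tools.
const visible = listToolsForUser(
  [
    { name: "list_invoices", description: "List invoices", inputSchema: {}, requiredScope: "invoices:read" },
    { name: "void_invoice", description: "Void an invoice", inputSchema: {}, requiredScope: "invoices:write" },
  ],
  new Set(["invoices:read"])
); // -> only list_invoices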

Operate large APIs at scale

LLMs have a finite context window. If your API has hundreds of endpoints, exposing them all as individual tools will quickly exhaust this limit, leading to poor performance or outright failure. You need strategies to manage this complexity.

Enable dynamic tools mode

Instead of exposing hundreds of tools, you can expose just three powerful meta-tools that allow the LLM to discover endpoints at runtime. This dynamic tools mode is enabled with a simple --tools=dynamic flag.

  1. list_api_endpoints: Lets the model search for relevant endpoints (e.g., "what can I do with users?").

  2. get_api_endpoint_schema: Lets the model retrieve the specific input schema for a chosen endpoint.

  3. invoke_api_endpoint: Lets the model execute the endpoint with the required parameters.

This indirect approach keeps the initial context small and allows the LLM to explore your entire API surface on demand.
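
Conceptually, the three meta-tools sit in front of an endpoint catalog. The stripped-down sketch below shows the idea; the catalog structure and callEndpoint helper are assumptions, not the generated implementation:

// Conceptual sketch of dynamic tools mode (catalog and callEndpoint are hypothetical).
interface EndpointEntry {
  name: string;        // e.g. "users.create"
  summary: string;     // short description the model can search over
  inputSchema: object; // JSON Schema for the flattened input object
}

const catalog: EndpointEntry[] = [
  { name: "users.create", summary: "Create a new user", inputSchema: { type: "object" } },
  { name: "users.list", summary: "List users with optional filters", inputSchema: { type: "object" } },
];

// list_api_endpoints: keyword search over names and summaries.
function listApiEndpoints(query: string): Pick<EndpointEntry, "name" | "summary">[] {
  const q = query.toLowerCase();
  return catalog
    .filter((e) => e.name.includes(q) || e.summary.toLowerCase().includes(q))
    .map(({ name, summary }) => ({ name, summary }));
}

// get_api_endpoint_schema: return the full input schema for a chosen endpoint.
function getApiEndpointSchema(name: string): object | undefined {
  return catalog.find((e) => e.name === name)?.inputSchema;
}

// invoke_api_endpoint: execute the endpoint with the model-supplied arguments.
async function invokeApiEndpoint(name: string, args: unknown): Promise<unknown> {
  if (!catalog.some((e) => e.name === name)) throw new Error(`Unknown endpoint: ${name}`);
  return callEndpoint(name, args); // placeholder for the real SDK call
}

declare function callEndpoint(name: string, args: unknown): Promise<unknown>;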

Adapt schemas to client limits

Different MCP clients and the models behind them have varying levels of support for complex JSON Schema features. For instance, some clients struggle with $ref pointers, union types (anyOf), or have strict limits on tool name length, as detailed in what we learned converting complex OpenAPI specs to MCP servers.

To ensure broad compatibility, your server can adapt its schemas on the fly. By specifying the target client with a --client=cursor or --client=claude flag, the server can automatically apply necessary transformations, such as inlining references or simplifying unions, to match the client's known capabilities.
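
As a simplified illustration of the kind of transformation involved, the sketch below inlines local $ref pointers from a definitions map; real client adapters also simplify unions, cap tool name lengths, and guard against cyclic references:

// Simplified sketch: inline local "#/definitions/..." references into a JSON Schema.
type Schema = { [key: string]: unknown };

function inlineRefs(schema: Schema, definitions: Record<string, Schema>): Schema {
  const resolve = (node: unknown): unknown => {
    if (Array.isArray(node)) return node.map(resolve);
    if (node && typeof node === "object") {
      const obj = node as Schema;
      const ref = obj["$ref"];
      if (typeof ref === "string" && ref.startsWith("#/definitions/")) {
        const name = ref.slice("#/definitions/".length);
        // Replace the reference with a copy of its definition (assumes no cyclic refs).
        return resolve(definitions[name] ?? {});
      }
      return Object.fromEntries(Object.entries(obj).map(([k, v]) => [k, resolve(v)]));
    }
    return node;
  };
  return resolve(schema) as Schema;
}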

Deploy and integrate at scale

A robust deployment and versioning strategy is critical for maintaining your MCP server alongside your core API and SDKs. The goal is to create a seamless, automated workflow from code generation to production.

Package server as Docker image

For maximum portability, you can configure your project to automatically build and publish a Docker image for your MCP server. This containerized version can be easily deployed to any environment, from a local developer machine to a cloud-based container orchestration platform. The image is versioned and tagged in lock-step with your npm package.

Run serverless on Cloudflare Workers

For remote MCP servers requiring OAuth, deploying to a serverless platform like Cloudflare Workers is an excellent choice. This provides a scalable, low-latency solution with zero cold starts. A generated worker template can handle the entire OAuth flow and serve the MCP tools, giving you a production-ready remote server with minimal setup.

Version SDK and server together

When you make a change to your OpenAPI spec, it should trigger a single, unified release process for both your TypeScript SDK and your MCP server. A CI-driven workflow creates one release pull request that bumps the version for both packages based on conventional commit messages. This ensures that your SDK and MCP server always stay in sync and are published together with a shared changelog.

Ready to generate an MCP server for your API? Get started for free.

Frequently asked questions about enterprise MCP servers

How do I limit tool access per environment?

You can manage environment-specific configurations by maintaining separate branches of your stainless.yml config and OpenAPI spec. At runtime, you can then use --tag or --resource filters to expose only the tools appropriate for that environment, such as disabling destructive write operations in production.

Can I route traffic through my API gateway?

Yes, you can deploy your remote MCP server behind an existing API gateway. The gateway can handle initial traffic, and you can configure it to forward authentication headers or tokens to the MCP server, which then uses them to make authenticated calls to your backend API.

What performance overhead should I expect?

The overhead is generally minimal. The main costs are the JSON-RPC message framing and the initial tools/list call to fetch schemas. For subsequent tools/call requests, the latency is dominated by the execution time of your underlying API endpoint.

How do I support multiple API versions?

You can namespace your tool names (e.g., v1_create_user, v2_create_user) within a single MCP server. Alternatively, for major version changes, you can generate and publish entirely separate MCP server packages for each API version.

Should I build or generate my MCP server?

While you can build an MCP server from scratch, generating it from your OpenAPI spec saves significant development time and reduces maintenance. The Stainless SDK generator handles the boilerplate of schema transformation, transport layers, and client capability adaptation, letting you focus on the unique logic of your API.
