Engineering

From API to MCP: a practical guide for developers

AI agents already interact with your API, whether it’s clicking buttons on a web app, making HTTP requests, or running SDK code—but these are all interfaces for humans, not machines. That’s why more and more API providers are integrating with the Model Context Protocol (MCP), which gives LLMs a method to discover, reason about, and interact with applications.

MCP servers expose tools, with each tool having a title, description, and input schema. AI agents use MCP clients to interpret these tools and craft requests. The server converts these requests into API calls, and their results are returned to the client. Adopting MCP is a pragmatic way to make your API more AI-ready, as it builds on your existing infrastructure.
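For concreteness, here is what a single tool definition might look like. This is an illustrative sketch (a hypothetical `get_user` tool), but the three pieces of data match what the protocol carries:

```typescript
// A minimal MCP tool definition: a name, a short description,
// and a JSON Schema describing the tool's input.
const getUserTool = {
  name: "get_user",
  description: "Retrieve a user by ID.",
  inputSchema: {
    type: "object",
    properties: {
      user_id: { type: "string", description: "Unique user identifier" },
    },
    required: ["user_id"],
  },
};

// The server maps a call to this tool onto an API request,
// e.g. GET /users/{user_id}, and returns the result to the client.
```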

A straightforward way to make an MCP server from your API is to expose each endpoint as its own tool. But this naïve conversion is often suboptimal, and with some thoughtful deliberation and a little work, it’s possible to get much better results. At Stainless, we’ve gained a lot of experience making APIs not only MCP-compatible, but LLM-friendly, and we’ve distilled our approach into some guidelines.

Context windows

One major consideration in designing a large MCP server, one that doesn’t come up with a typical RESTful API, is the context window.

When an LLM connects to your MCP server, it sends a tools/list request to get the available tools and store the result in the context window. If your server only has three tools, each with a sentence-length description and an input schema with five keys, then great!
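The tools/list exchange is plain JSON-RPC, and the response is roughly shaped like this (an illustrative sketch, not any particular server’s output):

```typescript
// Illustrative shape of a tools/list response. Every byte of this
// payload ends up in the model's context window.
const toolsListResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    tools: [
      {
        name: "list_accounts",
        description: "List all accounts.",
        inputSchema: { type: "object", properties: {} },
      },
      // ...one entry per tool the server exposes
    ],
  },
};
```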

But if your API has hundreds of endpoints, each with its own tool, and each tool with its own name, description, and input schema—then your tool list can overflow the LLM context. Even if you only provide a dozen tools, if each of those tools has a large input schema, your tool list can exceed the window size. In our experience, LLMs perform better when the context window is small and focused.

These constraints taught us several things about tool design:

  1. Strategize endpoint exposure. When an API has a wide range of resources, it’s often helpful to select which ones an MCP client is exposed to. Too many tools can be confusing for an LLM.
  2. Condense tool descriptions. Every token used for a tool description is one less token available for AI reasoning. Verbose descriptions, while helpful, consume valuable context space.
  3. Simplify input schemas. The simpler your parameter structure is, the higher the chances are that an AI agent correctly invokes your tool.

These lessons are also good API design principles, especially when you consider their opposites. Having too many endpoints suggests that your resources might do too many things. An endpoint without a succinct description might be unfocused. A complicated input schema could signify bad defaults. A well-designed API serves as an important foundation for a useful MCP server.

Tool architecture

A user expects an SDK to support your entire API, with all its endpoints and every parameter, down to the most obscure options. MCP server generation, however, requires more nuance when deciding the architecture. You could:

  1. Provide all the endpoints, as you would with an SDK.
  2. Select subsets of endpoints for specific use-cases.
  3. Create composite tools and workflows that wrap existing endpoints.

Each of these options can be viable, depending on the situation.

Path 1: Provide all the endpoints (direct approach)

The simplest option for structuring your MCP server is to expose each endpoint as its own tool.

Consider doing this when:

  • Your OpenAPI spec has clean, robust schemas that an LLM can easily interpret.
  • Your API operations are particularly simple, uniform (perhaps CRUD-like), or independent.
  • You want to quickly validate MCP’s value before investing in customization.

If you have a Stainless project, you can generate an MCP server this way by adding these options to your Stainless config:

  
targets:
  typescript:
    options:
      mcp_server:
        package_name: my-org-mcp # this is the default
        enable_all_resources: true
  
  

Generating your TypeScript SDK will create a subpackage at packages/mcp-server. You publish this as a separate NPM package (here, my-org-mcp), which users can then run with npx -y my-org-mcp.

Even though it’s simple, the “generate everything” approach is sometimes the correct way to expose an API. Even if it isn’t, shipping any MCP server lets you learn from users which endpoints AI models use effectively and which they struggle with. You can also collect this data programmatically through end-to-end testing; the randomized nature of the results can be offset by borrowing techniques from LLM evaluations. Either way, you quickly gain insights for future optimization without committing resources upfront.

Path 2: Select subsets of endpoints (targeted approach)

Some API endpoints don’t make sense as an MCP tool. In these cases, you might want to exclude some endpoints from being exposed as tools.

Consider doing this when:

  • Your API deals with particularly sensitive domains, like payments or personal information.
  • You have some endpoints that do irreversible and potentially destructive operations.
  • Your operations vary greatly in their complexity or style.
  • You have endpoints that are in beta or are unstable.

With a Stainless-generated MCP server, you can exclude specific resources or endpoints from generation:

  
targets:
  typescript:
    options:
      mcp_server:
        enable_all_resources: false
resources:
  accounts:
    mcp: true # all methods
  payments:
    methods:
      list:
        mcp: true
      create:
        ... # excluded by default
  
  

A further step is to select specific subsets of endpoints for different purposes, particularly if you have a large API. Use cases include:

  • grouping endpoints used together for a specific workflow, especially if your product caters to several kinds of users;
  • grouping endpoints that require particular permissions, like admin-only endpoints;
  • or separating endpoints that have side-effects from ones that are idempotent.

End-users can find it useful to be able to select custom groups of endpoints for themselves. Many local MCP servers provide customization via environment variables or command-line arguments. Stainless-generated MCP servers allow end-users to filter these groups by resource or operation:

  
npx your-api-mcp --resource accounts --operation read
  
  

Additionally, you can configure custom tags for your endpoints to make this filtering easier for end-users.

Path 3: Create composite tools (indirect approach)

A different path would be to create composite tools that abstract over several endpoints.

Consider doing this when:

  • Your API has a large number of endpoints, and you want your server to support all of them.
  • Your operations involve calling multiple APIs in a specific sequence.
  • You have endpoints with large, complex schemas that are unnecessary for most cases.

An example of composite tools is the set of dynamic tools that Stainless generates for MCP servers. These higher-level tools, list_api_endpoints, get_api_endpoint_schema, and invoke_api_endpoint, allow an LLM to search for, look up, and call endpoints on demand. This works around the limitations of the context window by using a level of indirection (as the fundamental theorem of software engineering teaches us).
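The idea behind such dynamic tools can be sketched as thin wrappers over an endpoint registry. The names and shapes below are hypothetical, not Stainless’s actual implementation:

```typescript
// A hypothetical registry of endpoints backing three dynamic tools.
type Endpoint = { name: string; description: string; schema: object };

const registry: Endpoint[] = [
  { name: "accounts.list", description: "List accounts", schema: { type: "object" } },
  { name: "payments.create", description: "Create a payment", schema: { type: "object" } },
];

// list_api_endpoints: search by keyword instead of loading every
// tool schema into the context window upfront.
function listApiEndpoints(query: string): string[] {
  return registry
    .filter(
      (e) =>
        e.name.includes(query) ||
        e.description.toLowerCase().includes(query.toLowerCase()),
    )
    .map((e) => e.name);
}

// get_api_endpoint_schema: fetch a single schema on demand.
function getApiEndpointSchema(name: string): object | undefined {
  return registry.find((e) => e.name === name)?.schema;
}

// invoke_api_endpoint would then dispatch to the SDK method for `name`.
```

The context window only ever holds the three dynamic tools plus whatever the model explicitly looks up, regardless of how many endpoints the API has.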

If you have an SDK with helper methods that don’t correspond to a specific endpoint, then it might be helpful to convert helper methods into composite tools. In general, if you have a Stainless-generated SDK that’s been patched with custom code, you may want to adapt your patches to your MCP server as well, by using more custom code.

You also don’t have to create all your tools from scratch. An SDK and a well-architected MCP server can make it straightforward to add custom tools. For example, this is what adding a custom tool to a Stainless-generated MCP server looks like:

TypeScript
  
import { Tool } from "@modelcontextprotocol/sdk/types.js";
import Client from "your-api";
import { server, endpoints, init } from "your-api-mcp/server";
import { z } from "zod/v4";

const createAndCompleteTodo = {
  tool: {
    name: "create_and_complete_todo",
    description: "Create a todo item and immediately mark it as completed.",
    inputSchema: z.toJSONSchema(z.object({ title: z.string() })),
  } satisfies Tool,
  handler: async (client: Client, args: any) => {
    const todo = await client.todos.create(args);
    await client.todos.complete(todo.id);
    return { content: "ok" };
  },
};

init({ server, endpoints: [...endpoints, createAndCompleteTodo] });
  
  

The right approach often evolves over time—starting simple and iterating based on actual usage patterns typically yields better results than attempting to perfect your implementation from the outset.

Tool design

The right tool structure means nothing if the individual tools are poorly designed.

We’ve found that even small tweaks to tool names and descriptions can drastically affect how an LLM chooses between available tools. This is important enough that Stainless provides a configuration option for overriding tool names and descriptions:

  
resources:
  payments:
    methods:
      create:
        mcp:
          tool_name: initialize_payment
          description: |
            Start a payment transaction. Once a transaction is started, it then needs to
            be finalized or cancelled.
  
  

The other major piece of data that a tool carries is its input schema. In the past, we’ve touched on the many technicalities of writing schemas for MCP servers, like handling refs or recursive schemas, or dealing with top-level unions. The gist is that simpler schemas work better. If your APIs have complicated input schemas, when converting them to tools, consider simplifying them:

  1. Flatten nested fields, by adding a layer that converts between flat and nested structures.
  2. Update field descriptions to be concise and self-contained.
  3. When possible, reduce the number of parameters, preferring strong defaults.
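The first step can be implemented as a small translation layer in the tool handler. As a sketch, with hypothetical field names:

```typescript
// Flat arguments, as an LLM would supply them against a flat schema.
type FlatArgs = { status?: string; date_from?: string; date_to?: string };

// Rebuild the nested structure the underlying API actually expects.
function toApiPayload(args: FlatArgs) {
  return {
    filter: {
      status: args.status,
      date: { from: args.date_from, to: args.date_to },
    },
  };
}
```

The agent only ever sees three flat, self-describing parameters; the nesting stays an implementation detail of the server.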

The Stainless generator doesn’t yet support subsetting schemas for MCP generation, but you can always use custom code to simplify individual tools as needed.

While not part of a tool’s data, the size and format of a tool’s output also matters greatly. In our experience, the best tools are ones that return narrow and focused outputs. List endpoints are particularly susceptible to returning huge amounts of output, and their results can fill the prompt context with irrelevant data. Rather than returning full resources in a list endpoint, it can be worth returning minimal resources that clients can retrieve to get more information.
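One way to do this is to project list results down to a few identifying fields, and let clients fetch full resources individually. A sketch, with hypothetical fields:

```typescript
type Todo = { id: string; title: string; body: string; metadata: object };

// Return only what's needed to identify each item; the agent can call
// a retrieval tool with an id if it needs the full resource.
function toListSummary(todos: Todo[]) {
  return todos.map(({ id, title }) => ({ id, title }));
}
```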

Testing and iteration

No amount of theory can replace testing with actual AI models. For debugging specific tool calls, and checking that your MCP server follows the spec, the MCP inspector serves as an invaluable tool. But to test how LLMs actually perform against your server, the best testing method is to run end-to-end experiments with actual MCP clients.

We recommend testing against MCP clients you expect users to actually use. It’s possible that Claude Desktop does a great job discovering and calling your tools while OpenAI agents perform poorly. Clients differ not only in their underlying LLM, but also in which parts of the MCP specification they support.

Live testing reveals insights that aren't obvious from inspecting a list of tools:

  • Does a certain parameter consistently receive incorrect values? Try making the description of that parameter in the input schema more specific.
  • Do AI agents repeatedly misunderstand the functionality of a tool? Try changing its name or description, or even consider not exposing it as a tool.
  • Do interactions feel “slow” because clients often call the same set of tools in a sequence? Try making a composite tool that calls several API endpoints in sequence.
  • Or do interactions feel “slow” because the outputs are too large? Try preprocessing the results to remove irrelevant data before returning it from the MCP server.

Iteration is the key to success. Each cycle brings your MCP server closer to an interface that feels natural and intuitive for AI interaction.

Posted by
CJ Quines
Software Engineer
David Ackerman
Software Engineer