As more AI models interact with real-world tools and services, the Model Context Protocol (MCP) has emerged as a way to connect large language models (LLMs) to APIs. MCP standardizes how LLMs access external tools, enabling structured tool calls through a consistent interface.
Many APIs require authentication to protect user data and restrict access. This makes authentication a critical part of building a secure MCP server. Whether the server accesses a public API, a private database, or user-specific cloud storage, it needs to authenticate on behalf of the LLM.
What is MCP server authentication?
The Model Context Protocol (MCP) is a JSON-RPC-based protocol that allows large language models to call tools hosted on external servers. These tools might wrap APIs, access files, or perform computations.
MCP server authentication is the process of verifying the identity of the user or agent making a request through an MCP client. It enables the MCP server to securely access third-party services on behalf of a user or session.
Authentication is separate from authorization. Authentication proves who the requester is. Authorization determines what actions they can perform. In the MCP lifecycle, authentication typically happens when the client connects to the server or when the server accesses upstream APIs.
Key components involved in MCP server authentication include:
MCP server: A program that hosts tools and exposes them to LLM-based clients via MCP
Authentication: The process of verifying identity, often using OAuth or API keys
Authorization: The process of determining permitted actions, typically scoped by access tokens
In real-world scenarios, authentication enables AI agents to perform tasks like reading emails, updating calendars, or querying databases securely. For example, an LLM might use an MCP server to access a user's GitHub account to create issues or pull requests. Authentication ensures that only authorized users can trigger those actions through tool calls.
Why OAuth 2.1 matters for MCP
OAuth 2.1 is the authorization standard used by MCP servers. It defines how clients prove their identity and obtain permission to access protected resources.
This standard was selected for MCP because it's widely adopted, familiar to developers, and includes updates that remove deprecated or less secure features from earlier versions.
In the context of MCP, OAuth 2.1 is used when an LLM client connects to a server that wraps a third-party API. The MCP server initiates the OAuth flow, obtains tokens from the authorization server, and uses those tokens to access the upstream API on behalf of the user.
Here's a simple example of an OAuth 2.1 flow in an MCP server:
```javascript
// Example: MCP server using an OAuth token to access an external API
const token = await getAccessTokenFromAuthServer();
const response = await fetch("https://api.example.com/user", {
  headers: { Authorization: `Bearer ${token}` },
});
```
Core OAuth 2.1 flows
The most common flow used in MCP is the authorization code flow. This is designed for applications that can securely store secrets or communicate with a backend.
In this flow:
The client redirects the user to the authorization server
The user logs in and grants permission
The server returns an authorization code to the client
The client exchanges this code for an access token
The access token is then used to authenticate future requests.
Authentication is token-based. Once the client has a valid token, it includes it in the Authorization header of the HTTP request.
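The code exchange step can be sketched as follows. This is a minimal illustration; the client_id and token endpoint URL are placeholders, not values from any real provider.

```javascript
// Sketch: building the token-exchange request body for the authorization code flow.
// The client_id value and the endpoint URL in the comment below are illustrative.
function buildTokenRequest(code, codeVerifier, redirectUri) {
  return new URLSearchParams({
    grant_type: "authorization_code",
    code: code,
    code_verifier: codeVerifier,
    redirect_uri: redirectUri,
    client_id: "example-mcp-client",
  });
}

// The client POSTs this body to the authorization server's token endpoint:
// await fetch("https://auth.example.com/token", {
//   method: "POST",
//   body: buildTokenRequest(code, verifier, "http://localhost:8080/callback"),
// });
```

The response to this request contains the access token (and often a refresh token) that the client then sends in the Authorization header.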
PKCE for public clients
PKCE (Proof Key for Code Exchange) originated as an extension to OAuth 2.0 and is mandatory in OAuth 2.1. It matters most for public clients—clients that can't securely store secrets, such as desktop applications or browser-based tools.
PKCE prevents attackers from using stolen authorization codes: the client generates a random code verifier, sends a hashed code challenge with the authorization request, and proves possession of the original verifier during token exchange.
Benefits of PKCE:
Enhanced security: Binds each authorization request to a one-time code verifier
Protection against attacks: Blocks interception of authorization codes
Compatibility: Works with public and confidential clients
Dynamic client registration
Dynamic client registration is defined in RFC 7591. It allows clients to register with an authorization server programmatically.
This mechanism helps MCP clients onboard with minimal manual setup. Instead of registering each client in advance, the client sends a registration request and receives credentials in response.
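A registration request is a JSON document describing the client. The sketch below shows the general shape of an RFC 7591 payload; the field values and the endpoint URL in the comment are illustrative, not tied to any specific provider.

```javascript
// Sketch of an RFC 7591 dynamic registration payload; values are illustrative.
const registrationRequest = {
  client_name: "example-mcp-client",
  redirect_uris: ["http://localhost:8080/callback"],
  grant_types: ["authorization_code"],
  token_endpoint_auth_method: "none", // public client relying on PKCE
};

// The client POSTs this to the authorization server's registration endpoint
// and receives a client_id (and, for confidential clients, a client_secret):
// await fetch("https://auth.example.com/register", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(registrationRequest),
// });
```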
Using API keys or secrets in MCP
MCP servers can authenticate to third-party APIs using either API keys or OAuth 2.1. Both methods allow the server to make authorized requests, but they differ in how credentials are issued, stored, and used.
API keys are static credentials. These are typically issued by the upstream API and embedded in the server's environment or configuration. OAuth 2.1 uses dynamically scoped tokens issued through a flow that involves user interaction or delegated authorization.
In MCP, API keys can be used when the upstream API doesn't support OAuth or when the authentication requirements are service-level only. OAuth 2.1 is used when user-level authorization is required or when access scopes need to be tightly controlled.
| Authentication Method | Best For | Security Level | Implementation Complexity |
|---|---|---|---|
| API Keys | Service-to-service APIs | Low to Medium | Low |
| OAuth 2.1 | User-authorized access via tokens | High | Medium to High |
Here's an example of using an API key in an MCP tool implementation:
```javascript
// Using an API key to access an external weather API
const API_KEY = process.env.WEATHER_API_KEY;

async function getForecast(city) {
  // Encode the city name so spaces and special characters are URL-safe
  const response = await fetch(
    `https://api.weather.com/v1/forecast?city=${encodeURIComponent(city)}`,
    { headers: { Authorization: `Bearer ${API_KEY}` } }
  );
  return await response.json();
}
```
This approach works for simple integrations, especially when no user authentication is needed and the key doesn't expire frequently.
Embedded vs external authorization servers
MCP servers use OAuth 2.1 to authenticate clients and issue access tokens. There are two ways to implement this: the embedded approach and the external (delegated) approach.
In an embedded approach, the MCP server itself includes the logic and infrastructure to act as the authorization server. In a delegated approach, the MCP server relies on an existing external authorization service to handle OAuth responsibilities.
Embedded approach
In the embedded model, the MCP server acts as the authorization server. It manages user login, client registration, token issuance, and validation internally.
This means the server is responsible for implementing all required OAuth 2.1 endpoints, including support for PKCE, dynamic client registration, and metadata discovery.
Pros and cons of the embedded approach:
Pros: Full control over authentication, no external dependencies, easier to test locally
Cons: High implementation complexity, requires secure token storage, increases maintenance scope
Delegated approach
In the delegated model, the MCP server acts only as an OAuth resource server. It offloads authentication and authorization to a third-party service such as an identity provider.
Popular external providers include Auth0, Stytch, and WorkOS.
This setup separates concerns. The external provider owns identity and token lifecycle. The MCP server focuses on tool access and authorization enforcement.
Implementation considerations include:
Configuring token validation
Mapping token claims to internal permissions
Supporting OAuth 2.1 metadata endpoints via proxy
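The claim-mapping step can be sketched as a simple lookup from token scopes to permitted tools. The scope names, tool names, and claim shape below are illustrative assumptions, not part of any provider's API:

```javascript
// Sketch: mapping token scopes to the tools a session may use in a
// delegated setup. Scope names and tool names are illustrative.
function allowedTools(claims, scopeToTools) {
  const scopes = (claims.scope || "").split(" ");
  return scopes.flatMap((scope) => scopeToTools[scope] || []);
}
```

For example, `allowedTools({ scope: "calendar email" }, { calendar: ["create_event"], email: ["send_email"] })` yields `["create_event", "send_email"]`.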
Comparing complexity and security
| Factor | Embedded Approach | External Approach |
|---|---|---|
| Setup Complexity | High — requires implementing all OAuth logic | Medium — relies on standard OAuth configuration |
| Maintenance | High — full ownership of token and login flows | Low — provider handles updates and compliance |
| Security | Variable — depends on implementation quality | High — leverages mature infrastructure |
| Scalability | Medium — constrained by server capabilities | High — benefits from provider infrastructure |
Handling large APIs and dynamic endpoints
Some APIs include hundreds of endpoints. Converting all of them into MCP tools at once introduces performance and usability problems. MCP servers currently load all tool schemas into a shared context, and the client relies on the LLM to choose which tool to use.
LLMs have context limits, which means they can only consider a certain number of tokens at once. If too many tools are loaded, the LLM may ignore some or fail to select the right one.
Partial endpoint loading
To work around these issues, MCP servers can load only a subset of tools based on the task or user input. Instead of registering every tool at startup, the server selectively exposes tools as needed.
Common strategies include:
Context-based loading: The server loads tools based on the current conversation
User intent filtering: The server infers user intent and loads matching tools
Dynamic discovery: The client explicitly requests tool definitions before use
These approaches reduce memory use and align with LLM context constraints.
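A keyword-based filter is one simple way to implement user intent filtering. This is a sketch under the assumption that each tool definition carries a list of keywords; the tool shape and matching rule are illustrative, not part of the MCP specification:

```javascript
// Sketch: exposing only the tools whose keywords match the current message,
// instead of registering every tool at startup. Tool shape is illustrative.
function selectTools(message, allTools, limit = 10) {
  const text = message.toLowerCase();
  return allTools
    .filter((tool) => tool.keywords.some((keyword) => text.includes(keyword)))
    .slice(0, limit);
}
```

A production server might instead rank tools with embeddings or let the client request tool definitions on demand, but the effect is the same: fewer schemas in the LLM's context.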
Reference resolution at scale
Large OpenAPI specifications often include many $ref references. These are links to reusable schema components defined elsewhere in the spec. MCP requires self-contained tool schemas, so these references must be resolved before tool registration.
Resolving $ref at scale involves inlining all referenced schema definitions. To avoid circular references or excessive schema expansion, the process includes detecting cycles and deduplicating identical definitions.
Here's a simplified example of resolving a reference:
```json
// Original schema with $ref
{
  "type": "object",
  "properties": {
    "user": { "$ref": "#/components/schemas/User" }
  }
}

// After reference resolution
{
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "id": { "type": "string" },
        "email": { "type": "string", "format": "email" }
      },
      "required": ["id", "email"]
    }
  }
}
```
This expanded structure is what the MCP server uses to register the tool.
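The inlining itself can be sketched as a recursive walk with cycle detection. This is a simplified version for local `#/components/schemas/...` references only; real resolvers also handle external documents and deduplicate repeated definitions:

```javascript
// Simplified $ref inliner with cycle detection. Handles only local
// component references; the fallback for cycles is illustrative.
function resolveRefs(node, components, seen = new Set()) {
  if (Array.isArray(node)) {
    return node.map((item) => resolveRefs(item, components, seen));
  }
  if (node && typeof node === "object") {
    if (typeof node.$ref === "string") {
      const name = node.$ref.split("/").pop();
      if (seen.has(name)) return { type: "object" }; // break circular references
      return resolveRefs(components[name], components, new Set(seen).add(name));
    }
    const resolved = {};
    for (const [key, value] of Object.entries(node)) {
      resolved[key] = resolveRefs(value, components, seen);
    }
    return resolved;
  }
  return node;
}
```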
LLM-based clients and MCP
Large language models interact with MCP servers by acting as clients. These clients connect to the server and send structured requests to call tools. The MCP server processes the request and sends back a response that the LLM can use in its output.
Authentication is required when an MCP server wraps protected resources, such as user accounts or third-party APIs. The LLM client must prove its identity or the identity of the user it represents.
Token issuance for AI agents
AI agents acting as MCP clients receive access tokens by completing an OAuth 2.1 flow. This flow may involve redirecting the user to log in with a third-party service and grant permission.
After the user consents, the AI agent receives an access token that it includes in the Authorization header when making requests to the MCP server.
Security considerations for AI agents:
AI agents may run in environments that can't securely store long-lived secrets
Tokens might be cached in memory or logs, creating risk if not handled properly
Public clients must use PKCE to prevent token interception
Selecting tools dynamically
Once authenticated, an LLM client can access a list of available tools from the MCP server. The list of tools may vary depending on the access token. The server filters tools based on the scopes or permissions encoded in the token.
When the LLM receives a user prompt, it decides whether to call a tool. If tools are dynamically selected, the client checks which tools are currently available and allowed for the session.
For example, a token might include permission for the "calendar" and "email" scopes. The MCP server will expose tools related to scheduling and email handling. Tools outside those scopes, such as "file_upload", won't appear in the client's tool list.
Looking ahead to streamlined integration
MCP authentication continues to evolve as the protocol becomes more widely adopted. The current specification defines the MCP server as both a resource server and, optionally, an authorization server.
Future updates to the MCP specification may offer clearer separation of responsibilities. There are active discussions about allowing MCP servers to fully delegate authentication to external providers, rather than implementing authorization server features directly.
At Stainless, we generate MCP servers with built-in support for OAuth 2.1, PKCE, and bearer token validation. Servers created through our platform expose properly scoped authorization metadata and include support for external authorization servers when needed.
This allows developers to align with current spec requirements without manually implementing OAuth endpoints.
FAQs about MCP server authentication
How do I fix server authentication failed errors in MCP?
Server authentication failures in MCP usually occur when the OAuth configuration is incorrect or when access tokens are no longer valid. Check your client credentials, verify redirect URIs, and ensure your authorization server is properly configured with valid endpoints.
How do I implement MCP server authentication for my API?
To implement MCP server authentication, select either an embedded or external authorization server model, configure OAuth 2.1 flows with PKCE support, and expose the required endpoints such as /authorize, /token, and optionally /register for dynamic client registration.
Can I use existing authentication providers with MCP servers?
Yes, an MCP server can rely on external authorization servers like Auth0, Stytch, or WorkOS. These providers handle token issuance and validation while the MCP server enforces access control based on the token's scopes or claims.
As more AI models interact with real-world tools and services, the Model Context Protocol (MCP) has emerged as a way to connect large language models (LLMs) to APIs. MCP standardizes how LLMs access external tools, enabling structured tool calls through a consistent interface.
Many APIs require authentication to protect user data and restrict access. This makes authentication a critical part of building a secure MCP server. Whether the server accesses a public API, a private database, or user-specific cloud storage, it needs to authenticate on behalf of the LLM.
What is MCP server authentication?
The Model Context Protocol (MCP) is a JSON-RPC-based protocol that allows large language models to call tools hosted on external servers. These tools might wrap APIs, access files, or perform computations.
MCP server authentication is the process of verifying the identity of the user or agent making a request through an MCP client. It enables the MCP server to securely access third-party services on behalf of a user or session.
Authentication is separate from authorization. Authentication proves who the requester is. Authorization determines what actions they can perform. In the MCP lifecycle, authentication typically happens when the client connects to the server or when the server accesses upstream APIs.
Key components involved in MCP server authentication include:
MCP server: A program that hosts tools and exposes them to LLM-based clients via MCP
Authentication: The process of verifying identity, often using OAuth or API keys
Authorization: The process of determining permitted actions, typically scoped by access tokens
In real-world scenarios, authentication enables AI agents to perform tasks like reading emails, updating calendars, or querying databases securely. For example, an LLM might use an MCP server to access a user's GitHub account to create issues or pull requests. Authentication ensures that only authorized users can trigger those actions through tool calls.
Why OAuth 2.1 matters for MCP
OAuth 2.1 is the authentication standard used by MCP servers. It defines how clients prove their identity and obtain permission to access protected resources.
This standard was selected for MCP because it's widely adopted, familiar to developers, and includes updates that remove deprecated or less secure features from earlier versions.
In the context of MCP, OAuth 2.1 is used when an LLM client connects to a server that wraps a third-party API. The MCP server initiates the OAuth flow, obtains tokens from the authorization server, and uses those tokens to access the upstream API on behalf of the user.
Here's a simple example of an OAuth 2.1 flow in an MCP server:
// Example: MCP server using OAuth token to access external API const token = await getAccessTokenFromAuthServer(); const response = await fetch("https://api.example.com/user", { headers: { Authorization: `Bearer ${token}` } });
Core OAuth 2.1 flows
The most common flow used in MCP is the authorization code flow. This is designed for applications that can securely store secrets or communicate with a backend.
In this flow:
The client redirects the user to the authorization server
The user logs in and grants permission
The server returns an authorization code to the client
The client exchanges this code for an access token
The access token is then used to authenticate future requests.
Authentication is token-based. Once the client has a valid token, it includes it in the Authorization
header of the HTTP request.
PKCE for public clients
PKCE (Proof Key for Code Exchange) is a security extension to OAuth 2.1. It's required for public clients—clients that can't securely store secrets, such as desktop applications or browser-based tools.
PKCE prevents attackers from stealing authorization codes by requiring the client to generate a secret code challenge and verify it later during token exchange.
Benefits of PKCE:
Enhanced security: Attaches a one-time verifier to each request
Protection against attacks: Blocks interception of authorization codes
Compatibility: Works with public and confidential clients
Dynamic client registration
Dynamic client registration is defined in RFC 7591. It allows clients to register with an authorization server programmatically.
This mechanism helps MCP clients onboard with minimal manual setup. Instead of pre-registering each client in advance, the client sends a registration request and receives credentials in response.
Using API keys or secrets in MCP
MCP servers can authenticate to third-party APIs using either API keys or OAuth 2.1. Both methods allow the server to make authorized requests, but they differ in how credentials are issued, stored, and used.
API keys are static credentials. These are typically issued by the upstream API and embedded in the server's environment or configuration. OAuth 2.1 uses dynamically scoped tokens issued through a flow that involves user interaction or delegated authorization.
In MCP, API keys can be used when the upstream API doesn't support OAuth or when the authentication requirements are service-level only. OAuth 2.1 is used when user-level authorization is required or when access scopes need to be tightly controlled.
Authentication Method | Best For | Security Level | Implementation Complexity |
---|---|---|---|
API Keys | Service-to-service APIs | Low to Medium | Low |
OAuth 2.1 | User-authorized access via tokens | High | Medium to High |
Here's an example of using an API key in an MCP tool implementation:
// Using an API key to access an external weather API const API_KEY = process.env.WEATHER_API_KEY; async function getForecast(city) { const response = await fetch( `https://api.weather.com/v1/forecast?city=${city}`, { headers: { 'Authorization': `Bearer ${API_KEY}` } } ); return await response.json(); }
This approach works for simple integrations, especially when no user authentication is needed and the key doesn't expire frequently.
Embedded vs external authorization servers
MCP servers use OAuth 2.1 to authenticate clients and issue access tokens. There are two ways to implement this: the embedded approach and the external (delegated) approach.
In an embedded approach, the MCP server itself includes the logic and infrastructure to act as the authorization server. In a delegated approach, the MCP server relies on an existing external authorization service to handle OAuth responsibilities.
Embedded approach
In the embedded model, the MCP server acts as the authorization server. It manages user login, client registration, token issuance, and validation internally.
This means the server is responsible for implementing all required OAuth 2.1 endpoints, including support for PKCE, dynamic client registration, and metadata discovery.
Pros and cons of the embedded approach:
Pros: Full control over authentication, no external dependencies, easier to test locally
Cons: High implementation complexity, requires secure token storage, increases maintenance scope
Delegated approach
In the delegated model, the MCP server acts only as an OAuth resource server. It offloads authentication and authorization to a third-party service such as an identity provider.
Popular external providers include Auth0, Stytch, and WorkOS.
This setup separates concerns. The external provider owns identity and token lifecycle. The MCP server focuses on tool access and authorization enforcement.
Implementation considerations include:
Configuring token validation
Mapping token claims to internal permissions
Supporting OAuth 2.1 metadata endpoints via proxy
Comparing complexity and security
Factor | Embedded Approach | External Approach |
---|---|---|
Setup Complexity | High — requires implementing all OAuth logic | Medium — relies on standard OAuth configuration |
Maintenance | High — full ownership of token and login flows | Low — provider handles updates and compliance |
Security | Variable — depends on implementation quality | High — leverages mature infrastructure |
Scalability | Medium — constrained by server capabilities | High — benefits from provider infrastructure |
Handling large APIs and dynamic endpoints
Some APIs include hundreds of endpoints. Converting all of them into MCP tools at once introduces performance and usability problems. MCP servers currently load all tool schemas into a shared context, and the client relies on the LLM to choose which tool to use.
LLMs have context limits, which means they can only consider a certain number of tokens at once. If too many tools are loaded, the LLM may ignore some or fail to select the right one.
Partial endpoint loading
To work around these issues, MCP servers can load only a subset of tools based on the task or user input. Instead of registering every tool at startup, the server selectively exposes tools as needed.
Common strategies include:
Context-based loading: The server loads tools based on the current conversation
User intent filtering: The server infers user intent and loads matching tools
Dynamic discovery: The client explicitly requests tool definitions before use
These approaches reduce memory use and align with LLM context constraints.
Reference resolution at scale
Large OpenAPI specifications often include many $ref
references. These are links to reusable schema components defined elsewhere in the spec. MCP requires self-contained tool schemas, so these references must be resolved before tool registration.
Resolving $ref
at scale involves inlining all referenced schema definitions. To avoid circular references or excessive schema expansion, the process includes detecting cycles and deduplicating identical definitions.
Here's a simplified example of resolving a reference:
// Original schema with $ref { "type": "object", "properties": { "user": { "$ref": "#/components/schemas/User" } } } // After reference resolution { "type": "object", "properties": { "user": { "type": "object", "properties": { "id": { "type": "string" }, "email": { "type": "string", "format": "email" } }, "required": ["id", "email"] } } }
This expanded structure is what the MCP server uses to register the tool.
LLM-based clients and MCP
Large language models interact with MCP servers by acting as clients. These clients connect to the server and send structured requests to call tools. The MCP server processes the request and sends back a response that the LLM can use in its output.
Authentication is required when an MCP server wraps protected resources, such as user accounts or third-party APIs. The LLM client must prove its identity or the identity of the user it represents.
Token issuance for AI agents
AI agents acting as MCP clients receive access tokens by completing an OAuth 2.1 flow. This flow may involve redirecting the user to log in with a third-party service and grant permission.
After the user consents, the AI agent receives an access token that it includes in the Authorization header when making requests to the MCP server.
Security considerations for AI agents:
AI agents may run in environments that can't securely store long-lived secrets
Tokens might be cached in memory or logs, creating risk if not handled properly
Public clients must use PKCE to prevent token interception
Selecting tools dynamically
Once authenticated, an LLM client can access a list of available tools from the MCP server. The list of tools may vary depending on the access token. The server filters tools based on the scopes or permissions encoded in the token.
When the LLM receives a user prompt, it decides whether to call a tool. If tools are dynamically selected, the client checks which tools are currently available and allowed for the session.
For example, a token might include permission for the "calendar" and "email" scopes. The MCP server will expose tools related to scheduling and email handling. Tools outside those scopes, such as "file_upload", won't appear in the client's tool list.
Looking ahead to streamlined integration
MCP authentication continues to evolve as the protocol becomes more widely adopted. The current specification defines the MCP server as both a resource server and, optionally, an authorization server.
Future updates to the MCP specification may offer clearer separation of responsibilities. There are active discussions about allowing MCP servers to fully delegate authentication to external providers, rather than implementing authorization server features directly.
At Stainless, we generate MCP servers with built-in support for OAuth 2.1, PKCE, and bearer token validation. Servers created through our platform expose properly scoped authorization metadata and include support for external authorization servers when needed.
This allows developers to align with current spec requirements without manually implementing OAuth endpoints.
FAQs about MCP server authentication
How do I fix server authentication failed errors in MCP?
Server authentication failures in MCP usually occur when the OAuth configuration is incorrect or when access tokens are no longer valid. Check your client credentials, verify redirect URIs, and ensure your authorization server is properly configured with valid endpoints.
How do I implement MCP server authentication for my API?
To implement MCP server authentication, select either an embedded or external authorization server model, configure OAuth 2.1 flows with PKCE support, and expose the required endpoints such as /authorize
, /token
, and optionally /register
for dynamic client registration.
Can I use existing authentication providers with MCP servers?
Yes, an MCP server can rely on external authorization servers like Auth0, Stytch, or WorkOS. These providers handle token issuance and validation while the MCP server enforces access control based on the token's scopes or claims.
As more AI models interact with real-world tools and services, the Model Context Protocol (MCP) has emerged as a way to connect large language models (LLMs) to APIs. MCP standardizes how LLMs access external tools, enabling structured tool calls through a consistent interface.
Many APIs require authentication to protect user data and restrict access. This makes authentication a critical part of building a secure MCP server. Whether the server accesses a public API, a private database, or user-specific cloud storage, it needs to authenticate on behalf of the LLM.
What is MCP server authentication?
The Model Context Protocol (MCP) is a JSON-RPC-based protocol that allows large language models to call tools hosted on external servers. These tools might wrap APIs, access files, or perform computations.
MCP server authentication is the process of verifying the identity of the user or agent making a request through an MCP client. It enables the MCP server to securely access third-party services on behalf of a user or session.
Authentication is separate from authorization. Authentication proves who the requester is. Authorization determines what actions they can perform. In the MCP lifecycle, authentication typically happens when the client connects to the server or when the server accesses upstream APIs.
Key components involved in MCP server authentication include:
MCP server: A program that hosts tools and exposes them to LLM-based clients via MCP
Authentication: The process of verifying identity, often using OAuth or API keys
Authorization: The process of determining permitted actions, typically scoped by access tokens
In real-world scenarios, authentication enables AI agents to perform tasks like reading emails, updating calendars, or querying databases securely. For example, an LLM might use an MCP server to access a user's GitHub account to create issues or pull requests. Authentication ensures that only authorized users can trigger those actions through tool calls.
Why OAuth 2.1 matters for MCP
OAuth 2.1 is the authentication standard used by MCP servers. It defines how clients prove their identity and obtain permission to access protected resources.
This standard was selected for MCP because it's widely adopted, familiar to developers, and includes updates that remove deprecated or less secure features from earlier versions.
In the context of MCP, OAuth 2.1 is used when an LLM client connects to a server that wraps a third-party API. The MCP server initiates the OAuth flow, obtains tokens from the authorization server, and uses those tokens to access the upstream API on behalf of the user.
Here's a simple example of an OAuth 2.1 flow in an MCP server:
// Example: MCP server using OAuth token to access external API const token = await getAccessTokenFromAuthServer(); const response = await fetch("https://api.example.com/user", { headers: { Authorization: `Bearer ${token}` } });
Core OAuth 2.1 flows
The most common flow used in MCP is the authorization code flow. This is designed for applications that can securely store secrets or communicate with a backend.
In this flow:
The client redirects the user to the authorization server
The user logs in and grants permission
The server returns an authorization code to the client
The client exchanges this code for an access token
The access token is then used to authenticate future requests.
Authentication is token-based. Once the client has a valid token, it includes it in the Authorization
header of the HTTP request.
PKCE for public clients
PKCE (Proof Key for Code Exchange) is a security extension to OAuth 2.1. It's required for public clients—clients that can't securely store secrets, such as desktop applications or browser-based tools.
PKCE prevents attackers from stealing authorization codes by requiring the client to generate a secret code challenge and verify it later during token exchange.
Benefits of PKCE:
Enhanced security: Attaches a one-time verifier to each request
Protection against attacks: Blocks interception of authorization codes
Compatibility: Works with public and confidential clients
Dynamic client registration
Dynamic client registration is defined in RFC 7591. It allows clients to register with an authorization server programmatically.
This mechanism helps MCP clients onboard with minimal manual setup. Instead of pre-registering each client in advance, the client sends a registration request and receives credentials in response.
Using API keys or secrets in MCP
MCP servers can authenticate to third-party APIs using either API keys or OAuth 2.1. Both methods allow the server to make authorized requests, but they differ in how credentials are issued, stored, and used.
API keys are static credentials. These are typically issued by the upstream API and embedded in the server's environment or configuration. OAuth 2.1 uses dynamically scoped tokens issued through a flow that involves user interaction or delegated authorization.
In MCP, API keys can be used when the upstream API doesn't support OAuth or when the authentication requirements are service-level only. OAuth 2.1 is used when user-level authorization is required or when access scopes need to be tightly controlled.
| Authentication Method | Best For | Security Level | Implementation Complexity |
|---|---|---|---|
| API Keys | Service-to-service APIs | Low to Medium | Low |
| OAuth 2.1 | User-authorized access via tokens | High | Medium to High |
Here's an example of using an API key in an MCP tool implementation:
```javascript
// Using an API key to access an external weather API
const API_KEY = process.env.WEATHER_API_KEY;

async function getForecast(city) {
  const response = await fetch(
    `https://api.weather.com/v1/forecast?city=${encodeURIComponent(city)}`,
    { headers: { Authorization: `Bearer ${API_KEY}` } }
  );
  return await response.json();
}
```
This approach works for simple integrations, especially when no user authentication is needed and the key doesn't expire frequently.
Embedded vs external authorization servers
MCP servers use OAuth 2.1 to authenticate clients and issue access tokens. There are two ways to implement this: the embedded approach and the external (delegated) approach.
In an embedded approach, the MCP server itself includes the logic and infrastructure to act as the authorization server. In a delegated approach, the MCP server relies on an existing external authorization service to handle OAuth responsibilities.
Embedded approach
In the embedded model, the MCP server acts as the authorization server. It manages user login, client registration, token issuance, and validation internally.
This means the server is responsible for implementing all required OAuth 2.1 endpoints, including support for PKCE, dynamic client registration, and metadata discovery.
Pros and cons of the embedded approach:
Pros: Full control over authentication, no external dependencies, easier to test locally
Cons: High implementation complexity, requires secure token storage, increases maintenance scope
Delegated approach
In the delegated model, the MCP server acts only as an OAuth resource server. It offloads authentication and authorization to a third-party service such as an identity provider.
Popular external providers include Auth0, Stytch, and WorkOS.
This setup separates concerns. The external provider owns identity and token lifecycle. The MCP server focuses on tool access and authorization enforcement.
Implementation considerations include:
Configuring token validation
Mapping token claims to internal permissions
Supporting OAuth 2.1 metadata endpoints via proxy
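For example, once the provider's signature check has passed (typically against its JWKS endpoint, using a library such as jose), the MCP server can validate the remaining claims locally. `validateClaims` and its parameters are illustrative:

```javascript
// Validate standard claims on an already-decoded, signature-verified
// access token. Returns a result object rather than throwing.
function validateClaims(claims, { issuer, audience, now = Date.now() / 1000 }) {
  if (claims.iss !== issuer) return { valid: false, reason: "wrong issuer" };
  if (claims.aud !== audience) return { valid: false, reason: "wrong audience" };
  if (typeof claims.exp !== "number" || claims.exp <= now) {
    return { valid: false, reason: "expired" };
  }
  return { valid: true };
}
```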
Comparing complexity and security
| Factor | Embedded Approach | External Approach |
|---|---|---|
| Setup Complexity | High — requires implementing all OAuth logic | Medium — relies on standard OAuth configuration |
| Maintenance | High — full ownership of token and login flows | Low — provider handles updates and compliance |
| Security | Variable — depends on implementation quality | High — leverages mature infrastructure |
| Scalability | Medium — constrained by server capabilities | High — benefits from provider infrastructure |
Handling large APIs and dynamic endpoints
Some APIs include hundreds of endpoints. Converting all of them into MCP tools at once introduces performance and usability problems. MCP servers currently load all tool schemas into a shared context, and the client relies on the LLM to choose which tool to use.
LLMs have context limits, which means they can only consider a certain number of tokens at once. If too many tools are loaded, the LLM may ignore some or fail to select the right one.
Partial endpoint loading
To work around these issues, MCP servers can load only a subset of tools based on the task or user input. Instead of registering every tool at startup, the server selectively exposes tools as needed.
Common strategies include:
Context-based loading: The server loads tools based on the current conversation
User intent filtering: The server infers user intent and loads matching tools
Dynamic discovery: The client explicitly requests tool definitions before use
These approaches reduce memory use and align with LLM context constraints.
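User intent filtering could be sketched as a simple keyword match. The tool catalog and heuristic below are illustrative; a production server might route with embeddings or an LLM-based classifier instead:

```javascript
// A small illustrative tool catalog with intent keywords per tool.
const TOOL_CATALOG = [
  { name: "get_forecast", keywords: ["weather", "forecast", "temperature"] },
  { name: "create_issue", keywords: ["github", "issue", "bug"] },
  { name: "send_email", keywords: ["email", "mail", "message"] },
];

// Return only the tools whose keywords appear in the user's prompt,
// so the LLM's context holds a small, relevant subset.
function selectTools(userPrompt, catalog = TOOL_CATALOG) {
  const text = userPrompt.toLowerCase();
  return catalog
    .filter((tool) => tool.keywords.some((k) => text.includes(k)))
    .map((tool) => tool.name);
}
```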
Reference resolution at scale
Large OpenAPI specifications often include many $ref references. These are links to reusable schema components defined elsewhere in the spec. MCP requires self-contained tool schemas, so these references must be resolved before tool registration.
Resolving $ref at scale involves inlining all referenced schema definitions. To avoid circular references or excessive schema expansion, the process includes detecting cycles and deduplicating identical definitions.
Here's a simplified example of resolving a reference:
```jsonc
// Original schema with $ref
{
  "type": "object",
  "properties": {
    "user": { "$ref": "#/components/schemas/User" }
  }
}

// After reference resolution
{
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "id": { "type": "string" },
        "email": { "type": "string", "format": "email" }
      },
      "required": ["id", "email"]
    }
  }
}
```
This expanded structure is what the MCP server uses to register the tool.
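A minimal resolver for local references, with cycle detection, might look like this sketch; `resolveRefs` is an illustrative helper that handles only `#/...` pointers, not external files:

```javascript
// Recursively inline local "#/..." $ref pointers against the root spec,
// tracking visited refs to detect circular references.
function resolveRefs(node, root, seen = new Set()) {
  if (Array.isArray(node)) {
    return node.map((item) => resolveRefs(item, root, seen));
  }
  if (node && typeof node === "object") {
    if (typeof node.$ref === "string" && node.$ref.startsWith("#/")) {
      if (seen.has(node.$ref)) {
        throw new Error(`Circular reference: ${node.$ref}`);
      }
      // Walk the JSON pointer path (e.g. components/schemas/User).
      const target = node.$ref
        .slice(2)
        .split("/")
        .reduce((obj, key) => obj[key], root);
      return resolveRefs(target, root, new Set([...seen, node.$ref]));
    }
    const out = {};
    for (const [key, value] of Object.entries(node)) {
      out[key] = resolveRefs(value, root, seen);
    }
    return out;
  }
  return node;
}
```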
LLM-based clients and MCP
Large language models interact with MCP servers by acting as clients. These clients connect to the server and send structured requests to call tools. The MCP server processes the request and sends back a response that the LLM can use in its output.
Authentication is required when an MCP server wraps protected resources, such as user accounts or third-party APIs. The LLM client must prove its identity or the identity of the user it represents.
Token issuance for AI agents
AI agents acting as MCP clients receive access tokens by completing an OAuth 2.1 flow. This flow may involve redirecting the user to log in with a third-party service and grant permission.
After the user consents, the AI agent receives an access token that it includes in the Authorization header when making requests to the MCP server.
Security considerations for AI agents:
AI agents may run in environments that can't securely store long-lived secrets
Tokens might be cached in memory or logs, creating risk if not handled properly
Public clients must use PKCE to prevent token interception
Selecting tools dynamically
Once authenticated, an LLM client can access a list of available tools from the MCP server. The list of tools may vary depending on the access token. The server filters tools based on the scopes or permissions encoded in the token.
When the LLM receives a user prompt, it decides whether to call a tool. If tools are dynamically selected, the client checks which tools are currently available and allowed for the session.
For example, a token might include permission for the "calendar" and "email" scopes. The MCP server will expose tools related to scheduling and email handling. Tools outside those scopes, such as "file_upload", won't appear in the client's tool list.
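That scope-based filtering can be sketched as follows; the tool names and scope labels mirror the example above but are otherwise illustrative:

```javascript
// Each tool declares the scope it requires; the server advertises only
// the tools whose scope appears in the caller's access token.
const TOOLS = [
  { name: "create_event", requiredScope: "calendar" },
  { name: "send_email", requiredScope: "email" },
  { name: "file_upload", requiredScope: "files" },
];

function visibleTools(tokenScopes, tools = TOOLS) {
  return tools
    .filter((tool) => tokenScopes.includes(tool.requiredScope))
    .map((tool) => tool.name);
}
```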
Looking ahead to streamlined integration
MCP authentication continues to evolve as the protocol becomes more widely adopted. The current specification defines the MCP server as both a resource server and, optionally, an authorization server.
Future updates to the MCP specification may offer clearer separation of responsibilities. There are active discussions about allowing MCP servers to fully delegate authentication to external providers, rather than implementing authorization server features directly.
At Stainless, we generate MCP servers with built-in support for OAuth 2.1, PKCE, and bearer token validation. Servers created through our platform expose properly scoped authorization metadata and include support for external authorization servers when needed.
This allows developers to align with current spec requirements without manually implementing OAuth endpoints.
FAQs about MCP server authentication
How do I fix server authentication failed errors in MCP?
Server authentication failures in MCP usually occur when the OAuth configuration is incorrect or when access tokens are no longer valid. Check your client credentials, verify redirect URIs, and ensure your authorization server is properly configured with valid endpoints.
How do I implement MCP server authentication for my API?
To implement MCP server authentication, select either an embedded or external authorization server model, configure OAuth 2.1 flows with PKCE support, and expose the required endpoints such as /authorize, /token, and optionally /register for dynamic client registration.
Can I use existing authentication providers with MCP servers?
Yes, an MCP server can rely on external authorization servers like Auth0, Stytch, or WorkOS. These providers handle token issuance and validation while the MCP server enforces access control based on the token's scopes or claims.