How Do REST APIs Work: The Developer Guide Explained

How does a REST API work? Understand how clients send HTTP requests to servers, explore the request lifecycle, core methods, and response handling.

REST APIs power most modern web applications, and understanding how they actually work, from DNS lookup to response parsing, helps you build better integrations and debug issues faster. This guide traces a complete API call lifecycle, covers the core HTTP methods and REST principles, and shows how to map your endpoints to MCP tools for AI agents.

You'll learn how to construct well-formed requests, interpret responses correctly, debug common problems systematically, and leverage tools like OpenAPI specs and SDK generators to improve your API development workflow.

Trace a REST API call from client to server

In a REST API, a client sends an HTTP request to a server to access or manipulate a resource. The server processes the request, performs an action such as retrieving data from a database, and sends back an HTTP response containing a status code and, often, a JSON payload. The exchange follows a stateless, client-server model: standard HTTP methods like GET, POST, and DELETE define the desired action, and a unique URL identifies the resource.

Let's trace a single API call from start to finish. When a user's action in an application triggers a request, it kicks off a multi-step journey across the internet and back.

First, the client's machine performs a DNS lookup to translate the server's domain name, like api.stainlessapi.com, into an IP address. With the address found, the client establishes a TCP connection with the server through a three-way handshake, creating a reliable channel for data. For security, a TLS handshake follows, encrypting all subsequent communication.

Now the client sends the actual HTTP request. In HTTP/1.1 this is a plain-text message (HTTP/2 and HTTP/3 carry the same information in binary frames) that specifies the method (e.g., GET), the path (/users/123), headers (like authentication tokens), and an optional body (for POST, PUT, or PATCH requests).

# A raw curl request to fetch a user
# A GET has no body, so it sends Accept (the format it wants back) rather than Content-Type
curl -X GET "https://api.example.com/v1/users/123" \
  -H "Authorization: Bearer sk_12345" \
  -H "Accept: application/json"

The server receives the request, routes it to the correct code based on the method and path, and executes its business logic. This might involve querying a database, calling other services, or preparing a file.
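The routing step can be sketched as a small lookup that matches the method and path against registered patterns. This is a minimal illustration, not how any particular framework implements routing; `routeTable`, `matchRoute`, the handler names, and the `:id` placeholder syntax are all assumptions made for the example.

```typescript
// Hypothetical route table: each entry maps a method and path pattern to a handler name.
type RouteMatch = { handler: string; params: Record<string, string> };

const routeTable: Array<{ method: string; pattern: string; handler: string }> = [
  { method: "GET", pattern: "/v1/users/:id", handler: "getUser" },
  { method: "DELETE", pattern: "/v1/users/:id", handler: "deleteUser" },
  { method: "POST", pattern: "/v1/users", handler: "createUser" },
];

// Returns the first route whose method and path segments match, or null.
function matchRoute(method: string, path: string): RouteMatch | null {
  const pathParts = path.split("/").filter(Boolean);
  for (const route of routeTable) {
    if (route.method !== method.toUpperCase()) continue;
    const patternParts = route.pattern.split("/").filter(Boolean);
    if (patternParts.length !== pathParts.length) continue;

    const params: Record<string, string> = {};
    const matched = patternParts.every((part, i) => {
      if (part.startsWith(":")) {
        // A ":name" segment captures whatever appears at that position.
        params[part.slice(1)] = decodeURIComponent(pathParts[i]);
        return true;
      }
      return part === pathParts[i];
    });
    if (matched) return { handler: route.handler, params };
  }
  return null;
}
```

Here `matchRoute("GET", "/v1/users/123")` selects the `getUser` handler with `params.id` set to `"123"`. When nothing matches, a server would typically answer 404 Not Found, or 405 Method Not Allowed if the path exists under a different method.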

Finally, the server crafts an HTTP response. It includes a status code (e.g., 200 OK), response headers (like Content-Type), and a body, typically in JSON format. This response travels back to the client, which then parses it to display the result or handle any errors.

A quality SDK abstracts away this complexity. The same request becomes a single, intuitive function call, handling authentication, headers, and response parsing automatically. The Stainless SDK generator produces libraries that handle these details elegantly.

import { MyApiClient } from '@myorg/my-api'; // from a Stainless-generated package

// Instantiating the Stainless-generated client
const client = new MyApiClient({
  authToken: 'sk_12345', // Often taken from env or a config object
});

// Using the SDK to fetch the same user
const user = await client.users.retrieve('123');

console.log('Retrieved user:', user);

Build a well-formed request

The client is responsible for constructing a request the server can understand. This involves choosing the right HTTP method, structuring the data correctly, and providing authentication. Each part of the request signals a specific intent to the server.

Use GET for reads

The GET method is used to retrieve data. It should be safe, meaning it doesn't change the server's state, and idempotent, meaning multiple identical requests have the same effect as one. You pass parameters for filtering or sorting in the URL's query string.

  • Pagination: To manage large datasets, use query parameters like limit and offset or a cursor to fetch data in chunks.

  • Filtering: To narrow results, use parameters like status=active or created_after=2024-01-01.
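A minimal sketch of assembling these query parameters, assuming a hypothetical list endpoint; the `ListParams` type and `buildListUrl` helper are illustrative names, and the parameter names simply mirror the examples above.

```typescript
// Hypothetical list parameters; a real API defines its own names.
type ListParams = {
  limit?: number;
  offset?: number;
  cursor?: string;
  status?: string;
  created_after?: string;
};

// Builds a GET URL, skipping parameters that were not provided.
// URLSearchParams takes care of percent-encoding the values.
function buildListUrl(baseUrl: string, params: ListParams): string {
  const query = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) query.set(key, String(value));
  }
  const qs = query.toString();
  return qs ? `${baseUrl}?${qs}` : baseUrl;
}
```

Calling `buildListUrl("https://api.example.com/v1/users", { status: "active", limit: 20, offset: 40 })` yields `https://api.example.com/v1/users?status=active&limit=20&offset=40`.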

Use POST for creates

Use POST to create a new resource. The data for the new resource is sent in the request body, typically as a JSON object. POST is not guaranteed to be idempotent; unless the API supports something like an idempotency key, making the same request twice will usually create two separate resources.

A successful POST request should return a 201 Created status code and a Location header pointing to the URL of the newly created resource.
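On the server side, the create flow might look like the following sketch. It uses an in-memory map instead of a database, and `handleCreateUser`, `UserRecord`, and the `/v1/users` path are assumptions for illustration only.

```typescript
type UserRecord = { id: string; name: string };
type CreateResult = {
  statusCode: number;
  headers: Record<string, string>;
  body: UserRecord;
};

// In-memory stand-in for a database; IDs are sequential to keep the example simple.
const users = new Map<string, UserRecord>();
let nextId = 1;

// Hypothetical handler for POST /v1/users.
function handleCreateUser(input: { name: string }): CreateResult {
  const user: UserRecord = { id: String(nextId++), name: input.name };
  users.set(user.id, user);
  return {
    statusCode: 201, // Created
    headers: { Location: `/v1/users/${user.id}` }, // where the new resource lives
    body: user,
  };
}
```

Calling the handler twice with the same input stores two users with different IDs, which is exactly the non-idempotent behavior described above.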

Use PUT for replacements

The PUT method is used to completely replace an existing resource. You must send the entire resource representation in the request body. If the resource doesn't exist, PUT can be used to create it, making it useful for upsert (update or insert) operations.

PUT is idempotent. Calling it multiple times with the same payload on the same resource will result in the same final state.

Use PATCH for partial updates

When you only need to change a few fields on a resource, use PATCH. This is more efficient than PUT because you only send the data that needs to change. PATCH is not always idempotent, as its effect can depend on the resource's current state.
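The difference between replacing and patching can be sketched with two small functions. This is a simplified model, assuming a merge-style PATCH; `Profile`, `applyPut`, and `applyPatch` are hypothetical names, and real APIs may use other patch formats.

```typescript
type Profile = { name: string; email?: string; plan?: string };

// PUT semantics: the stored resource becomes exactly what was sent,
// so any field missing from the replacement is dropped.
function applyPut(_current: Profile | undefined, replacement: Profile): Profile {
  return { ...replacement };
}

// PATCH semantics (simplified merge): only the supplied fields change.
function applyPatch(current: Profile, changes: Partial<Profile>): Profile {
  return { ...current, ...changes };
}
```

Given a stored profile with `name`, `email`, and `plan`, `applyPut(stored, { name: "Ada L." })` loses `email` and `plan`, while `applyPatch(stored, { plan: "pro" })` keeps everything else intact. A merge like this happens to be idempotent; a PATCH that expresses an operation such as "increment the counter" or "append to the list" is not.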

Use DELETE for removals

The DELETE method removes a specified resource. A successful deletion usually returns a 204 No Content status code with an empty body. Like GET and PUT, DELETE is idempotent: deleting the same resource multiple times leaves the server in the same state as deleting it once, even if a repeated call returns 404 Not Found because the resource is already gone.

Add authentication to every call

Servers need to know who is making a request. Common authentication schemes include sending an API key or a Bearer token in the Authorization header. Modern SDKs simplify this by letting you set the key once during client initialization, often reading it securely from an environment variable. Generated SDKs support adding custom code for specialized authentication flows or other business logic.

Attach data to the body correctly

When sending data with POST, PUT, or PATCH, you must specify its format using the Content-Type header. For structured data, application/json is the standard. For file uploads, multipart/form-data is used, which allows you to send binary data alongside other fields.
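Putting authentication and body formatting together, a request with a JSON payload could be assembled roughly like this. The `RequestSpec` type and `buildJsonRequest` helper are hypothetical; the token is passed in as an argument here, though in practice it would usually come from an environment variable or a secrets store.

```typescript
type RequestSpec = {
  method: "POST" | "PUT" | "PATCH";
  headers: Record<string, string>;
  body: string;
};

// Serializes the payload and sets the headers that describe and authorize it.
function buildJsonRequest(
  method: RequestSpec["method"],
  payload: unknown,
  token: string,
): RequestSpec {
  return {
    method,
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json", // tells the server how to parse the body
    },
    body: JSON.stringify(payload),
  };
}
```

The returned object has the same `method`, `headers`, and `body` fields that the standard `fetch` function accepts as its options argument. For multipart/form-data uploads with `FormData`, the usual approach is to leave Content-Type unset so the runtime can add the boundary parameter itself.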

Parse and interpret the response

Once the server processes the request, it sends back a response. A robust client application needs to correctly interpret every part of this response, from the status code to the headers and body, to handle both success and failure gracefully.

Read the status line first

The HTTP status code is the first thing you should check. It provides a quick summary of the outcome.

| Code Range | Meaning | Developer Action |
| --- | --- | --- |
| 2xx | Success | The request was successful. Parse the response body. |
| 4xx | Client Error | There's an issue with the request (e.g., bad data, missing auth). Fix and retry. |
| 5xx | Server Error | Something went wrong on the server. Wait and retry the request later. |
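The table above translates into a small branching function. This is a sketch, not a complete policy: the `NextStep` labels are made up for the example, and the special case for 429 Too Many Requests is an added assumption about how a client would usually want to treat rate limiting.

```typescript
type NextStep = "parse_body" | "fix_request" | "retry_later" | "other";

// Maps a status code to the developer action from the table.
function nextStepFor(statusCode: number): NextStep {
  if (statusCode >= 200 && statusCode < 300) return "parse_body";
  // 429 is a 4xx, but waiting (not editing the request) is what resolves it.
  if (statusCode === 429) return "retry_later";
  if (statusCode >= 400 && statusCode < 500) return "fix_request";
  if (statusCode >= 500 && statusCode < 600) return "retry_later";
  return "other"; // 1xx and 3xx are not covered by the table
}
```

For example, `nextStepFor(404)` returns `"fix_request"` and `nextStepFor(503)` returns `"retry_later"`.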

Inspect headers for metadata

Response headers contain important metadata that isn't part of the main payload.

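  • Rate limiting: Headers like X-RateLimit-Limit and X-RateLimit-Remaining tell you how many requests you can make in a given window. These names are a widespread convention rather than a standard, so check each API's documentation for the exact headers it sends.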

  • Pagination: Headers or a structured body can contain links or cursors to the next and previous pages of data.

  • Caching: ETag and Cache-Control headers help clients cache responses efficiently to reduce redundant requests.
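Reading rate-limit metadata from a response might look like the sketch below. It assumes the API sends the conventional `X-RateLimit-Limit` and `X-RateLimit-Remaining` headers and that headers are available as a plain object; `getHeader` and `readRateLimit` are hypothetical helpers.

```typescript
type HeaderMap = Record<string, string>;

// Header names are case-insensitive, so normalize before looking them up.
function getHeader(headers: HeaderMap, wanted: string): string | undefined {
  const target = wanted.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === target) return value;
  }
  return undefined;
}

type RateLimitInfo = { limit: number; remaining: number; exhausted: boolean };

// Returns null when the server does not send usable X-RateLimit-* headers.
function readRateLimit(headers: HeaderMap): RateLimitInfo | null {
  const limit = Number(getHeader(headers, "X-RateLimit-Limit"));
  const remaining = Number(getHeader(headers, "X-RateLimit-Remaining"));
  if (!Number.isFinite(limit) || !Number.isFinite(remaining)) return null;
  return { limit, remaining, exhausted: remaining <= 0 };
}
```

A client can check `exhausted` before sending the next request and pause until the window resets instead of collecting 429 responses.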

Handle errors with structured logic

When a 4xx or 5xx error occurs, the response body often contains a structured JSON object with details about the error. Parsing this object allows you to provide specific feedback to the user or log detailed diagnostics. For transient server errors, a good client will automatically retry the request with exponential backoff, using an idempotency key to prevent duplicate operations.
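Retrying with exponential backoff can be sketched as follows. The `send` callback, the `sendWithRetry` name, and the default delays are assumptions for illustration; how the idempotency key is transmitted (often a request header) depends on the API.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

type AttemptResult<T> = { statusCode: number; body: T };

// Retries only 5xx results, up to `maxAttempts` total sends. The same
// idempotency key is passed on every attempt so the server can recognize
// a repeated operation instead of performing it twice.
async function sendWithRetry<T>(
  send: (idempotencyKey: string) => Promise<AttemptResult<T>>,
  idempotencyKey: string,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<AttemptResult<T>> {
  let result = await send(idempotencyKey);
  for (let attempt = 0; attempt < maxAttempts - 1 && result.statusCode >= 500; attempt++) {
    await sleep(backoffDelayMs(attempt));
    result = await send(idempotencyKey);
  }
  return result;
}
```

With the defaults, the waits between attempts are 200 ms and then 400 ms. Production clients usually add random jitter to these delays so that many clients do not retry in lockstep; that is left out here to keep the sketch deterministic.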

Follow the five REST principles in real code

REST is more than just using HTTP verbs; it's an architectural style guided by principles that lead to scalable and maintainable APIs. Understanding these principles helps you build and consume APIs more effectively.

Keep requests stateless

Each request from a client must contain all the information needed for the server to fulfill it. The server does not store any client session state between requests. Authentication is typically handled by sending a token with every call.

Separate client and server concerns

The client and server are independent. The client only knows the resource URIs, while the server handles the business logic and data storage. This separation allows them to evolve independently; you can change the server's internal implementation without breaking the client, as long as the API contract remains the same.

Expose a uniform interface

A uniform interface simplifies the architecture and makes the API more predictable. This is achieved through four sub-constraints: resources are identified by URIs, resources are manipulated through their representations (like JSON), messages are self-descriptive (e.g., using Content-Type), and hypermedia guides the client's actions (HATEOAS).

Enable smart caching where possible

Responses should declare whether they are cacheable or not. This allows clients and intermediaries to reuse response data for equivalent later requests, improving performance and scalability. Caching is typically managed with headers like Cache-Control and ETag.
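One common caching flow is revalidation with an ETag: the client sends the ETag it already holds in an If-None-Match header, and the server replies 304 Not Modified if nothing changed. The sketch below models that decision on the server side. `computeEtag` uses a toy hash purely for illustration, the `max-age=60` value is arbitrary, and only a single exact ETag is compared, whereas If-None-Match can also carry a list or `*`.

```typescript
type CachedResponse = {
  statusCode: 200 | 304;
  headers: Record<string, string>;
  body?: string;
};

// Toy hash (djb2 variant) to derive an ETag from the body. Real servers
// often use a content hash or a resource version number instead.
function computeEtag(body: string): string {
  let hash = 5381;
  for (let i = 0; i < body.length; i++) {
    hash = ((hash * 33) ^ body.charCodeAt(i)) >>> 0;
  }
  return `"${hash.toString(16)}"`;
}

// Answers 304 with no body when the client's If-None-Match equals the current ETag.
function respondWithCaching(body: string, ifNoneMatch?: string): CachedResponse {
  const etag = computeEtag(body);
  const headers = { ETag: etag, "Cache-Control": "max-age=60" };
  if (ifNoneMatch === etag) return { statusCode: 304, headers };
  return { statusCode: 200, headers, body };
}
```

The first request gets a 200 with the body and an ETag. Repeating it with that ETag returns 304 and no body, so the client reuses its cached copy; once the resource changes, the ETag no longer matches and a full 200 response is sent again.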

Layer your architecture for scale

A client may not be directly connected to the end server. It might be communicating with an intermediary layer, like a load balancer, cache, or API gateway. This layered system allows for better scalability, security, and management, and the client remains unaware of the underlying complexity.

Debug REST API problems faster

Debugging API issues is a core engineering task. A systematic approach can save hours of frustration and quickly pinpoint the root cause, whether it's in the client's request, the server's logic, or the network in between.

Capture the failing request

The first step is to see the exact HTTP request being sent. You can use the Network tab in your browser's developer tools or a command-line tool like curl --verbose to inspect the full request, including the method, URL, headers, and body.
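If the failing request originates in application code, one way to capture it is to log an equivalent curl command that can be replayed in a terminal. The helper below is a sketch under that assumption; `CapturedRequest`, `shellQuote`, and `toCurl` are hypothetical names, and the quoting targets POSIX shells only.

```typescript
type CapturedRequest = {
  method: string;
  url: string;
  headers: Record<string, string>;
  body?: string;
};

// Wraps a value in single quotes for a POSIX shell, escaping embedded quotes.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Renders the request as a curl command you can paste into a terminal.
function toCurl(req: CapturedRequest): string {
  const parts = ["curl", "--verbose", "-X", req.method.toUpperCase(), shellQuote(req.url)];
  for (const [key, value] of Object.entries(req.headers)) {
    parts.push("-H", shellQuote(`${key}: ${value}`));
  }
  if (req.body !== undefined) parts.push("--data", shellQuote(req.body));
  return parts.join(" ");
}
```

Remember to redact the Authorization header before pasting the output into a ticket or chat.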

Compare with a known-good sample

Generate a successful request to the same endpoint using a tool like Postman or an API development environment. Comparing the failing request to the working one often reveals subtle differences, like a misspelled header or an incorrect data format. To prevent these issues, consider integrating SDK snippets into your API documentation to provide copy-ready examples that work correctly.
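The comparison itself is easy to automate for headers. The following sketch, with hypothetical `normalizeHeaders` and `diffHeaders` helpers, lists every header that is missing on one side or carries a different value.

```typescript
type HeaderDiff = {
  header: string;
  failing: string | undefined;
  working: string | undefined;
};

// Lowercases header names so "content-type" and "Content-Type" compare as equal.
function normalizeHeaders(headers: Record<string, string>): Map<string, string> {
  const out = new Map<string, string>();
  for (const [key, value] of Object.entries(headers)) {
    out.set(key.toLowerCase(), value.trim());
  }
  return out;
}

// Lists every header that is missing from one side or has a different value.
function diffHeaders(
  failing: Record<string, string>,
  working: Record<string, string>,
): HeaderDiff[] {
  const a = normalizeHeaders(failing);
  const b = normalizeHeaders(working);
  const allNames = Array.from(a.keys()).concat(Array.from(b.keys()));
  const uniqueNames = Array.from(new Set(allNames)).sort();
  return uniqueNames
    .filter((headerName) => a.get(headerName) !== b.get(headerName))
    .map((headerName) => ({
      header: headerName,
      failing: a.get(headerName),
      working: b.get(headerName),
    }));
}
```

Running it on a failing request that sends `Content-Type: text/plain` against a working one that sends `application/json` plus an `Accept` header reports exactly those two differences and ignores headers that match.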

Drill into headers and payloads

Scrutinize the details. Common culprits include a missing or incorrect Authorization header, a mismatched Content-Type, or a JSON payload with incorrect field names or data types. Pay close attention to casing: HTTP header names are case-insensitive, but JSON field names, URL paths, and token values are usually case-sensitive.

Replicate with automated tests

Once you identify the issue, create an automated test that replicates the failure. This not only confirms the fix but also prevents regressions in the future. Integrating these tests into a CI/CD pipeline ensures ongoing API stability.

Instrument retries and timeouts

For production issues, check your client's configuration. Overly aggressive retry logic can overwhelm a struggling server, while a timeout that is too short can cause requests to fail unnecessarily. Modern SDKs allow you to configure these settings at the client level.

Map REST endpoints to MCP tools for AI agents

The Model Context Protocol (MCP) is an open standard that lets AI agents interact with APIs. By wrapping your REST API in an MCP server, you can expose its functionality as "tools" that an LLM can discover, understand, and execute to perform tasks on a user's behalf.

Generate MCP servers from OpenAPI

If you have an OpenAPI specification for your REST API, you can automatically generate a baseline MCP server. For APIs without existing specifications, various tools assist in creating OpenAPI specs from your implementation. This process maps each API endpoint to a corresponding MCP tool, translating HTTP parameters into a structured input schema for the LLM.
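The endpoint-to-tool mapping can be sketched like this. An MCP tool definition carries a name, a description, and a JSON Schema for its inputs; the `ApiOperation` type below is a deliberately simplified, hypothetical slice of an OpenAPI operation, and `operationToTool` is not the output of any specific generator.

```typescript
// Simplified, hypothetical slice of an OpenAPI operation.
type ApiOperation = {
  operationId: string;
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
  summary: string;
  parameters: Array<{ name: string; in: "path" | "query"; required: boolean; type: string }>;
};

// An MCP tool definition: name, description, and a JSON Schema for inputs.
type McpTool = {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
};

// Flattens path and query parameters into one input schema for the model.
function operationToTool(op: ApiOperation): McpTool {
  const properties: McpTool["inputSchema"]["properties"] = {};
  for (const param of op.parameters) {
    properties[param.name] = { type: param.type, description: `${param.in} parameter` };
  }
  return {
    name: op.operationId,
    description: `${op.summary} (${op.method} ${op.path})`,
    inputSchema: {
      type: "object",
      properties,
      required: op.parameters.filter((p) => p.required).map((p) => p.name),
    },
  };
}
```

A real generator also has to handle request bodies, nested schemas, and naming collisions, which this sketch leaves out.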

Expose only the safe tools

Not all API endpoints are suitable for AI agents, especially those that are destructive or handle sensitive data. A well-designed MCP server allows you to selectively expose endpoints, using configuration to enable specific resources or tag endpoints for different use cases, like "read-only" or "admin".
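Selective exposure can be as simple as a filter applied before tools are registered. In this sketch, `TaggedOperation`, `ExposurePolicy`, and `selectExposedOperations` are hypothetical names, and the policy shown (allow by method, block by tag) is just one possible scheme.

```typescript
type TaggedOperation = { operationId: string; method: string; tags: string[] };

type ExposurePolicy = {
  allowedMethods: string[]; // e.g. ["GET"] for a read-only server
  blockedTags: string[]; // e.g. ["admin"]
};

// Keeps only operations whose method is allowed and that carry no blocked tag.
function selectExposedOperations(
  operations: TaggedOperation[],
  policy: ExposurePolicy,
): TaggedOperation[] {
  const allowed = new Set(policy.allowedMethods.map((m) => m.toUpperCase()));
  return operations.filter(
    (op) =>
      allowed.has(op.method.toUpperCase()) &&
      !op.tags.some((tag) => policy.blockedTags.includes(tag)),
  );
}
```

With `allowedMethods: ["GET"]` and `blockedTags: ["admin"]`, a DELETE endpoint and an admin-tagged GET endpoint are both withheld from the agent, while ordinary read endpoints pass through.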

Transform schemas for client capabilities

Your MCP server may need to dynamically transform tool schemas, such as inlining references or simplifying union types, to ensure compatibility with clients like Cursor, Claude, or OpenAI Agents. When converting APIs to MCP, these schema transformations require careful consideration.
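As one example of such a transformation, reference inlining can be sketched as a recursive walk over the schema. This version is an illustration under stated assumptions: it only resolves local references of the form `#/$defs/Name`, it does not guard against circular definitions, and `inlineRefs` is a hypothetical helper rather than part of any MCP library.

```typescript
type JsonSchema = { [key: string]: unknown };

// Replaces {"$ref": "#/$defs/Name"} nodes with a copy of the referenced schema
// and drops the "$defs" section from the output.
function inlineRefs(node: unknown, defs: Record<string, JsonSchema>): unknown {
  if (Array.isArray(node)) return node.map((item) => inlineRefs(item, defs));
  if (node === null || typeof node !== "object") return node;

  const obj = node as JsonSchema;
  const ref = obj["$ref"];
  if (typeof ref === "string" && ref.startsWith("#/$defs/")) {
    const target = defs[ref.slice("#/$defs/".length)];
    if (target === undefined) throw new Error(`Unresolved reference: ${ref}`);
    return inlineRefs(target, defs); // the target may itself contain references
  }

  const result: JsonSchema = {};
  for (const [key, value] of Object.entries(obj)) {
    if (key !== "$defs") result[key] = inlineRefs(value, defs);
  }
  return result;
}
```

The output is a self-contained schema that clients without `$ref` support can consume, at the cost of duplicating any definition that is referenced more than once.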

Deploy remote servers with OAuth

For web-based AI clients, local MCP servers that rely on API keys are not practical. The solution is a remote MCP server that supports OAuth for secure, user-delegated authentication. You can deploy these as serverless functions, using templates to handle the OAuth flow and securely manage credentials.

Frequently asked questions about REST APIs

What are the five basic REST principles?

The five core principles are statelessness, client-server separation, a uniform interface, cacheability, and a layered system architecture. The original definition of REST also lists a sixth, optional constraint, code on demand, which is rarely used in practice. Following these guidelines helps create APIs that are scalable, reliable, and easy for developers to work with.

How do the four CRUD operations map to HTTP methods?

CRUD operations (Create, Read, Update, Delete) map directly to standard HTTP methods. POST is used for Create, GET for Read, PUT or PATCH for Update, and DELETE for Delete.

Can you show a real REST API in production?

The GitHub API is a classic example. To fetch an issue, you would send a GET request to https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}. A quality SDK simplifies this to a call like github.issues.get({ owner, repo, issue_number }).

How do you manage state without breaking statelessness?

REST is stateless from the server's perspective, meaning the server doesn't store client session information. State is managed on the client and sent with each request, typically through an Authorization token that identifies the user and their permissions.

When should you use PUT versus PATCH?

Use PUT when you want to completely replace a resource with a new representation. Use PATCH for partial updates when you only need to modify a few fields, which is more network-efficient.