How turbopuffer boosted SDK performance while minimizing engineering cost
turbopuffer, a high-performance search engine, migrated from maintaining two hand-written SDKs to five Stainless-generated SDKs. The migration delivered production-ready clients in three new languages — including compiled options like Java and Go for performance-sensitive apps — async support in Python, and performance gains across the board, with the TypeScript and Python clients showing 5–50% improvements in benchmarks simulating real workloads.
The breaking point: complex API patterns at scale
As enterprise adoption grew, so did the cost of SDK maintenance. Supporting just TypeScript and Python required significant effort, and the prospect of adding more languages felt daunting. Each new feature meant duplicating work across multiple codebases, slowing release cycles and pulling engineers away from core product development.
These challenges in particular made the process unsustainable:
- Cross-language drift: Every feature or schema change had to be re-implemented across all SDKs, and even minor differences in typing or response handling could cause bugs.
- Limited language expertise: Building idiomatic, production-grade SDKs in five languages required skills beyond turbopuffer’s core product focus.
- Performance optimization overhead: Each runtime needed custom tuning, but limited expertise made it hard to consistently meet latency and throughput targets.
Project management approach
turbopuffer’s migration was a tightly scoped engineering effort focused on performance and accuracy. Stainless engineers worked directly with turbopuffer’s team, resolving regressions, design questions, and edge cases in real time, often within hours, to keep development unblocked.
Implementation phases
Phase one: feasibility & benchmarking
The first phase focused on validating whether Stainless could meet turbopuffer’s core requirements before a full migration. Two questions guided the work: could the generated SDKs deliver sufficient performance on real workloads, and could they support turbopuffer’s existing `client.namespace(ns).operation(...)`-style API without breaking compatibility?
Early benchmarks focused on Python, where the team needed to process 100 MB+ payloads through latency-sensitive pipelines. Switching from `httpx` to `aiohttp` improved throughput under heavy concurrency, and integrating `orjson` reduced serialization overhead that had been inflating tail latency. These results established that performance targets were achievable in a generated client.
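A minimal sketch of the kind of transport change involved, assuming a hand-rolled ingestion path; the endpoint path, payload shape, and auth header here are illustrative, not turbopuffer’s actual wire format:

```python
import asyncio

import aiohttp
import orjson


async def upsert_rows(base_url: str, api_key: str, namespace: str, rows: list[dict]) -> dict:
    # orjson serializes to bytes much faster than the stdlib json module,
    # which matters when request bodies run to 100 MB+.
    payload = orjson.dumps({"upsert_rows": rows})
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # aiohttp held up better than httpx under heavy concurrency in
    # turbopuffer's benchmarks.
    async with aiohttp.ClientSession(base_url=base_url) as session:
        async with session.post(
            f"/v1/namespaces/{namespace}", data=payload, headers=headers
        ) as resp:
            resp.raise_for_status()
            return orjson.loads(await resp.read())


rows = [{"id": i, "vector": [0.1, 0.2]} for i in range(1_000)]
result = asyncio.run(upsert_rows("https://api.example.com", "my_api_key", "docs", rows))
```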
The other major focus was preserving turbopuffer’s existing API ergonomics. Stainless and turbopuffer engineers collaborated to prototype a shared-client approach that maintained the `namespace` pattern, and Stainless added `withOptions` support in the Java SDK to extend this design consistently across languages.
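In Python, that shared-client shape looks roughly like the sketch below; the class and method names are illustrative, not the generated SDK’s exact surface:

```python
from dataclasses import dataclass
from typing import Any


class Turbopuffer:
    """Single shared client owning auth and the HTTP connection pool."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def namespace(self, name: str) -> "Namespace":
        # Namespace handles are cheap: they hold no connections of their
        # own and route every call back through this shared client.
        return Namespace(client=self, name=name)

    def _request(self, method: str, path: str, body: dict[str, Any]) -> dict[str, Any]:
        ...  # shared transport (serialization, retries, pooling) lives here


@dataclass
class Namespace:
    client: Turbopuffer
    name: str

    def query(self, **params: Any) -> dict[str, Any]:
        return self.client._request("POST", f"/v1/namespaces/{self.name}/query", params)


# Usage preserves the original ergonomics:
# tpuf = Turbopuffer(api_key="...")
# tpuf.namespace("docs").query(top_k=10)
```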
Phase two: complex schema support
Next, the focus shifted to filling OpenAPI gaps that blocked code generation. A major one was tuple array support, heavily used in turbopuffer’s filtering and ranking logic. turbopuffer addressed this by plugging in custom code to handle these complex types, using Stainless’ “escape hatch” mechanism to extend generated SDKs as needed. These customizations live directly in the SDK repos and automatically persist across future codegen runs.
This approach let the team accurately model complex data structures that an OpenAPI spec can't express, removing the need for brittle hand-written types.
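As an illustration of what such an escape hatch can look like (the types below are hypothetical, not turbopuffer’s actual filter model), a hand-written module can define the tuple shapes OpenAPI cannot express, and the generated code imports them:

```python
# custom_types.py -- hand-written module kept alongside the generated code,
# so regenerating the SDK never overwrites it.
from typing import List, Tuple, Union

# OpenAPI has no tuple type, so a fixed-shape clause such as
# ("price", "Lt", 100) cannot be modeled in the spec itself.
FilterValue = Union[str, int, float, bool, List[Union[str, int, float]]]
FilterClause = Tuple[str, str, FilterValue]  # (attribute, operator, value)
# Clauses compose into boolean trees, e.g. ("And", [clause1, clause2]).
Filter = Union[FilterClause, Tuple[str, List["Filter"]]]
```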
Language-specific improvements also shipped in this phase. In Go, union types were renamed using `subtle_union_naming` for idiomatic clarity, ensuring the SDKs felt natural and robust in each language.
Phase three: multi-language rollout
Once the core SDK model was stable, the focus shifted to production-level performance and cross-language rollout. A key breakthrough came from replacing Node’s default `fetch` with an Undici-based backend (a lower-level, high-performance alternative), which resolved event-loop stalls under heavy load. This change improved `query_scale` benchmarks from 3,436 ms to 2,778 ms (1.24× faster).
With these optimizations in place, consistency across Python, TypeScript, Go, Java, and Ruby became a matter of automation, and new SDKs could now ship in under 24 hours. For example, a fully functional Ruby client was delivered the day after it was requested, with identical behavior and performance guarantees across runtimes.
Phase four: production deployment
The final phase focused on operationalizing the new pipeline. Stainless integrated directly into turbopuffer’s CI with GitHub Actions that automatically rebuild and merge SDKs whenever their OpenAPI spec changes.
This ensures that performance patches, custom logic, and new API features are always reflected in the generated clients without manual intervention. turbopuffer now has a fully automated release process: every PR provides a preview of SDK diffs, validates changes against benchmarks, and merges updates into production once approved.
All five SDKs reached v1.0 production readiness with minimal breaking changes. Release cycles that once took weeks were reduced to days, giving turbopuffer predictable and quick client delivery.
Performance results
Performance validation was the final milestone of the migration. turbopuffer benchmarked the generated SDKs using workloads representative of real production traffic. Across all metrics, the new clients met or exceeded their hand-maintained counterparts, often by wide margins.
Python SDK Performance
Sync performance benchmarks
| Operation | Baseline | Final SDK (opt3) | Improvement |
|---|---|---|---|
| | 1.88 s | 1.13 s | 1.66× faster |
| | 265 ms | 244 ms | 1.08× faster |
| `query_scale` | 2.68 s | 3.32 s | 1.24× slower (expected due to iteration overhead)* |
*Note: Async performance was prioritized for turbopuffer’s ingestion workloads, where throughput and concurrency were more impactful than individual sync operation speed. The slower `query_scale` sync result is less representative of real-world usage and was considered acceptable given the broader performance gains and improved iterator behavior.
Async performance benchmarks
| Operation | Baseline | Final SDK (opt3) | Improvement |
|---|---|---|---|
| | 4.34 s | 2.57 s | 1.69× faster |
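The async client pays off most when many writes are in flight at once. A hedged sketch of that ingestion pattern, assuming a hypothetical `AsyncTurbopuffer` twin of the shared-client shape sketched earlier:

```python
import asyncio

# Hypothetical async variant of the illustrative client sketched above.
from sdk_sketch import AsyncTurbopuffer


async def ingest(batches: list[list[dict]]) -> None:
    client = AsyncTurbopuffer(api_key="...")
    ns = client.namespace("docs")
    # All batch upserts run concurrently on one event loop, multiplexing
    # the underlying connections instead of blocking once per request.
    await asyncio.gather(*(ns.upsert(rows=batch) for batch in batches))


batches = [[{"id": i, "vector": [0.1, 0.2]} for i in range(100)] for _ in range(32)]
asyncio.run(ingest(batches))
```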
TypeScript SDK Performance
| Benchmark scenario | Baseline (fetch) | Undici backend | Improvement |
|---|---|---|---|
| `query_scale` | 3,436 ms | 2,778 ms | 1.24× faster |
| | 2,551 ms | 2,422 ms | 1.05× faster |
Note: Replacing the default `fetch` stack with an Undici-based backend eliminated Node.js event-loop stalls observed under load and significantly improved streaming throughput at high concurrency. In more representative production-like benchmarks, rather than synthetic stress tests, the new SDK even outperformed the hand-tuned baseline.
Business impact
The migration fundamentally reshaped how turbopuffer builds and ships its client libraries. Work that once required weeks of coordinated effort across separate hand-maintained codebases now happens in days. New features propagate across all SDKs almost immediately, eliminating the lag between backend changes and client availability. turbopuffer's engineers are now free to focus on advancing their core product rather than maintaining clients.
This new foundation also improved reliability and increased adoption. All five SDKs reached v1.0 production readiness with minimal breaking changes, 100% documentation coverage, and immediate enterprise deployment. Consistency across languages will continue to reduce integration friction for customers and lower support overhead.