“Working with Stainless felt like having an extension of our team.”
How Weights & Biases increased developer focus, reduced overhead, and added several in-demand languages
TL;DR: Weights & Biases faced substantial demand from their customers to create more SDKs for Weave, their LLM evaluation tool. The third-party libraries for their platform had quality issues, and maintaining more SDKs in-house was unsustainable for their team of eight. After a month-long structured assessment comparing Stainless and Speakeasy, they chose Stainless for superior ergonomics, more compact bundle sizes, and better reliability.
"Reliability is the top priority. It needs to work every time, with clear logging and contextual errors, and we trusted Stainless to deliver that."
Weights & Biases is an AI platform to train and fine-tune models. Their newest product, Weave, helps developers monitor, debug, and improve their AI applications. Weave initially launched with a Python SDK, but the team was quickly flooded with requests to support more languages. They needed a more efficient solution to produce idiomatic client libraries rather than writing them in-house.
Since Weave's server API was built with FastAPI, they already had an OpenAPI spec that could be used to generate SDKs for TypeScript, Go, Java, and C#. This led the team to explore SDK generators as a solution.
Andrew Truong, staff machine learning engineer and the DRI for this project, established several critical requirements based on their existing SDK challenges:
- Rock-solid reliability - SDKs must update automatically without breaking customer integrations.
- Architectural flexibility - Generated SDKs need to mount as submodules under their existing package structure.
- FastAPI compatibility - Complex OpenAPI specs, including types like `dict[str, Any]`, should generate correctly.
- Superior bundle size - Minimal package size is critical for browser-based application deployments.
- Idiomatic code generation - SDKs need to feel native and natural in every supported language.
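To make the FastAPI compatibility requirement concrete, here is a minimal sketch of why `dict[str, Any]` is awkward for a generator: it becomes an open-ended object schema with no named properties, which each target language then has to represent as some kind of free-form map. The `to_schema` helper below is hypothetical and written only for illustration; it is not how FastAPI, Pydantic, or Stainless implement schema generation, and it covers only a handful of type hints.

```python
from typing import Any, get_args, get_origin

# Illustrative mapping only; NOT FastAPI's or Pydantic's actual implementation.
_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_schema(tp: Any) -> dict:
    """Map a small subset of Python type hints to JSON-Schema-style fragments."""
    if tp is Any:
        return {}  # the empty schema accepts any JSON value
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    origin = get_origin(tp)
    if origin is list:
        (item,) = get_args(tp)
        return {"type": "array", "items": to_schema(item)}
    if origin is dict:
        key, value = get_args(tp)
        if key is not str:
            raise TypeError("JSON object keys must be strings")
        value_schema = to_schema(value)
        # An empty value schema means "any value is allowed", written as True.
        return {"type": "object", "additionalProperties": value_schema or True}
    raise TypeError(f"unsupported type hint: {tp!r}")


open_ended = to_schema(dict[str, Any])
nested = to_schema(dict[str, list[int]])
```

Under these assumptions, `open_ended` carries no property names at all, so a generator has to pick a free-form map type for it in each language (for instance `Record<string, unknown>` in TypeScript or `map[string]any` in Go) rather than emitting a typed class.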
The month-long structured evaluation: Stainless vs. Speakeasy
For about a month, Andrew ran a head-to-head comparison of Stainless and Speakeasy across 11 technical criteria. The evaluation included multiple discovery calls, technical deep-dives, pairing sessions, and hands-on testing that compared the code each platform generated.
The table below is the raw rubric that Andrew used for his evaluation. Lines in purple are the only additions we’ve made.
Several factors made Stainless the clear choice for Andrew’s team. The bundle size and performance advantages are critical for Weights & Biases’ browser-deployed TypeScript application. Beyond that, Stainless’ ergonomics, reliability, built-in tests, architectural flexibility, and FastAPI compatibility settled the decision.
"Stainless moved fast, took feedback seriously, and delivered exactly what we needed—including the most idiomatic SDKs we saw across any generator.”
Following Andrew's evaluation, Weights & Biases selected Stainless and began implementation planning. The team still needs to add custom functionality, such as batch processing, on top of the generated SDKs, but Stainless handles the foundational multi-language SDK generation, which lets the team focus more on core product development.
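As a rough illustration of what "custom functionality on top of the generated SDKs" can look like, the sketch below layers a small batching buffer over a send function. Everything here is an assumption made for the example: `BatchingBuffer`, `send_batch`, and `max_size` are hypothetical names, and `send_batch` simply stands in for whatever call a generated client would expose. It is not the Weave SDK's actual batching code.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class BatchingBuffer(Generic[T]):
    """Collect items and hand them to `send_batch` in groups of `max_size`.

    `send_batch` is a hypothetical stand-in for a generated client method.
    """

    def __init__(self, send_batch: Callable[[list[T]], None], max_size: int = 100) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._send_batch = send_batch
        self._max_size = max_size
        self._pending: list[T] = []

    def add(self, item: T) -> None:
        """Queue one item, flushing automatically once the batch is full."""
        self._pending.append(item)
        if len(self._pending) >= self._max_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is queued; a no-op when nothing is pending."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._send_batch(batch)


# Usage: record batches in a list instead of calling a real API.
sent: list[list[str]] = []
buffer = BatchingBuffer(sent.append, max_size=2)
for call in ["a", "b", "c"]:
    buffer.add(call)
buffer.flush()  # sent is now [["a", "b"], ["c"]]
```

Keeping this logic in a thin wrapper, rather than editing generated files, means the generated layer can be regenerated from the OpenAPI spec without overwriting the hand-written behavior.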
Stainless is also working closely with the Weights & Biases team on a phased rollout of their new SDKs, starting with TypeScript and Go, then Java with React hooks, with C# support to follow.
Business outcomes
- Engineering velocity - Andrew’s team can focus more on core products instead of manual SDK maintenance across multiple languages.
- Market expansion - Weights & Biases can now serve developers in the Go, Java, and C# ecosystems who were previously unable to integrate with the platform.
- Automated maintenance - API changes from FastAPI automatically propagate to all language SDKs without manual intervention.
- Architectural integration - Generated SDKs integrate seamlessly with their existing package structure through custom submodule mounting.
"The North Star for us is simple: whether you're new to the SDK or you know it inside and out, you should feel genuinely happy using it."

