
Atty Eleti
@athyuttamre
Introducing the Responses API: the new primitive of the OpenAI API. It is the culmination of 2 years of learnings designing the OpenAI API, and the foundation of our next chapter of building agents. 🧵 Here’s the story of how we designed it:
But first, a trip down memory lane: two years ago, we launched Chat Completions alongside GPT-3.5 Turbo. @_rlys and I built it over a weekend (literally): it was designed on a Friday, and GA’d on Tuesday. Today, it is the de-facto industry standard, powering hundreds of thousands of apps, and adopted by every major model provider.
Later that year, we launched a beta of the Assistants API, our first attempt at an agentic primitive. Runs would happen in the background, calling tools as needed. Many developers loved it for the ease of getting started (just use OpenAI as your DB!) and the powerful built-in RAG via the `file_search` tool.
But a lot has changed since then: today’s models are multimodal (text, image, audio), agentic (call one or more tools), and think before they talk. Chat Completions was not designed for this; it is stateless (forcing you to pass heavy image and audio payloads back and forth), does not support built-in tools, and has many usability issues (in particular, streaming is really hard to get right).
Assistants supported tools, but in retrospect, was quite over-abstracted. You needed to know a half dozen concepts to get started, and background processing meant it was slow by default. We knew you all wanted the underlying capabilities, but the API shape was getting in the way.
Enter Responses: a delightful API that combines the simplicity of Chat Completions with the power of Assistants. Just 4 lines to get started, but jam-packed with features like file search, web search, function calling, and structured outputs just a parameter away. (Fans of form-encoded inputs rejoice; the OpenAI API supports them now.)
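As a sketch of what “just a parameter away” means, here is an illustrative request body for the endpoint (field names follow the thread; the exact schema is defined by the API reference, and the model name here is a placeholder):

```python
import json

# Illustrative request body for a minimal Responses call:
# a model and an input are all you need to get started.
body = {
    "model": "gpt-4o",  # placeholder model name for illustration
    "input": "Write a haiku about APIs.",
}

# The same request, serialized as it would go over the wire.
wire = json.dumps(body)
print(wire)
```

In the SDK, the equivalent is roughly four lines: construct a client, call the create method with a model and an input, and read the text back out.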
Responses are stateful in more ways than one. All Responses are stored by default, allowing you to view them in the Dashboard for later debugging. You can use `previous_response_id` to continue a conversation — no need to send large payloads again and again. Responses are also state machines, better modeling incomplete, interrupted, and failed model outputs.
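A toy simulation of what that statefulness buys you (every name here is hypothetical — a stand-in for the hosted behavior, not the real service): each response carries an id and a status, and a follow-up request points at the previous response instead of re-sending the whole history.

```python
import itertools

# Toy in-memory store standing in for the hosted state.
_ids = itertools.count(1)
_store = {}

def create_response(input_text, previous_response_id=None):
    # The "server" reconstructs context from the chain, so the
    # client never re-sends earlier turns.
    context = []
    prev = previous_response_id
    while prev is not None:
        context.insert(0, _store[prev]["input"])
        prev = _store[prev]["previous_response_id"]
    resp = {
        "id": f"resp_{next(_ids)}",
        "status": "completed",  # a state machine: could be "incomplete" or "failed"
        "input": input_text,
        "previous_response_id": previous_response_id,
        "context_length": len(context),
    }
    _store[resp["id"]] = resp
    return resp

first = create_response("Tell me a joke.")
second = create_response("Another one!", previous_response_id=first["id"])
print(second["context_length"])  # prior history was carried server-side
```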
Items are the core concept of Responses: polymorphic objects representing user inputs or model outputs. Items can represent messages, reasoning, function calls, web search calls, and so on. Where Chat Completions was list-of-messages-in-one-message-out, Responses is list-of-items-in-list-of-items-out. (We also switched from externally-tagged polymorphism to internally-tagged polymorphism, flattening our shapes by half.)
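The tagging change is easiest to see with plain dicts. These shapes are illustrative, not the exact schema — the point is where the variant tag lives:

```python
# Externally-tagged: the variant is a wrapper key around the payload,
# so every consumer has to unwrap one extra layer.
external = {"message": {"role": "user", "content": "hi"}}

# Internally-tagged: a "type" field sits alongside the payload,
# flattening the shape by one level.
internal = {"type": "message", "role": "user", "content": "hi"}

def kind(item):
    # One uniform dispatch for any internally-tagged item.
    return item["type"]

# A list-of-items output can mix messages, function calls, etc.
items = [
    {"type": "message", "role": "assistant", "content": "hi"},
    {"type": "function_call", "name": "get_weather", "arguments": "{}"},
]
print([kind(i) for i in items])
```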
Hosted tools are the killer feature of Responses. With just one line of code, you can get best-in-class web search, file search, and, soon, code interpreter into your app. (We also launched the standalone Vector Stores Search endpoint today, allowing you to use OpenAI’s RAG with any model or provider.)
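Concretely, each hosted tool is just one more entry in the request body — something like the sketch below (tool type names follow the thread; the exact values and the vector store id are illustrative):

```python
# Illustrative request body with hosted tools attached.
body = {
    "model": "gpt-4o",  # placeholder model name
    "input": "What did reviewers say about our docs?",
    # One entry per hosted tool; the model decides when to call each.
    "tools": [
        {"type": "web_search"},
        {"type": "file_search", "vector_store_ids": ["vs_123"]},  # hypothetical id
    ],
}
print([tool["type"] for tool in body["tools"]])
```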
Streaming is completely redesigned for Responses. Our previous APIs followed the “delta” pattern: we emitted JSON objects that were diffs between the previous version and the new version. This wasn’t typesafe and was really hard to integrate against correctly. Responses supports “semantic events” — well-named events that tell you exactly what changed, like `response.output_text.delta`.
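A sketch of what consuming semantic events looks like — the `response.output_text.delta` event name is from the thread, while the surrounding event names and handling logic are my illustration:

```python
# A simulated event stream: each event says exactly what it is,
# so the consumer dispatches on the event type instead of diffing JSON.
events = [
    {"type": "response.created"},
    {"type": "response.output_text.delta", "delta": "Hello"},
    {"type": "response.output_text.delta", "delta": ", world"},
    {"type": "response.completed"},
]

parts = []
for event in events:
    if event["type"] == "response.output_text.delta":
        parts.append(event["delta"])
final = "".join(parts)
print(final)
```

Note how events you don’t care about are simply skipped — no merging of partial JSON objects required.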
Ok, so, the name: Responses obviously conflicts with HTTP Responses. But we strongly believe this name is the perfect balance of elegance and descriptiveness. We all say “what was the model response?” in daily use. Other names we considered: Tasks, Generations, Messages, Interactions, Conversations, and a dozen more.
Lots more tiny big details, but this thread is already too long:
- SDKs have `response.output_text` to quickly get the text out!
- `n` choices is gone; no more `completion.choices[0].message`!
- `finish_reason` is gone; `status` is much more expressive.
- Function calling and structured outputs are `strict` by default.
(Let me know below if you spot any others!)
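The `output_text` convenience can be pictured as a small fold over output items — a hypothetical reimplementation for illustration, not the SDK’s actual code, with item shapes assumed:

```python
def output_text(response):
    # Concatenate text from message items, skipping non-text
    # items like web search or function calls.
    parts = []
    for item in response["output"]:
        if item["type"] == "message":
            for part in item["content"]:
                if part["type"] == "output_text":
                    parts.append(part["text"])
    return "".join(parts)

response = {
    "status": "completed",  # replaces finish_reason
    "output": [
        {"type": "web_search_call"},  # illustrative non-text item
        {"type": "message", "content": [{"type": "output_text", "text": "Done!"}]},
    ],
}
print(output_text(response))
```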
All that said, Chat Completions isn’t going anywhere. It’s a workhorse for thousands of businesses, and we are committed to supporting new models and features for years to come. Our #1 responsibility is to provide a stable and reliable API for our customers, and that will never change.
Our API design philosophy at OpenAI deserves its own blog post one day, but we have a set of principles we work from. Ship capabilities, not abstractions. Simple things should be simple, complex things should be possible. But easily my favorite is API-as-ladders, by @sbensu. Great APIs are ladders; they let you put in a bit of effort for a bit of reward: https://blog.sbensu.com/posts/... Responses was designed with this in mind from the very beginning.
Responses was brought to you by the brilliant folks behind it: @stevendcoffey, @nikunjhanda, Rasmus, @brianyu8, Wei, @zedlander, Gireesh, Bo, Baishen, Prashanth, Erin, @IshaanSingal, @kevinwhinnery, @liodikas, and the village and family that is the OpenAI API team. We hope you love it.
Working on the OpenAI API is a real privilege because we get to serve all of you — the incredible developers building the future of our industry. Thank you for trusting us with your business and showing us the way. Get started here: https://platform.openai.com/do...