OpenAI opens most powerful mode o1 to third-party developers
Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More
On the ninth day of its holiday-themed stretch of product announcementss known as “12 Days of OpenAI,” OpenAI is rolling out its most advanced model, o1, to third-party developers through its application programming interface (API).
This marks a major step forward for devs looking to build new advanced AI applications or integrate the most advanced OpenAI tech into their existing apps and workflows, be they enterprise or consumer-facing.
If you aren’t yet acquainted with OpenAI’s o1 series, here’s the rundown: It was announced back in September 2024, the first in a new “family” of models from the ChatGPT company, moving beyond the large language models (LLMs) of the GPT-family series and offering “reasoning” capabilities.
Basically, the o1 family of models — o1 and o1 mini — take longer to respond to a user’s prompts with answers, but check themselves while they are formulating an answer to see if they’re correct and avoid hallucinations. At the time, OpenAI said o1 could handle more complex, PhD-level problems — something borne out by real world users, as well.
While developers previously had access to a preview version of o1 on top of which they could build their own apps — say, a PhD advisor or lab assistant — the production-ready release of the full o1 model through the API brings improved performance, lower latency, and new features that make it easier to integrate into real-world applications.
OpenAI had already made o1 available to consumers through its ChatGPT Plus and Pro plans roughly two and a half weeks ago, and added the capability for the models to analyze and respond to imagery and files uploaded by users, too.
Alongside today’s launch, OpenAI announced significant updates to its Realtime API, along with price reductions and a new fine-tuning method that gives developers greater control over their models.
The full o1 model is now available to developers through OpenAI’s API
The new o1 model, available as o1-2024-12-17, is designed to excel at complex, multi-step reasoning tasks. Compared to the earlier o1-preview version, this release improves accuracy, efficiency, and flexibility.
OpenAI reports significant gains across a range of benchmarks, including coding, mathematics, and visual reasoning tasks.
For example, coding results on SWE-bench Verified increased from 41.3 to 48.9, while performance on the math-focused AIME test jumped from 42 to 79.2. These improvements make o1 well-suited for building tools that streamline customer support, optimize logistics, or solve challenging analytical problems.
Several new features enhance o1’s functionality for developers. Structured Outputs allow responses to reliably match custom formats such as JSON schemas, ensuring consistency when interacting with external systems. Function calling simplifies the process of connecting o1 to APIs and databases. And the ability to reason over visual inputs opens up use cases in manufacturing, science, and coding.
Developers can also fine-tune o1’s behavior using the new reasoning_effort parameter, which controls how long the model spends on a task to balance performance and response time.
OpenAI’s Realtime API gets a boost to power intelligent, conversational voice/audio AI assistants
OpenAI also announced updates to its Realtime API, designed to power low-latency, natural conversational experiences like voice assistants, live translation tools, or virtual tutors.
A new WebRTC integration simplifies building voice-based apps by providing direct support for audio streaming, noise suppression, and congestion control. Developers can now integrate real-time capabilities with minimal setup, even in variable network conditions.
OpenAI is also introducing new pricing for its Realtime API, reducing costs by 60% for GPT-4o audio to $40 per one million input tokens and $80 per one million output tokens.
Cached audio input costs are reduced by 87.5%, now priced at $2.50 per one million input tokens. To further improve affordability, OpenAI is adding GPT-4o mini, a smaller, cost-efficient model priced at $10 per one million input tokens and $20 per one million output tokens.
Text token rates for GPT-4o mini are also significantly lower, starting at $0.60 for input tokens and $2.40 for output tokens.
Beyond pricing, OpenAI is giving developers more control over responses in the Realtime API. Features like concurrent out-of-band responses allow background tasks, such as content moderation, to run without interrupting the user experience. Developers can also customize input contexts to focus on specific parts of a conversation and control when voice responses are triggered for more accurate and seamless interactions.
Preference fine-tuning offers new customization options
Another major addition is preference fine-tuning, a method for customizing models based on user and developer preferences.
Unlike supervised fine-tuning, which relies on exact input-output pairs, preference fine-tuning uses pairwise comparisons to teach the model which responses are preferred. This approach is particularly effective for subjective tasks, such as summarization, creative writing, or scenarios where tone and style matter.
Early testing with partners like Rogo AI, which builds assistants for financial analysts, shows promising results. Rogo reported that preference fine-tuning helped their model handle complex, out-of-distribution queries better than traditional fine-tuning, improving task accuracy by over 5%. The feature is now available for gpt-4o-2024-08-06 and gpt-4o-mini-2024-07-18, with plans to expand support to newer models early next year.
New SDKs for Go and Java developers
To streamline integration, OpenAI is expanding its official SDK offerings with beta releases for Go and Java. These SDKs join the existing Python, Node.js, and .NET libraries, making it easier for developers to interact with OpenAI’s models across more programming environments. The Go SDK is particularly useful for building scalable backend systems, while the Java SDK is tailored for enterprise-grade applications that rely on strong typing and robust ecosystems.
With these updates, OpenAI is offering developers an expanded toolkit to build advanced, customizable AI-powered applications. Whether through o1’s improved reasoning capabilities, Realtime API enhancements, or fine-tuning options, OpenAI’s latest offerings aim to deliver both improved performance and cost-efficiency for businesses pushing the boundaries of AI integration.
Read Full Article