OPENAI LAUNCHES THREE VOICE MODELS FOR API

AI DESK ■ 2 MIN READ

THU, MAY 7, 2026

■ AI-SUMMARIZED FROM 5 SOURCES

OpenAI released three new realtime voice models intended to let developers build a new class of voice applications. The models are GPT-Realtime-2, which offers GPT-5-class reasoning; GPT-Realtime-Whisper, for transcription; and GPT-Realtime-Translate, for multilingual support.

GPT-Realtime-2 brings advanced reasoning capabilities to voice interactions, matching GPT-5 performance in real-time conversations. The model enables developers to build applications that can understand and respond to complex spoken queries without the traditional latency associated with voice processing.

GPT-Realtime-Whisper handles live speech transcription, converting audio input to text in real time. This component allows for immediate processing of spoken input across applications.

GPT-Realtime-Translate supports translation across 70+ languages, enabling developers to build multilingual voice applications. The feature allows real-time conversation across language barriers without separate processing steps.

OpenAI stated the models will "unlock a new class of voice apps for developers." The release targets the API, making the technology available to third-party developers building commercial and internal applications.

In parallel, OpenAI launched Trusted Contact, an optional safety feature for ChatGPT. The feature allows users over 18 to assign an emergency contact for mental health and safety concerns. The capability expands existing teenage safety options to adult users, enabling designated contacts to be notified when the platform detects potential safety issues.

The Trusted Contact feature is optional and gives users control over their safety settings. When activated, designated contacts can receive alerts if OpenAI's systems detect concerning behavior patterns.

Both announcements reflect OpenAI's focus on expanding voice capabilities while strengthening user safety features. The voice models target developers seeking to integrate advanced audio processing into applications, while the Trusted Contact feature addresses growing concerns about digital safety and mental health support.

■ SOURCES

► Hacker News ► Hacker News ► The Verge ► Techmeme ► The Decoder

■ SUMMARY WRITTEN BY AI FROM THE LINKS ABOVE
