OpenAI Unveils Three Cutting-Edge Real-Time Voice Models, Elevating Voice Interaction to New Heights

5 hour ago / Read about 0 minute

Author：小编

On May 8, 2026, OpenAI made waves in the tech world by introducing three innovative real-time voice models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These models have been seamlessly integrated into the Realtime API, promising a revolutionary upgrade to voice interaction experiences.
GPT-Realtime-2 stands out with its GPT-5 level reasoning prowess, enabling it to adeptly handle interruptions and invoke tools on the fly.
GPT-Realtime-Translate, on the other hand, is a linguistic marvel, supporting input in a staggering 70 languages and output in 13, facilitating seamless simultaneous translation.
Lastly, GPT-Realtime-Whisper excels in low-latency streaming transcription, ensuring swift and accurate text conversion.
All three models operate on a pay-per-Token or per-minute billing structure, offering flexibility and cost-efficiency to users.

Previous page：Krisp Launches VIVA 2.0, Reconstructing Voice AI I...

Next page：StepFun Will Complete Financing of Nearly US$2.5 B...

Return to List

Hot Reading

2 day ago

Silicon Valley bets $200M on AI data centers floating in the ocean

1 day ago

AMD posts record first-quarter results, driven by skyrocketing data center CPU demand

1 day ago

Your iPhone will soon give you an important decision to make

1 day ago

Google's Gemma 4 AI models get 3x speed boost by predicting future tokens