On May 8, 2026, OpenAI released three real-time voice models, one each for conversational reasoning, real-time translation, and real-time transcription, opening up new kinds of voice applications for developers. GPT-Realtime-2 offers GPT-5-level reasoning and can handle complex requests; GPT-Realtime-Translation supports more than 70 input languages and 13 output languages; and GPT-Realtime-Whisper provides low-latency speech-to-text conversion.
