OpenAI Develops Bidirectional Speech Model: Responds Instantly Even When Interrupted, Making Conversations More Natural and Smooth
Author: Editor

According to media reports, OpenAI is developing a new speech model whose core advance is real-time dialogue adjustment: when a user interrupts the AI mid-speech, the model can immediately revise its response based on the new input without breaking the flow of conversation. Existing speech models produce a fixed response that cannot be revised when the user interrupts partway through. The new model instead processes the incoming speech stream continuously, so it can listen and speak at the same time.

The technology is still in development, and prototype models tend to malfunction after several minutes of sustained conversation. The release has been pushed back from the originally planned Q1 2026 to Q2 or later.

OpenAI believes that if speech models can match the performance of text models, more human-like voice interaction will lower the barrier to use and significantly expand the range of AI applications. The model is seen as particularly valuable in customer service: when a user suddenly changes a request, the AI can adapt the conversation on the fly, avoiding service disruptions or confusion. The technology may also provide a foundation for OpenAI's planned voice-interactive AI devices and smart speaker products, letting users complete tasks such as checking email or booking services by voice.