OpenAI Unveils GPT-Realtime-1.5 Model, Elevating Real-Time Voice Interaction

5 days ago

Author: Editor

OpenAI has announced GPT-Realtime-1.5, its latest flagship voice model. The model is designed for voice agents and customer service applications and supports direct audio-in, audio-out operation. It accepts text, audio, and image inputs and produces text and audio outputs. It has a 32,000-token context window and can generate up to 4,096 output tokens. GPT-Realtime-1.5 targets real-time conversation, voice transcription, and multimodal interaction, and is served through the Realtime API endpoint. Audio input costs $32 per million tokens and audio output costs $64 per million tokens; text input costs $4 per million tokens and text output costs $16 per million tokens. The model is currently available to qualified developers, exclusively through the OpenAI API.
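To make the pricing concrete, the per-million-token rates quoted above can be turned into a rough cost estimate. The following is a minimal sketch, not an official calculator: the function name `estimate_cost`, the category labels, and the example token counts are all hypothetical, and it assumes billing is a simple linear function of token count with no other fees or discounts.

```python
# Per-million-token prices in USD, as quoted in the article.
PRICES_PER_MILLION = {
    "audio_input": 32.0,
    "audio_output": 64.0,
    "text_input": 4.0,
    "text_output": 16.0,
}


def estimate_cost(token_counts):
    """Return an estimated USD cost for a dict mapping category -> token count.

    Assumes purely linear per-token billing (an assumption, not confirmed
    by the article).
    """
    total = 0.0
    for category, tokens in token_counts.items():
        total += tokens / 1_000_000 * PRICES_PER_MILLION[category]
    return total


# Hypothetical session: 10,000 audio tokens in, 4,096 audio tokens out
# (4,096 is the maximum output length stated in the article).
session = {"audio_input": 10_000, "audio_output": 4_096}
print(f"${estimate_cost(session):.4f}")
```

Under these assumptions, the example session costs a little under 60 cents, with audio output accounting for nearly half of it despite being the smaller token count.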
