Tavus has launched Phoenix-4, which it describes as the world's first real-time emotional portrait rendering model. The model converts voice input and conversational context into facial animation that includes emotional responses, active-listening cues, and full-face micro-expressions, rendered at 40 frames per second (fps) in 1080p. Built on a diffusion architecture with 3D Gaussian splatting for rendering, Phoenix-4 gives users real-time control over details such as head pose and eye gaze, and it can transition smoothly among more than 10 distinct emotional states.
Phoenix-4's emotional system can be driven by instructions from a large language model (LLM) or paired with the perception model, Raven-1, so that expressions adapt to the surrounding context. Phoenix-4 is integrated into the Tavus platform, where developers can access it through APIs and custom digital human services. The stated goal is an AI interaction experience with a “sense of genuine presence” across scenarios such as healthcare, education, and customer service.
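
To put the 40 fps figure in perspective, the short sketch below works out the average time available per frame and the pixel throughput it implies. It is plain arithmetic, not a description of Tavus's pipeline: the function names are hypothetical, and the 1920x1080 frame size is an assumption based on the common 16:9 meaning of "1080p".

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available to produce one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of pixels that must be produced each second at a given frame rate."""
    return width * height * fps


# Assumption: "1080p" means a 1920x1080 frame (16:9).
WIDTH, HEIGHT, FPS = 1920, 1080, 40

budget = frame_budget_ms(FPS)                         # 25.0 ms per frame
throughput = pixels_per_second(WIDTH, HEIGHT, FPS)    # 82,944,000 pixels per second
```

At 40 fps a renderer must deliver a new frame every 25 ms on average. A pipelined system may take longer than that to finish any single frame, so this number bounds throughput rather than end-to-end latency.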
