The LongCat team at Meituan has officially released and open-sourced LongCat-Video-Avatar, a state-of-the-art (SOTA) virtual human video generation model. Built on the LongCat-Video model, it follows the design principle of "one model, multiple tasks," natively supporting Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and video continuation. The core architecture has also been comprehensively upgraded, delivering significant improvements in three key areas: motion realism, long-video stability, and identity preservation.
