JD.com Open-Sources JoyAI-Echo, a Long-Form Audio and Video Generation Framework
Author: Staff editor

On June 3, 2026, JD.com unveiled JoyAI-Echo, a framework for generating long-form video with audio. It targets three persistent problems in long video production: inconsistent character appearance, unstable voices, and slow generation. The framework includes a 'Director Assistant' (Director Agent) with a built-in memory bank, which stores and retrieves each character's appearance traits and each speaker's voice characteristics throughout multi-shot generation. It also supports 'conversational editing', and all of its code and model weights have been open-sourced.
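
To make the memory-bank idea concrete, here is a minimal Python sketch of how character traits might be stored once and reused across shots. It is an illustration only, not JoyAI-Echo's actual code or API: the names `CharacterMemory`, `MemoryBank`, and `plan_shots`, as well as the dictionary-based shot format, are all hypothetical, and plain strings stand in for whatever appearance and voice representations the real framework uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CharacterMemory:
    """Hypothetical record of what should stay fixed for one character."""
    appearance: str  # stand-in for an appearance description or embedding
    voice: str       # stand-in for a speaker voice profile


@dataclass
class MemoryBank:
    """Toy stand-in for the memory repository; not the JoyAI-Echo API."""
    _store: Dict[str, CharacterMemory] = field(default_factory=dict)

    def remember(self, name: str, appearance: str, voice: str) -> None:
        # Keep the first stored traits so later shots cannot overwrite them.
        self._store.setdefault(name, CharacterMemory(appearance, voice))

    def recall(self, name: str) -> Optional[CharacterMemory]:
        return self._store.get(name)


def plan_shots(script: List[dict], bank: MemoryBank) -> List[dict]:
    """Attach the remembered appearance and voice to every shot."""
    planned = []
    for shot in script:
        name = shot["character"]
        bank.remember(name, shot.get("appearance", ""), shot.get("voice", ""))
        memory = bank.recall(name)
        planned.append(
            {**shot, "appearance": memory.appearance, "voice": memory.voice}
        )
    return planned


# Hypothetical two-shot script: the second shot omits the character's traits.
script = [
    {"character": "Lin", "appearance": "red coat, short hair",
     "voice": "voice_a", "line": "Hello."},
    {"character": "Lin", "line": "See you tomorrow."},
]
shots = plan_shots(script, MemoryBank())
```

In this sketch the second shot inherits the appearance and voice recorded in the first, which is the kind of cross-shot consistency the memory bank is described as providing.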