According to AI Base, ByteDance and Nanyang Technological University have jointly released StoryMem, an open-source AI video generation framework. StoryMem turns existing single-shot diffusion models into a coherent long-form video generation system: its 'Memory-to-Video (M2V)' mechanism enables multi-shot sequences that run longer than one minute. The framework maintains a dynamic memory bank of keyframe information; combined with lightweight LoRA fine-tuning, this preserves a high degree of cross-shot consistency in character appearance, scene style, and narrative logic. Compared with existing methods, StoryMem improves consistency metrics by 29%. The team has also released ST-Bench, a dataset of 300 multi-shot story prompts, to support standardized evaluation. The tech community has already begun integrating the framework into ComfyUI.
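The memory-bank idea described above can be illustrated with a minimal sketch: keep a bounded store of keyframes from previously generated shots and feed them as conditioning context when generating the next shot. This is a hypothetical Python illustration of the concept only; the class and function names (`KeyframeMemoryBank`, `generate_shot`) are assumptions for clarity and do not reflect StoryMem's actual API.

```python
from collections import deque

class KeyframeMemoryBank:
    """Illustrative memory bank in the spirit of Memory-to-Video (M2V).

    Stores a bounded number of keyframes (represented here as plain
    feature vectors) from previously generated shots. All names and
    structure are assumptions, not StoryMem's real interface.
    """

    def __init__(self, capacity: int = 8):
        # Oldest keyframes are evicted first once capacity is reached.
        self.frames = deque(maxlen=capacity)

    def add_keyframes(self, shot_id: int, keyframes):
        for kf in keyframes:
            self.frames.append((shot_id, kf))

    def memory_context(self):
        # Everything retained so far conditions the next shot.
        return [kf for _, kf in self.frames]


def generate_shot(prompt: str, memory: KeyframeMemoryBank) -> dict:
    """Stand-in for a single-shot diffusion call conditioned on memory."""
    context = memory.memory_context()
    # A real model would cross-attend to `context`; here we only record
    # how many keyframes were available as conditioning.
    return {"prompt": prompt, "conditioned_on": len(context)}


bank = KeyframeMemoryBank(capacity=4)
shots = ["hero enters the tavern", "close-up of the hero", "wide shot of the street"]
for i, prompt in enumerate(shots):
    shot = generate_shot(prompt, bank)
    # Pretend one keyframe was extracted from each generated shot.
    bank.add_keyframes(i, [[float(i)] * 3])

print(shot["conditioned_on"])  # keyframes seen by the final shot
```

Because each shot is generated after earlier keyframes are banked, later shots see progressively more context, which is the intuition behind cross-shot consistency in this style of pipeline.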
