At its recent I/O developer conference, Google announced that it will upgrade a series of AI creation tools using the Gemini model family, with the aim of lowering the barrier to multimedia content generation and improving efficiency. For video and multimodal creation, Google introduced the new Gemini Omni model, which accepts text, image, audio, and video inputs and can generate coherent video content. The model's most notable feature is conversational editing: users describe the changes they want in natural language, such as changing characters, adjusting lighting, or altering scenes, and the model carries out the edits automatically.
