Tencent Hunyuan has officially released and open-sourced Version 1.1 of the Hunyuan World Model (WorldMirror). The model accepts multi-view and video inputs, can be deployed on a single GPU, and generates detailed 3D worlds within seconds, a significant step toward making professional 3D reconstruction technology more accessible.
WorldMirror 1.1 is a unified feed-forward large model for 3D reconstruction. It removes a key constraint of its predecessor, Version 1.0, which was limited to text or single-image inputs. According to the announcement, it is the first model to achieve end-to-end 3D reconstruction that combines multi-modal prior inputs with unified multi-task outputs. It can accept priors such as camera parameters and depth data, and it produces a range of 3D geometric results, including point clouds and depth maps. Tencent states that its performance surpasses that of existing methods in the field.
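To illustrate how the quantities mentioned above relate to one another, here is a minimal sketch of the standard pinhole back-projection that turns a depth map plus camera intrinsics into a point cloud. It is generic textbook geometry, not WorldMirror's code or API: the function name `backproject_depth` and the parameters `fx`, `fy`, `cx`, `cy` are assumptions chosen for this example, and the model itself predicts such outputs with a neural network rather than computing them this way.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift a depth map to camera-space 3D points (hypothetical helper).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. Pixels with non-positive depth are
    treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):      # v: pixel row (image y)
        for u, z in enumerate(row):      # u: pixel column (image x)
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map; the zero entry marks a pixel without valid depth.
depth_map = [
    [2.0, 2.0],
    [0.0, 4.0],
]
cloud = backproject_depth(depth_map, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

In this toy case the three valid pixels yield three 3D points in the camera's coordinate frame. Placing points from several views into one shared world frame would additionally require each view's camera pose, which is the kind of camera parameter the article lists as an optional prior.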
With the model now open source, developers can deploy it from GitHub, while everyday users can try it through an online demo on Hugging Face.
