Apple's AI research team has published a report introducing its LiTo model, which tackles a fundamental challenge in 3D reconstruction: recovering a complete 3D object from a single 2D image. Unlike conventional techniques that require multi-view image inputs, LiTo produces reconstructions with realistic, consistent lighting and shadow effects across viewpoints.

The model's core breakthrough lies in its use of latent space: a unified 3D latent representation. Through a bidirectional mechanism, an encoder compresses the input into a compact latent code and a decoder expands that code back into a full 3D object, allowing the model to faithfully reproduce sophisticated lighting and shadow effects.

The researchers trained the model on thousands of 3D objects, enabling it to predict a three-dimensional latent representation from a single image. According to the report, it significantly outperforms the existing TRELLIS model in the accuracy of multi-view lighting and shadow reconstruction.
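To make the encoder/decoder idea concrete, here is a minimal toy sketch of a latent-space pipeline: features from a single image are compressed into a compact latent code, then decoded into per-view shading values. All function names, shapes, and weights are illustrative assumptions for exposition only, not the actual LiTo architecture.

```python
# Toy sketch of a latent-space encoder/decoder pipeline, loosely
# analogous to the approach described above. Names and shapes are
# illustrative assumptions, not the real LiTo model.

def encode(image_features, dim=4):
    """Compress per-pixel features into a compact latent vector by
    averaging fixed-size chunks (a stand-in for a learned encoder)."""
    chunk = max(1, len(image_features) // dim)
    latent = []
    for i in range(0, chunk * dim, chunk):
        block = image_features[i:i + chunk]
        latent.append(sum(block) / len(block))
    return latent

def decode(latent, viewpoints):
    """Expand the latent code into per-view shading values (a stand-in
    for a learned decoder that renders view-consistent lighting)."""
    return {
        view: [round(v * weight, 4) for v in latent]
        for view, weight in viewpoints.items()
    }

features = [0.1, 0.4, 0.2, 0.8, 0.5, 0.3, 0.9, 0.6]
z = encode(features)                   # compact latent code
views = {"front": 1.0, "side": 0.8}    # toy per-view lighting weights
shaded = decode(z, views)              # per-view shading from one code
```

In a real system, `encode` and `decode` would be learned neural networks and the latent would describe geometry and appearance jointly; the sketch only illustrates the compress-then-decompress structure the report describes.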
