On January 1, 2026, the Zhiyuan Embodied Research Center formally introduced GenieReasoner, the second-generation integrated embodied cerebellum-cerebrum system. Vision-Language-Action (VLA) models face a significant challenge: aligning the modalities of semantic reasoning and action control. GenieReasoner tackles this head-on with a model architecture that enables unified discrete pretraining. Traditional discrete tokenizers also tend to create a bottleneck in action precision, which GenieReasoner overcomes by applying flow matching.
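
To make the action-precision bottleneck of discrete tokenizers concrete, the following is a minimal sketch of a generic uniform-binning action tokenizer. It is not GenieReasoner's tokenizer or its flow matching component. The function names `tokenize` and `detokenize`, the 256-bin vocabulary, and the [-1, 1] action range are all assumptions chosen for illustration. The sketch only shows that, once a continuous action is mapped to a bin index, the recovered value can be off by up to half a bin width.

```python
import numpy as np

# Hypothetical uniform-binning tokenizer (illustrative only, not GenieReasoner's).
def tokenize(actions, n_bins=256, low=-1.0, high=1.0):
    """Map continuous actions in [low, high] to integer bin indices."""
    clipped = np.clip(actions, low, high)
    scaled = (clipped - low) / (high - low)  # normalized to [0, 1]
    # The upper bound `high` would land in bin n_bins, so cap at the last bin.
    return np.minimum((scaled * n_bins).astype(np.int64), n_bins - 1)

def detokenize(tokens, n_bins=256, low=-1.0, high=1.0):
    """Map bin indices back to the center value of each bin."""
    width = (high - low) / n_bins
    return low + (tokens + 0.5) * width

rng = np.random.default_rng(0)
actions = rng.uniform(-1.0, 1.0, size=10_000)  # synthetic actions, not real data
recovered = detokenize(tokenize(actions))

bin_width = 2.0 / 256
max_error = np.abs(actions - recovered).max()
print(f"bin width: {bin_width:.6f}, max round-trip error: {max_error:.6f}")
```

Under these assumptions the round-trip error is bounded by `bin_width / 2`, and it can only be reduced by enlarging the token vocabulary. That is the kind of precision ceiling a continuous generation method such as flow matching is meant to avoid.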
