On April 2, 2026, Zhipu announced GLM-5V-Turbo, its first natively multimodal coding foundation model. The model combines visual understanding with programming ability, natively handling text, images, and video, and is built for complex tasks such as programming, long-horizon planning, and operational execution. On core multimodal coding and agent benchmarks, it maintains strong performance in text-based programming and reasoning while adding state-of-the-art visual capabilities. The model also gives OpenClaw Lobster visual perception, letting it read and interpret on-screen information. GLM-5V-Turbo is available now through Zhipu's MaaS platform for developers to try.
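For readers curious what a multimodal request might look like, here is a minimal sketch. It assumes an OpenAI-compatible chat-completions endpoint and the model identifier `glm-5v-turbo`; neither the endpoint URL nor the exact request schema is specified in this announcement, so treat both as placeholders and consult the MaaS platform documentation for the real values.

```python
import json
import os
import urllib.request

# Hypothetical endpoint and model id -- the announcement does not specify
# them; check Zhipu's MaaS platform docs for the actual values.
API_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"
MODEL = "glm-5v-turbo"


def build_payload(prompt: str, image_url: str) -> dict:
    """Build a chat-completions payload mixing text and an image part."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def send(payload: dict, api_key: str) -> dict:
    """POST the payload to the (assumed) endpoint; needs a valid API key."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_payload(
        "Describe the UI in this screenshot and write code to automate it.",
        "https://example.com/screenshot.png",
    )
    key = os.environ.get("ZHIPU_API_KEY")
    if key:
        print(send(payload, key))
    else:
        # No key set: just show the request body that would be sent.
        print(json.dumps(payload, indent=2))
```

The mixed `content` list (a text part plus an image part) is the common pattern for screen-understanding use cases like the OpenClaw Lobster integration described above, where the image would be a screenshot.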
