Zhipu has officially released GLM-4.6, the new flagship text model in the GLM series. Its headline improvement is coding, with capabilities roughly 27% stronger than those of its predecessor, GLM-4.5. With 355 billion total parameters, of which 32 billion are active, GLM-4.6 surpasses every previous GLM release in core capabilities: its coding performance now rivals Claude Sonnet 4, and its context window has been extended to 200K tokens.
The model is currently available through Zhipu's MaaS platform, with an open-source release on Hugging Face and ModelScope expected shortly. On public benchmarks, GLM-4.6 matches Claude Sonnet 4 and outperforms GLM-4.5 on several tests, making it the strongest domestically developed model to date. In practical programming evaluations, it surpasses both Claude Sonnet 4 and other domestic models while cutting average token consumption by more than 30% relative to GLM-4.5. Zhipu has published all test questions and Agent trajectories.
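For readers who want to try the model over the MaaS platform, a minimal sketch of an API call is shown below. It assumes the platform keeps its existing OpenAI-compatible endpoint and exposes the new model under the id glm-4.6; neither detail is confirmed in the announcement.

```python
# Minimal sketch: querying GLM-4.6 through Zhipu's MaaS platform.
# Assumptions: the platform retains its OpenAI-compatible endpoint
# (https://open.bigmodel.cn/api/paas/v4/) and serves the new model
# under the id "glm-4.6". Neither is confirmed by the announcement.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["ZHIPU_API_KEY"],  # MaaS platform API key
    base_url="https://open.bigmodel.cn/api/paas/v4/",  # assumed endpoint
)

response = client.chat.completions.create(
    model="glm-4.6",  # assumed model id
    messages=[
        {"role": "user", "content": "Write a binary search function in Python."}
    ],
)
print(response.choices[0].message.content)
```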
In addition, GLM-4.6 has been deployed with FP8+Int4 mixed-precision quantization on Cambricon's domestically produced chips and runs stably on Moore Threads' latest-generation GPUs via the vLLM inference framework. The combined chip-and-model service will be offered through Zhipu's MaaS platform.
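Once the weights are public, local serving through vLLM might look like the following sketch. The Hugging Face repo id zai-org/GLM-4.6, the FP8 quantization flag, and the parallelism settings are assumptions for illustration; a 355-billion-parameter model requires a multi-GPU node in practice.

```python
# Minimal sketch: loading GLM-4.6 with the vLLM inference framework
# after the open-source release. The repo id "zai-org/GLM-4.6" and the
# quantization/parallelism settings are assumptions, not confirmed
# details; a model this large needs many GPUs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="zai-org/GLM-4.6",  # assumed Hugging Face repo id
    tensor_parallel_size=8,   # split the model across 8 GPUs
    quantization="fp8",       # FP8 weights to reduce memory footprint
    max_model_len=200_000,    # the advertised 200K-token context window
)

outputs = llm.generate(
    ["Explain the difference between FP8 and INT4 quantization."],
    SamplingParams(temperature=0.7, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```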