On May 29, Jieyue Xingchen released and open-sourced Step 3.7 Flash, a model built for production-grade agents. It uses a sparse Mixture of Experts (MoE) architecture with more than 196 billion parameters plus a 1.8-billion-parameter ViT, with 11 billion active parameters, and it can generate up to 400 tokens per second. Building on the DTK heterogeneous computing platform and a full software stack, the Hygon team completed end-to-end adaptation and in-depth optimization on the day of the model's release, going from 'release to adaptation' and from 'adaptation to efficiency' in a single step. Developers can now run the model on the Hygon platform to build multimodal agents, intelligent code assistants, and complex workflow applications, with substantially lower access and orchestration costs.
