On February 2, 2026, StepOn AI released Step 3.5 Flash, its latest open-source base model for agents. Within 48 hours of release, the model reached the top of the trending list on the OpenRouter platform. It uses a sparse Mixture-of-Experts (MoE) architecture with 196 billion total parameters, of which only about 11 billion are activated for each token. Because per-token compute depends on the activated parameters rather than the total, this design keeps inference fast and inexpensive relative to the model's size, and the model reaches a maximum inference speed of 350 tokens per second. Several chip manufacturers, including Huawei Ascend, Moore Threads, Biren Technology, Enflame Technology, Iluvatar CoreX, and Alibaba T-Head, have already completed adaptation of the model for their platforms.
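
To make the sparsity figures concrete, the short Python sketch below works out what fraction of the parameters is active per token and how per-token compute compares with a hypothetical dense model of the same total size. The parameter counts come from the figures above. The "2 FLOPs per active parameter per token" rule is a common rough approximation for a forward pass, used here as an assumption and not as a figure published for this model. The function names are illustrative only.

```python
# Back-of-the-envelope arithmetic for a sparse MoE model.
# Parameter counts are the figures quoted in the article; the FLOPs rule
# of thumb is an assumption, not a measured or published number.

TOTAL_PARAMS = 196e9   # total parameters (from the article)
ACTIVE_PARAMS = 11e9   # approximate parameters activated per token (from the article)


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active / total


def forward_flops_per_token(params: float) -> float:
    """Rough forward-pass cost: about 2 FLOPs per parameter per token (assumed)."""
    return 2 * params


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
sparse_flops = forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = forward_flops_per_token(TOTAL_PARAMS)
compute_ratio = dense_flops / sparse_flops

print(f"Active fraction per token: {frac:.1%}")
print(f"Estimated FLOPs per token (sparse): {sparse_flops:.2e}")
print(f"Estimated FLOPs per token (dense equivalent): {dense_flops:.2e}")
print(f"Dense-to-sparse compute ratio: {compute_ratio:.1f}x")
```

Under these assumptions, only a small fraction of the weights takes part in each token's computation. The estimate ignores attention cost, routing overhead, and memory bandwidth, so it indicates the direction of the saving, not the actual throughput.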
