ModelBest Unveils MiniCPM-SALA, a 9B-Parameter Model Built on the Sparse-Linear Attention Hybrid Architecture
Author: Editor

On February 12, ModelBest introduced the Sparse-Linear Attention Hybrid Architecture (SALA) and unveiled MiniCPM-SALA, a 9B-parameter text model built on it. Notably, MiniCPM-SALA uses no acceleration techniques such as speculative sampling, yet when deployed on cloud inference chips and processing 256K-token sequences, it achieves inference 3.5 times faster than Qwen3-8B. The model also supports context lengths of up to one million tokens, whether running on cloud inference chips or on consumer-grade GPUs on-device.
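The announcement does not describe SALA's internals, but the general idea behind sparse-linear hybrids is well established: sparse attention (e.g., a sliding window) keeps exact local interactions cheap, while linear attention provides global context mixing without ever materializing a full sequence-by-sequence score matrix. The sketch below is a minimal PyTorch illustration of that pattern, not ModelBest's design; the function names, the window size, and the even/odd layer split are all illustrative assumptions.

```python
# Illustrative sketch of a sparse-linear attention hybrid (NOT SALA's actual
# design, which is undisclosed in the article). Both mechanisms cost O(n) per
# layer instead of the O(n^2) of full softmax attention.
import torch
import torch.nn.functional as F


def sliding_window_attention(q, k, v, window: int):
    """Sparse attention: each query attends only to the last `window` keys.
    q, k, v: (batch, seq, dim). Reference loop for clarity; real kernels
    fuse this into a single pass."""
    b, n, d = q.shape
    out = torch.zeros_like(q)
    for i in range(n):
        lo = max(0, i - window + 1)
        scores = q[:, i : i + 1] @ k[:, lo : i + 1].transpose(1, 2) / d**0.5
        out[:, i] = (scores.softmax(-1) @ v[:, lo : i + 1]).squeeze(1)
    return out


def linear_attention(q, k, v):
    """Causal linear attention: replaces softmax with a positive feature map
    (elu + 1) so attention reduces to a running (dim x dim) key-value summary,
    never building the (seq x seq) score matrix."""
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = torch.cumsum(k.unsqueeze(-1) * v.unsqueeze(-2), dim=1)  # (b, n, d, d)
    z = torch.cumsum(k, dim=1)                                   # (b, n, d)
    num = torch.einsum("bnd,bnde->bne", q, kv)
    den = torch.einsum("bnd,bnd->bn", q, z).clamp_min(1e-6)
    return num / den.unsqueeze(-1)


def hybrid_block(x, wq, wk, wv, layer_idx: int, window: int = 64):
    """Hypothetical hybrid: alternate mechanisms across layers, e.g. sparse
    layers for local detail, linear layers for global context."""
    q, k, v = x @ wq, x @ wk, x @ wv
    if layer_idx % 2 == 0:
        return x + sliding_window_attention(q, k, v, window)
    return x + linear_attention(q, k, v)


torch.manual_seed(0)
x = torch.randn(2, 128, 32)
wq, wk, wv = (torch.randn(32, 32) * 0.05 for _ in range(3))
y = hybrid_block(x, wq, wk, wv, layer_idx=0)   # sparse (sliding-window) layer
y = hybrid_block(y, wq, wk, wv, layer_idx=1)   # linear-attention layer
print(y.shape)  # torch.Size([2, 128, 32])
```

Because neither component's cost grows quadratically with sequence length, per-token compute and memory stay roughly flat as context grows, which is consistent with the long-context speedups the article reports, though the specific 3.5x figure depends on SALA's actual design and kernels.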