On the first anniversary of DeepSeek-R1's release, a new DeepSeek model identified as "MODEL1" surfaced in the company's GitHub code. The identifier appears 28 times across 114 files in the FlashMLA optimization library, in some places alongside the existing V3.2 model and in others as a separate case. Technical analysis of the code indicates that MODEL1 adopts an entirely new architecture, with optimizations in areas such as key-value cache layout, sparsity handling, and FP8 decoding. MODEL1 may be the development codename for V4, DeepSeek's next-generation flagship model, which could be released as early as February.
