DeepSeek's Latest Model, MODEL1, Unveiled: Code Reveals Potential New Architecture

2026-01-21

Author: Editor

On January 21, the first anniversary of DeepSeek-R1's launch, details of a new DeepSeek model, MODEL1, emerged. DeepSeek updated its FlashMLA code repository on GitHub, and the identifier "MODEL1" appears 28 times across the 114 files involved. The code treats MODEL1 as a model distinct from V32, which is understood to refer to DeepSeek-V3.2, suggesting that MODEL1 may be based on a new architecture. The differences between the two show up mainly in the KV cache layout, sparsity handling, and FP8 decoding, along with a number of differences in memory optimization.
