On April 24, 2026, DeepSeek V4-Pro and DeepSeek V4-Flash were officially released as open-source models. These new iterations extend the context window to 1M tokens, nearly a tenfold increase over the previous generation. For the first time, the models introduce KV Cache sliding windows and compression algorithms, which reduce both the computational overhead of the Attention mechanism and its memory-access cost, providing robust support for Agent and Coding applications.

Through chip-model co-design, Huawei Ascend provides full support for DeepSeek V4 across its entire SuperNode product line. The Ascend 950 combines fused kernels and multi-stream parallelism with quantization algorithms to deliver high-throughput, low-latency inference deployment. Meanwhile, the Ascend A3 SuperNode series is fully compatible and ships with training reference implementations, allowing users to fine-tune the models rapidly.
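The sliding-window idea behind the KV Cache change can be illustrated with a minimal sketch. The class name, window size, and string-valued keys below are illustrative assumptions, not DeepSeek's actual implementation; the point is only that attention reads a bounded window rather than the full 1M-token history:

```python
from collections import deque

class SlidingWindowKVCache:
    """Minimal sketch: keep only the most recent `window` key/value pairs,
    so per-step attention cost stays O(window) instead of O(sequence length)."""

    def __init__(self, window: int):
        self.window = window
        # deque(maxlen=...) evicts the oldest entry automatically on overflow
        self.keys = deque(maxlen=window)
        self.values = deque(maxlen=window)

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def snapshot(self):
        # An attention kernel would read this bounded window each decode step.
        return list(self.keys), list(self.values)

cache = SlidingWindowKVCache(window=4)
for t in range(10):
    cache.append(f"k{t}", f"v{t}")
ks, vs = cache.snapshot()
print(len(ks), ks[0], ks[-1])  # 4 k6 k9
```

In a real deployment the evicted entries might additionally be compressed rather than discarded, which is where the compression algorithms mentioned above would come in.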
