On April 24, DeepSeek released and open-sourced its latest-generation large AI model, DeepSeek-V4. The model supports a context window of up to one million tokens and sets new benchmarks in agent capabilities, world knowledge, and reasoning. V4 ships in two variants, DeepSeek-V4-Pro and DeepSeek-V4-Flash, both supporting the 1M context length: V4-Pro brings substantially stronger agent capabilities, while V4-Flash is optimized for speed and efficiency.

V4 also introduces a new attention mechanism that markedly reduces compute and memory requirements, and the team has validated a fine-grained EP (expert parallelism) scheme on both Huawei's Ascend platform and NVIDIA hardware. Once Ascend 950 super nodes are deployed at scale in the second half of the year, the cost of V4-Pro is expected to drop noticeably.

The main reason for the delay in V4's release was a shift from strengthening a single model to building a complex system: overcoming hard problems such as multimodality and long-term memory, and achieving deep compatibility with domestic chips.
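The article gives no details of V4's attention mechanism. As a generic illustration of how restricting the attention pattern cuts cost (not DeepSeek's actual design; window size and dimensions here are arbitrary), the sketch below contrasts standard attention, whose score matrix grows as O(n²) in sequence length, with a sliding-window variant where each position attends only to its last `w` neighbors, for O(n·w) work and memory:

```python
import numpy as np

def full_attention(q, k, v):
    # Standard attention: materializes an n x n score matrix,
    # so time and memory grow quadratically with sequence length n.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def sliding_window_attention(q, k, v, w=4):
    # Each query attends only to itself and the previous w-1 positions,
    # so per-step cost is O(w) and total cost is O(n * w).
    n, d = q.shape
    out = np.empty_like(v)
    for i in range(n):
        lo = max(0, i - w + 1)
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out

rng = np.random.default_rng(0)
n, d = 16, 8
q, k, v = rng.normal(size=(3, n, d))
print(full_attention(q, k, v).shape)            # (16, 8)
print(sliding_window_attention(q, k, v).shape)  # (16, 8)
```

Production long-context models use far more sophisticated sparse or compressed attention schemes, but the scaling argument is the same: shrinking the set of keys each query touches is what makes million-token contexts tractable.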
