The SuperCLUE team has released its evaluation results for the DeepSeek V4 series of Chinese large language models. DeepSeek-V4-Pro ranks first overall among Chinese models, with the Flash variant close behind in second place. The evaluation covers six dimensions; the Pro model scores 70.98 points and the Flash model 68.82 points, and both substantially outperform other domestic models. The series uses a new attention mechanism, and all versions support contexts of up to one million tokens while reducing compute and memory consumption. Compared with their predecessor, V3.2, both versions improve markedly across the board. Against leading international models, however, DeepSeek V4 still falls short in areas such as code generation and complex instruction following. Even so, its well-rounded capabilities place DeepSeek V4 firmly in the top tier of Chinese models.
