NVIDIA Sets New Benchmark in Full-Scale DeepSeek Inference Performance
2025-03-19
Author: Editor

At its GTC 2025 conference, NVIDIA announced that a DGX system with eight Blackwell GPUs has set a world record for inference performance on the full DeepSeek-R1 model, which has 671 billion parameters. The system delivers more than 250 tokens per second per user and reaches a peak throughput of 30,000 tokens per second. NVIDIA credited the result to the combination of Blackwell GPUs and its TensorRT software, and suggested that further performance gains may follow.