NVIDIA Rubin AI Chip Sees Sharp Spec Increases: TGP and Bandwidth Surge

In the fiercely competitive AI chip market, AMD and NVIDIA are locked in an intense rivalry. AMD recently revealed that its next-generation Instinct MI450 series AI GPU will be built on a 2nm process and integrate 432GB of HBM4 memory. The GPU offers 19.6TB/s of memory bandwidth, 300GB/s of scale-out bandwidth, and 40 PFLOPS of FP4 and 20 PFLOPS of FP8 compute, with AI inference performance ten times that of its predecessor. Forrest Norrod, who leads AMD's data center business, said the MI450 will set a new benchmark in both training and inference, positioning it directly against NVIDIA's upcoming Rubin architecture.

Concurrently, NVIDIA has introduced the Rubin CPX, a GPU dedicated to long-context processing. By separating context processing from output generation at the architecture level, NVIDIA reports a 6.5-fold performance boost in long-context AI inference scenarios. Its Vera Rubin NVL144 rack delivers 8 EFLOPS of compute and 1.7 PB/s of memory bandwidth per rack.
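To make the idea of separating the two phases concrete, here is a minimal, purely illustrative sketch: a prefill stage digests the long prompt and produces a key/value cache, and a separate decode stage generates output tokens from that cache. The function names, data structures, and dummy token rule are hypothetical simplifications, not NVIDIA's implementation.

```python
# Illustrative sketch of disaggregated inference: a "context" (prefill) stage
# builds the KV cache for a long prompt, and a separate "generation" (decode)
# stage produces output tokens from that cache. All names and logic here are
# hypothetical and greatly simplified.
from dataclasses import dataclass


@dataclass
class KVCache:
    """Stand-in for the per-request key/value cache produced by prefill."""
    prompt_tokens: list


def prefill_worker(prompt_tokens):
    # Compute-bound phase: process the entire long context once.
    # In a disaggregated design this runs on hardware optimized for
    # high compute throughput over very long sequences.
    return KVCache(prompt_tokens=prompt_tokens)


def decode_worker(cache, max_new_tokens):
    # Memory-bandwidth-bound phase: emit tokens one at a time while
    # repeatedly reading the KV cache. In a disaggregated design this
    # runs on hardware with large, fast memory.
    output = []
    last = cache.prompt_tokens[-1] if cache.prompt_tokens else 0
    for _ in range(max_new_tokens):
        last = (last * 31 + 7) % 50_000  # dummy "next token" rule
        output.append(last)
    return output


if __name__ == "__main__":
    prompt = list(range(1, 1001))        # stand-in for a long context
    cache = prefill_worker(prompt)       # stage 1: context processing
    tokens = decode_worker(cache, 16)    # stage 2: output generation
    print(tokens)
```

The point of the split is that each phase can run on hardware matched to its bottleneck, which is the rationale behind pairing a context-focused part like Rubin CPX with the rest of the rack.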

On the software side, AMD has updated its ROCm 7.0 stack, which enables the MI355X to deliver a 3.2-fold improvement in Llama 3.1 70B inference, and has launched a developer cloud platform to lower ecosystem migration costs. NVIDIA, for its part, is leveraging its CUDA ecosystem advantage, using its Dynamo software to improve token generation efficiency by a factor of 30.

On the market side, AMD has secured a 40% cost advantage with the MI300X in Microsoft Azure inference workloads, while NVIDIA has emphasized that its solutions can generate up to $5 billion in revenue returns. Both companies plan to launch their flagship products in 2026, and this competition is set to reshape the AI computing power market.