NVIDIA Set to Release Innovative LPU Chip in China: No Compromises, No Special Versions, Delivering a Staggering 40PB/s Performance
3 days ago
Author: 小编 (site editor)

On March 17, 2026, at its GTC conference, NVIDIA officially announced the Groq 3 LPU, a chip purpose-built for AI inference. The launch is the first major product since NVIDIA's acquisition of Groq's core technology assets last year, a deal estimated at $20 billion.

As a dedicated language processing unit, the Groq 3 LPU is designed to raise the inference efficiency of language models through architectural changes, targeting workloads that demand low-latency decoding and interactive inference.

Unlike GPUs, which primarily serve training and general-purpose compute, the Groq 3 LPU integrates 500MB of on-chip SRAM with a memory bandwidth of up to 150TB/s, well beyond the 22TB/s offered by HBM4, which substantially improves the efficiency of AI decoding operations.
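Why raw bandwidth matters so much for decoding: autoregressive token generation is typically memory-bound, so per-stream throughput is roughly bounded by how fast the weights can be read. A back-of-the-envelope sketch, using the article's two bandwidth figures and an assumed model size (the 70B-parameter, 8-bit model below is illustrative, not an announced spec):

```python
# Back-of-the-envelope: decoding one token reads (roughly) all model weights,
# so an upper bound on single-stream throughput is bandwidth / weight bytes.
# In practice a 70GB model far exceeds 500MB of on-chip SRAM and would be
# sharded across many LPUs; this sketch ignores that and all other overheads.

def tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Rough upper bound on decode throughput for a single stream."""
    return bandwidth_bytes_per_s / model_bytes

model_bytes = 70e9  # hypothetical 70B-parameter model in 8-bit weights

hbm4 = tokens_per_second(model_bytes, 22e12)   # 22 TB/s (HBM4, per the article)
sram = tokens_per_second(model_bytes, 150e12)  # 150 TB/s (on-chip SRAM, per the article)

print(f"HBM4: ~{hbm4:.0f} tokens/s per stream")
print(f"SRAM: ~{sram:.0f} tokens/s per stream")
print(f"Speedup: ~{sram / hbm4:.1f}x")
```

Under these assumptions, the bandwidth ratio alone (150/22, about 6.8x) sets the ceiling on the decode speedup.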

The chip introduces a "Dynamo" heterogeneous inference architecture designed to work alongside Rubin GPUs, dividing inference between the two: the Rubin GPU handles the compute-heavy Prefill and Attention stages, while the LPU handles low-latency Token decoding. This division of labor targets applications with strict real-time requirements.
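The prefill/decode split described above can be sketched in toy Python. All class and function names here are illustrative stand-ins, not NVIDIA's actual Dynamo API:

```python
# Illustrative sketch of disaggregated inference: one expensive parallel
# prefill pass on a "GPU-like" device, then many cheap latency-critical
# decode steps on an "LPU-like" device sharing the resulting KV cache.
# This models the idea only; it is not NVIDIA's Dynamo API.

from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Key/value state produced by prefill and consumed during decode."""
    tokens: list = field(default_factory=list)

class PrefillDevice:  # stands in for the Rubin GPU in the article
    def prefill(self, prompt_tokens: list) -> KVCache:
        # Process the whole prompt at once (attention over all positions).
        return KVCache(tokens=list(prompt_tokens))

class DecodeDevice:  # stands in for the LPU in the article
    def decode_step(self, cache: KVCache) -> int:
        # Generate one token using the cache; a toy rule stands in
        # for the real model forward pass.
        next_token = len(cache.tokens)
        cache.tokens.append(next_token)
        return next_token

def generate(prompt: list, max_new_tokens: int) -> list:
    gpu, lpu = PrefillDevice(), DecodeDevice()
    cache = gpu.prefill(prompt)       # one compute-heavy parallel pass
    out = []
    for _ in range(max_new_tokens):   # many small, latency-critical steps
        out.append(lpu.decode_step(cache))
    return out

print(generate([101, 102, 103], 4))  # -> [3, 4, 5, 6]
```

The design point the article is making: the two phases have opposite hardware needs (prefill is compute-bound and parallel, decode is sequential and memory-bound), so routing each to specialized silicon avoids one device being underused in either phase.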

The Groq 3 LPU is scheduled to begin shipping in the second half of 2026 and will be integrated into the Vera Rubin platform, together supporting the full AI workflow from initial training to final deployment.
