NVIDIA Unveils Groq 3 LPU Inference Chip
Author: Editor

The Rubin platform now integrates the NVIDIA Groq 3 LPU, a chip designed specifically to accelerate inference. The addition sharply improves the system's ability to process and serve tokens at low latency and in high volume, enabling highly interactive AI models. Building on this, NVIDIA plans to introduce the Groq 3 LPX rack, which will house 256 Groq 3 LPUs.

Ian Buck, NVIDIA's Vice President of Hyperscale Business, said the Groq LPX will act as a coprocessor to the Rubin platform, optimizing decode performance across "every layer of the AI model for each token." The move is intended to position Rubin for the next phase of AI: multi-agent systems, which must remain interactive across context windows spanning millions of tokens while running models with trillions of parameters.
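To make the coprocessor idea concrete, here is a minimal sketch of the split the article describes: the main platform performs the compute-heavy prefill over the prompt once, while a dedicated inference unit handles the many cheap, latency-bound per-token decode steps. All class and function names are illustrative assumptions, not NVIDIA or Groq APIs, and the "model" is a toy rule rather than a real network.

```python
# Hypothetical sketch of coprocessor-offloaded decoding; names and logic
# are illustrative only, not any real NVIDIA/Groq interface.
from dataclasses import dataclass


@dataclass
class MainAccelerator:
    """Stands in for the main platform: one expensive prefill pass."""

    def prefill(self, prompt_tokens):
        # Produce a toy context state summarizing the whole prompt.
        return {"context": list(prompt_tokens)}


@dataclass
class InferenceCoprocessor:
    """Stands in for a latency-optimized unit: cheap per-token decode steps."""

    def decode_step(self, state):
        # Toy rule: the next token is the current sequence length.
        token = len(state["context"])
        state["context"].append(token)
        return token


def generate(prompt_tokens, max_new_tokens):
    main, copro = MainAccelerator(), InferenceCoprocessor()
    state = main.prefill(prompt_tokens)        # runs once on the main platform
    out = []
    for _ in range(max_new_tokens):            # many small, latency-bound steps
        out.append(copro.decode_step(state))   # each step runs on the coprocessor
    return out


print(generate([10, 20, 30], 4))  # → [3, 4, 5, 6]
```

The point of the split is that prefill is throughput-bound while decode is latency-bound per token, so routing decode to a unit tuned for fast, small steps can raise interactivity without changing the model itself.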