NVIDIA has announced Rubin CPX, a new graphics processor designed for demanding workloads such as video generation and software development. It is expected to reach the market by the end of 2026. The product comes in card form and can be integrated into existing servers or run as a standalone computer. Its main purpose is to make AI inference more efficient, especially for applications that need ultra-long context windows, such as programming and video creation.
NVIDIA CEO Jensen Huang said that Rubin CPX is the first chip built specifically for models that process millions of tokens at once during AI inference. By separating the compute workloads of the context phase and the generation phase, Rubin CPX is expected to substantially improve compute utilization.
A Rubin rack equipped with Rubin CPX can deliver up to 6.5 times the performance of current flagship racks on large context windows. It provides 8 exaFLOPS of NVFP4 compute, 100 TB of high-speed memory, and 1.7 PB/s of memory bandwidth.
NVIDIA plans to offer Rubin CPX in two configurations. In the first, it is integrated with Vera Rubin on the same tray. In the second, intended for customers who have already ordered NVL144, NVIDIA will sell a full rack of CPX chips separately.
