NVIDIA has announced the Rubin CPX, a dedicated GPU designed for large-scale context processing. NVIDIA says the chip is meant to double the efficiency of AI inference computing, which makes it suited to applications that need ultra-long context windows, such as programming and video generation.
NVIDIA also revealed details of its next-generation AI server, the 'Vera Rubin NVL144', a flagship product for AI training and inference. Each rack holds 36 Vera CPUs and 144 Rubin GPUs, paired with HBM4 memory delivering 1.4PB/s of bandwidth and 75TB of fast memory, a substantial step up in both performance and scale.
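To put the rack-level figures in per-device terms, the short Python sketch below divides the quoted totals across the 144 GPUs. It is a back-of-the-envelope illustration only: it assumes the 1.4PB/s and 75TB figures are spread evenly over the GPUs and uses decimal units (1PB = 1000TB), neither of which the announcement states. The function and variable names are hypothetical.

```python
# Rack-level figures quoted in the announcement.
VERA_CPUS = 36
RUBIN_GPUS = 144
HBM4_BANDWIDTH_PB_S = 1.4   # aggregate HBM4 bandwidth per rack
RACK_CAPACITY_TB = 75       # the 75TB figure quoted per rack


def per_gpu_shares(cpus, gpus, bandwidth_pb_s, capacity_tb):
    """Divide rack totals evenly across GPUs (an illustrative assumption)."""
    return {
        "gpus_per_cpu": gpus / cpus,
        "bandwidth_tb_s_per_gpu": bandwidth_pb_s * 1000 / gpus,
        "capacity_gb_per_gpu": capacity_tb * 1000 / gpus,
    }


shares = per_gpu_shares(VERA_CPUS, RUBIN_GPUS, HBM4_BANDWIDTH_PB_S, RACK_CAPACITY_TB)
for name, value in shares.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions each Vera CPU is paired with four Rubin GPUs, and each GPU's even share works out to a little under 10TB/s of bandwidth and roughly 520GB of the 75TB total.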
