Google’s Eighth-Gen TPU, with 2PB of HBM per Pod, Shatters the Memory Bottleneck Holding Back AI Progress
Author: Editor

Over the past year, memory prices have risen three to five times, dampening consumer appetite for new PCs and smartphones. The root cause is the explosive growth in demand for artificial intelligence (AI), which places enormous pressure on both memory capacity and memory bandwidth.

Google’s eighth-generation Tensor Processing Unit (TPU) illustrates the scale of that demand. The training-focused TPU 8t carries 216GB of High Bandwidth Memory (HBM) per chip, with a memory bandwidth of 6.5 terabytes per second (TB/s). A single TPU Pod interconnects 9,600 of these chips, which collectively share roughly 2 petabytes (PB) of memory and deliver 121 exaflops of compute at FP4 precision. The inference-optimized TPU 8i goes even further on the memory side: 288GB of HBM per chip, 8.6TB/s of memory bandwidth, and 384 megabytes (MB) of on-chip SRAM (triple that of its predecessor), alongside 10.1 petaflops of FP4 compute.

Dell’s CEO has sounded the alarm as well, predicting that memory requirements for AI accelerators will grow 625-fold between 2023 and 2028, with the supply-demand gap likely persisting until at least that year.
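For readers who want to sanity-check the pod-level figures, here is a minimal back-of-the-envelope sketch in Python that simply aggregates the per-chip numbers quoted above. The variable names and the decimal GB-to-PB conversion are assumptions of this illustration, not anything Google publishes.

```python
# Back-of-the-envelope check of the pod-level figures quoted in the article.
# Per-chip HBM, pod size, and pod compute are taken from the text above;
# the aggregation itself is plain arithmetic.

HBM_PER_CHIP_GB = 216        # training chip ("TPU 8t"), per the article
CHIPS_PER_POD = 9_600        # chips interconnected in one TPU Pod
POD_FP4_EXAFLOPS = 121       # aggregate pod compute at FP4 precision

# Aggregate pod memory, using decimal units (1 PB = 1,000,000 GB).
pod_memory_pb = HBM_PER_CHIP_GB * CHIPS_PER_POD / 1_000_000

# Compute implied per chip, derived from the pod-level figure.
per_chip_fp4_pflops = POD_FP4_EXAFLOPS * 1_000 / CHIPS_PER_POD

print(f"Aggregate pod HBM: {pod_memory_pb:.2f} PB")          # ~2.07 PB, i.e. the ~2 PB quoted
print(f"Implied FP4 per chip: {per_chip_fp4_pflops:.1f} PF")  # ~12.6 petaflops per training chip
```

The 216GB-per-chip and 9,600-chip figures multiply out to about 2.07 PB, which matches the roughly 2 PB of shared pod memory cited above.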