Google tells employees it must double capacity every 6 months to meet AI demand
Source: Ars Technica
Google's AI infrastructure chief tells staff it needs a thousandfold capacity increase in 5 years.

While AI bubble talk fills the air these days, amid fears of overinvestment in a market that could pop at any time, something of a contradiction is brewing on the ground: Companies like Google and OpenAI can barely build infrastructure fast enough to meet their AI needs.

During an all-hands meeting earlier this month, Google’s AI infrastructure head Amin Vahdat told employees that the company must double its serving capacity every six months to meet demand for artificial intelligence services, reports CNBC. Vahdat, a vice president at Google Cloud, presented slides showing the company needs to scale “the next 1000x in 4-5 years.”
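Those two figures are consistent with each other: doubling every six months means ten doublings in five years, and 2^10 = 1,024, or roughly the “next 1000x” on Vahdat’s slide. Here is a minimal sketch of that compounding arithmetic (the function and printed figures are illustrative, not from the presentation):

```python
# Back-of-the-envelope check: capacity that doubles every six months
# compounds to 2^(years / 0.5) over a given horizon.
def capacity_multiple(years: float, doubling_period_years: float = 0.5) -> float:
    """Total growth factor after `years` of repeated doubling."""
    return 2 ** (years / doubling_period_years)

for years in (4, 4.5, 5):
    print(f"{years} years -> ~{capacity_multiple(years):,.0f}x capacity")

# Output:
#   4 years -> ~256x capacity
#   4.5 years -> ~512x capacity
#   5 years -> ~1,024x capacity  (the round "1000x" lands at the 5-year mark)
```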

While a thousandfold increase in compute capacity sounds ambitious by itself, Vahdat noted some key constraints: Google needs to be able to deliver this increase in capability, compute, and storage networking “for essentially the same cost and increasingly, the same power, the same energy level,” he told employees during the meeting. “It won’t be easy but through collaboration and co-design, we’re going to get there.”

It’s unclear how much of the “demand” Google cited represents organic user interest in AI capabilities versus the company integrating AI features into existing services like Search, Gmail, and Workspace. But whether users seek out those features voluntarily or not, Google isn’t the only tech company straining to serve a growing base of AI customers.

Major tech companies are in a race to build out data centers. Google competitor OpenAI plans to build six massive data centers across the US through its Stargate partnership with SoftBank and Oracle, committing over $400 billion in the next three years to reach nearly 7 gigawatts of capacity. The company faces similar constraints serving its 800 million weekly ChatGPT users, with even paid subscribers regularly hitting usage limits for features like video synthesis and simulated reasoning models.

“The competition in AI infrastructure is the most critical and also the most expensive part of the AI race,” Vahdat said at the meeting, according to CNBC, which viewed the presentation. The infrastructure executive explained that Google’s challenge goes beyond simply outspending competitors. “We’re going to spend a lot,” he said, but noted the real objective is building infrastructure that is “more reliable, more performant and more scalable than what’s available anywhere else.”

The thousandfold scaling challenge

One major bottleneck in meeting AI demand has been Nvidia’s inability to produce enough of the GPUs that accelerate AI computations. Just a few days ago, during its quarterly earnings report, Nvidia said its AI chips are “sold out” as it races to meet demand that grew its data center revenue by $10 billion in a single quarter.

The chip shortage, along with other infrastructure constraints, affects Google’s ability to deploy new AI features. During the all-hands meeting on November 6, Google CEO Sundar Pichai cited the example of Veo, Google’s video generation tool that received an upgrade last month. “When Veo launched, how exciting it was,” Pichai said. “If we could’ve given it to more people in the Gemini app, I think we would have gotten more users but we just couldn’t because we are at a compute constraint.”

At the same meeting, Vahdat’s presentation outlined how Google plans to achieve its massive scaling targets without simply throwing money at the problem. The company will rely on three main strategies: building physical infrastructure, developing more efficient AI models, and designing custom silicon.

Using its own chips means Google does not have to rely entirely on Nvidia hardware to build out its AI capabilities. Earlier this month, for example, Google announced the general availability of Ironwood, its seventh-generation Tensor Processing Unit (TPU), which the company claims is “nearly 30x more power efficient” than its first Cloud TPU from 2018.

Given widespread acknowledgment of a potential AI industry bubble, including extended remarks by Pichai in a recent BBC interview, the aggressive plans for AI data center expansion reflect Google’s calculation that the risk of underinvesting exceeds the risk of overcapacity. But it’s a bet that could prove costly if demand doesn’t continue to increase as expected.

At the all-hands meeting, Pichai told employees that 2026 will be “intense,” citing both AI competition and pressure to meet cloud and compute demand. Pichai directly addressed employee concerns about a potential AI bubble, acknowledging the topic has been “definitely in the zeitgeist.”