Google Makes Ironwood TPU Generally Available and Splits TPU 8 Into Training and Inference Chips (April 2026)
At Google Cloud Next on April 22, 2026, Google made its seventh-generation Ironwood TPU generally available and previewed an eighth-generation architecture split into a Broadcom-designed training chip (TPU 8t "Sunfish") and a MediaTek-designed inference chip (TPU 8i "Zebrafish"). Anthropic will take up to one million TPU chips as part of the rollout.
Google made its seventh-generation Ironwood TPU generally available to Cloud customers and, for the first time, previewed an eighth-generation architecture split into two purpose-built chips: the Broadcom-designed TPU 8t "Sunfish" for training and the MediaTek-designed TPU 8i "Zebrafish" for inference. The announcements, made at Google Cloud Next in Las Vegas, represent Google's most aggressive push yet to close the AI-silicon gap with Nvidia.
What Happened
In the opening keynote, CEO Sundar Pichai said "the pace of technological change since last year's Cloud Next has never been faster" and framed Ironwood as the first Google TPU "for the age of inference." The chip, first previewed in November 2025, delivers 4.6 petaFLOPS per chip and scales to 42.5 exaFLOPS in a 9,216-chip superpod, with 1.77 petabytes of shared high-bandwidth memory linked by a 9.6 Tb/s Inter-Chip Interconnect. Amin Vahdat, Google's SVP for AI and Infrastructure, told reporters Ironwood is 4× faster per chip than Trillium (TPU v6e) for both training and inference.
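The headline figures hang together; a quick back-of-envelope check, using only the numbers quoted above, confirms the per-chip and pod-level claims are consistent:

```python
# Sanity-check the Ironwood pod-level figures from the announcement.
PFLOPS_PER_CHIP = 4.6   # petaFLOPS per Ironwood chip
CHIPS_PER_POD = 9_216   # chips in a superpod
POD_HBM_PB = 1.77       # petabytes of shared HBM across the pod

# Aggregate compute: 9,216 chips x 4.6 PFLOPS ~= 42.4 exaFLOPS,
# matching the quoted 42.5 exaFLOPS after rounding.
pod_exaflops = PFLOPS_PER_CHIP * CHIPS_PER_POD / 1_000
print(f"Pod compute: {pod_exaflops:.1f} exaFLOPS")

# Implied HBM per chip: 1.77 PB / 9,216 chips ~= 192 GB per chip.
hbm_gb_per_chip = POD_HBM_PB * 1_000_000 / CHIPS_PER_POD
print(f"HBM per chip: {hbm_gb_per_chip:.0f} GB")
```

The implied ~192 GB of HBM per chip is an arithmetic inference from the pod total, not a figure Google stated directly.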
The bigger news was the eighth-generation preview. Rather than continue its long-running "one chip for both jobs" philosophy, Google is splitting the lineup: TPU 8t (Sunfish) is a Broadcom-designed training chip with two compute dies, one I/O chiplet and eight stacks of 12-high HBM3e memory — roughly 30% more memory bandwidth than Ironwood — while TPU 8i (Zebrafish) is a MediaTek-designed single-die inference chip with six HBM3e stacks, targeting a 20–30% lower unit cost. Both use TSMC's 2-nanometre process and both are slated for late 2027.
Key Details
- Anthropic commits to 1 million TPUs — the Claude maker will take up to one million TPU chips and more than a gigawatt of capacity in 2026, with the first phase covering 400,000 Ironwood units worth an estimated $10 billion in finished racks from Broadcom. The deal extends into 2027 with 3.5 gigawatts of additional compute.
- Broadcom's AI contract extends through 2031 — Broadcom designs Ironwood and TPU 8t under an agreement now worth roughly $46 billion over its lifetime. MediaTek, new to the TPU program, handles TPU 8i and has requested a sevenfold increase in TSMC CoWoS packaging capacity to meet the inference ramp.
- Performance claims vs. Nvidia — Google said the training-focused TPU 8t delivers 2.8× better price/performance than Ironwood at 121 exaFLOPS per pod, and that the inference-tuned TPU 8i offers 80% better performance per dollar. Internal benchmarks also put Ironwood at a 2.8× energy-efficiency advantage over Nvidia's H100.
- Launch customers beyond Anthropic — Lightricks, Essential AI and Salesforce were all cited as Ironwood early adopters. James Bradbury, Anthropic's head of compute, said in the Google Cloud blog post that "Ironwood's improvements in both inference performance and training scalability will help us scale efficiently while maintaining the speed and reliability our customers expect."
- Gemini Enterprise Agent Platform — Google also used the keynote to launch a platform for building, scaling and governing AI agents, and disclosed that 75% of new code at Google is now AI-generated, up from 50% in late 2025.
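Notably, the quoted TPU 8t pod figure of 121 exaFLOPS is almost exactly 2.8× Ironwood's 42.5 exaFLOPS, which suggests the price/performance multiple is being quoted at roughly constant pod cost — an inference on our part, not something Google stated:

```python
# The 2.8x price/performance claim for TPU 8t tracks the raw
# pod-level compute ratio between the two generations.
IRONWOOD_POD_EXAFLOPS = 42.5
TPU8T_POD_EXAFLOPS = 121

ratio = TPU8T_POD_EXAFLOPS / IRONWOOD_POD_EXAFLOPS
print(f"TPU 8t / Ironwood pod compute: {ratio:.2f}x")  # ~2.85x
```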
What Developers and Users Are Saying
On Hacker News, the top thread about the announcement split between excitement at Google finally productising the inference gap ("splitting training and inference is the right call — Nvidia is still selling H200s for both") and scepticism about Google's late-2027 TPU 8 timing ("by then Blackwell Ultra and Rubin are in volume — this is a preview, not a ship date"). On r/MachineLearning, commenters highlighted the Anthropic lock-in as the most significant piece of the news: a 1-million-TPU commitment effectively removes Anthropic from the open GPU market for a multi-year window. On X, semiconductor analyst Dylan Patel called the MediaTek partnership "the most interesting part of the announcement" and predicted MediaTek would pick up more Google cost-sensitive work over time.
What This Means for Developers
For teams training or serving models on Google Cloud, Ironwood is available today via Compute Engine, GKE and Vertex AI. Pricing shifts reflect the inference-first positioning — Google is leaning on Ironwood as the default choice for serving LLMs, with the new GKE Inference Gateway claiming up to 96% reduction in time-to-first-token latency and 30% cost savings versus x86 baselines. If you are currently running inference on A100s or H100s on AWS, the price/performance pitch is now hard to ignore — particularly for high-throughput serving workloads. The eighth-generation preview is mostly a roadmap signal: developers should expect Google to widen the per-chip cost advantage on inference as Zebrafish ramps in 2027.
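One rough way to evaluate the claimed savings against your own workload is to apply the quoted 30% cost reduction and 96% time-to-first-token improvement to your current baseline. The sketch below uses placeholder baseline numbers — substitute your own measurements; nothing here comes from Google's pricing:

```python
# Hypothetical baseline serving metrics (placeholders, not Google figures).
baseline_cost_per_m_tokens = 0.80   # USD per million tokens served
baseline_ttft_ms = 500.0            # time-to-first-token, milliseconds

# Improvements claimed for the GKE Inference Gateway in the keynote.
COST_SAVINGS = 0.30      # "30% cost savings"
TTFT_REDUCTION = 0.96    # "up to 96% reduction in time-to-first-token"

est_cost = baseline_cost_per_m_tokens * (1 - COST_SAVINGS)
est_ttft = baseline_ttft_ms * (1 - TTFT_REDUCTION)
print(f"Estimated cost: ${est_cost:.2f}/M tokens")   # $0.56/M tokens
print(f"Estimated TTFT: {est_ttft:.0f} ms")          # 20 ms
```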
What's Next
Ironwood is live on Google Cloud today. The TPU 8t and TPU 8i chips are slated to enter production in late 2027 on TSMC's 2nm node. Anthropic's first 400,000 Ironwood units are already being racked at Google's Council Bluffs, Iowa and Columbus, Ohio sites. Google also said it will expand its AI Hypercomputer program — bundling Ironwood, Axion CPUs and the Jupiter network — to additional regions through Q3. Google's eighth-generation split is the signal to watch: if Sunfish and Zebrafish hit their targets, it will be the first credible two-chip counter to Nvidia's one-chip-for-everything strategy.
Sources
- Sundar Pichai — Cloud Next 2026 keynote recap (blog.google) — primary source, official Google announcement
- Ironwood TPUs and new Axion-based VMs (Google Cloud Blog) — technical specs, performance numbers, launch customers
- Google launches Ironwood TPU and previews eighth-gen split (The Next Web) — details on Sunfish/Zebrafish architecture and Anthropic commitment
- Google splits TPU 8 to chase Nvidia on inference cost (implicator.ai) — economic analysis of the split
- Google unveils new TPUs (SiliconANGLE) — keynote coverage
- Broadcom vs MediaTek TPU 8 duties (WCCFTech) — supply-chain and packaging analysis