OpenAI's Custom Jalapeño Chip: 50% Inference Cost Cut, 9 Months from Design to Tapeout

Tech Trends | 2026-07-04 10:30:00

🤖 This article was generated by AI. Content is for informational purposes only.

OpenAI Built Its Own Chip: Going After NVIDIA, Cutting Inference Costs in Half

On June 24, OpenAI unveiled its first custom AI chip — Jalapeño.

This is not a PowerPoint announcement. It is actual silicon that has already taped out and is running GPT-5.3 in the lab. Design to tapeout took nine months. Broadcom handled silicon implementation, and TSMC manufactured the chip on a 3nm process.

This chip may be more significant than GPT-5.6.

What Is Jalapeño?

In short: an ASIC purpose-built for LLM inference. Unlike NVIDIA's general-purpose GPUs, it is optimized specifically for that workload: reducing data movement, balancing compute and memory, and pushing real-world utilization closer to theoretical peak.

Broadcom CEO Hock Tan revealed that early testing shows roughly 50% inference cost savings compared to current mainstream GPUs.

For OpenAI, which processes hundreds of millions of API calls daily, this is existential. Halving inference costs means either much wider margins at today's prices or room to slash prices and grab market share. Both are terrible news for competitors.
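To make the margin-versus-price trade-off concrete, here is a minimal arithmetic sketch. The price and cost per million tokens are hypothetical placeholders, not OpenAI figures; only the roughly 50% saving comes from the article.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of the selling price."""
    return (price - cost) / price


def price_for_margin(cost: float, margin: float) -> float:
    """Selling price that yields the given gross margin at the given unit cost."""
    return cost / (1.0 - margin)


# Hypothetical unit economics (assumptions for illustration only).
price = 10.0   # revenue per million tokens
cost = 6.0     # inference cost per million tokens on current GPUs
saving = 0.5   # ~50% inference cost saving cited in the article

new_cost = cost * (1.0 - saving)

# Option 1: keep the price and pocket the saving as margin.
margin_before = gross_margin(price, cost)
margin_after = gross_margin(price, new_cost)

# Option 2: keep the old margin percentage and pass the saving on as a price cut.
matched_price = price_for_margin(new_cost, margin_before)

print(f"margin at same price: {margin_before:.0%} -> {margin_after:.0%}")
print(f"price at same margin: {price:.2f} -> {matched_price:.2f}")
```

Under these assumed numbers, keeping the price lifts the margin without literally doubling it, while holding the margin percentage constant allows the price to fall by the same fraction as the cost. How large the margin gain is depends entirely on how big inference cost is relative to price to begin with.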

How Did They Build It in 9 Months?

A typical ASIC design takes 1.5 to 2 years. Jalapeño took 9 months, roughly half that or less. OpenAI's hardware chief Richard Ho credits deep software-hardware co-development: OpenAI used its frontier models to assist with architecture exploration and power simulation, while Broadcom handled silicon implementation.

In plain terms: "AI helping humans build AI chips." Greg Brockman's words: "We're using the frontier models that serve users to optimize the infrastructure that will run future models."

An interesting detail: the codename comes from a Mexican chili pepper. OpenAI skipped the grandiose naming and went straight to the kitchen. Maybe they wanted to say — we build chips as fast as we cook.

What This Means for the Industry

One, OpenAI has officially joined the "de-NVIDIA" camp. Google has TPUs, Microsoft has Maia, Amazon has Trainium, now OpenAI has its own silicon. Full-stack competition — models + chips + systems + networking — is the new normal.

But OpenAI isn't abandoning NVIDIA. Brockman put it bluntly: "We simply can't get enough compute fast enough." Jalapeño is a structural supplement to meet explosive compute demand, not a replacement. In the short term, NVIDIA's position won't budge. In the long run, though, its largest customers are shifting part of their demand to in-house silicon.

Two, chip delivery and deployment are the real test. Jalapeño targets initial deployment by the end of 2026, with Microsoft and other partners deploying it in gigawatt-scale data centers. But the road from engineering samples to stable mass production could be bumpy.

Three, Broadcom is the stealth winner. Jalapeño uses Broadcom's Tomahawk networking chips and high-speed interconnects, so Broadcom's revenue from them scales with every Jalapeño deployment. Broadcom has quietly become the key arms dealer in the AI chip war.

If you're an AI application developer, pay attention to this trend: declining chip costs make it likely that API prices will keep dropping, and cheaper inference opens up more use cases.