OpenAI designed its first chip in nine months — with help from its own AI. Broadcom manufactures it. TSMC fabricates it. It targets inference only, not training. Early testing shows 50% cost savings versus GPU. Deployment starts late 2026. NVIDIA barely moved. The chip industry just got more interesting.
OpenAI’s Jalapeño custom AI chip arrived on 24 June 2026 — and the speed of its development is the first remarkable thing about it. OpenAI President Greg Brockman told CNBC’s David Faber that the chips were designed from end to end in nine months with help from the company’s AI models. “The degree to which our models have been able to accelerate it was very surprising to us,” Brockman said. Broadcom and OpenAI unveiled the chip together on Wednesday morning. The chip is an application-specific integrated circuit (ASIC) built exclusively for inference — the process of running trained AI models in response to user prompts. It is not designed for training. Broadcom CEO Hock Tan said that early samples are showing cost savings of roughly 50% compared to typical AI GPUs. A physical chip sample reached OpenAI on 24 June. Deployment targets late 2026. The next generation targets 2028.
What’s Happening & Why It Matters
OpenAI’s $20.92 Billion Operating Loss
OpenAI’s Jalapeño custom AI chip is not an engineering project. It is a financial necessity. OpenAI generated $13.07 billion in revenue in 2025 but spent $34 billion, posting an operating loss of nearly $20.92 billion. Research and development costs driven by compute infrastructure consumed $19.18 billion, roughly 56% of total spending. Every ChatGPT response, every Codex suggestion, every API call runs on NVIDIA GPUs purchased at NVIDIA‘s prices. Furthermore, eight sources told Reuters that OpenAI has been unhappy with some of Nvidia’s newest chips and has looked for other options since last year. Seven sources said the company is concerned with how fast Nvidia’s hardware answers users on certain tasks, such as writing software code.
Jalapeño addresses that cost directly. An ASIC designed specifically for large language model inference — and nothing else — can outperform a general-purpose GPU on that specific workload. At the scale OpenAI operates, a 50% cost saving on inference compute changes the company’s unit economics fundamentally.
Strategic Inference-Only
OpenAI’s Jalapeño custom AI chip targets inference exclusively. That distinction requires explanation. Training is the process of building an AI model from scratch — computationally brutal, running for weeks or months on clusters of thousands of GPUs. Inference is what happens after: every time a user asks ChatGPT a question, the trained model generates an answer. Inference is OpenAI‘s highest-volume, most cost-sensitive workload. By contrast, training is on NVIDIA hardware. OpenAI’s $30 billion Nvidia investment from February 2026, along with an agreement to deploy 10 gigawatts of Nvidia’s Vera Rubin platform, keeps the GPU giant central to the company’s training pipeline.
Altman acknowledged the dynamic publicly. “We love working with NVIDIA and they make the best AI chips in the world. We hope to be a gigantic customer for a very long time.” Additionally, Broadcom CEO Hock Tan said “At the end of the day, you cannot, should not rely on some other third-party GPU to do it for you, because it’s such a key part.” Both statements are simultaneously true. Jalapeño reduces inference dependency on NVIDIA. It does not replace NVIDIA in training. The strategic intent is to compete on cost where OpenAI has the most leverage — without provoking a conflict it cannot win at the training layer.
The Jalapeño Architecture — ASIC, Tomahawk, TSMC, Celestica
OpenAI did the core design. Broadcom brought specific knowledge in connectivity and other areas, plus its Tomahawk switching silicon. TSMC, the world’s dominant chip manufacturer, is handling the actual fabrication. Canadian electronics manufacturer Celestica will build the server systems.
The ASIC architecture is both an advantage and a limitation. An application-specific integrated circuit is optimised for one task — in this case, LLM inference. It outperforms a GPU on that task because it wastes no transistors on general-purpose flexibility. By contrast, it cannot be reprogrammed for other workloads without a new chip design cycle. Broadcom CEO Hock Tan placed Jalapeño on par with Nvidia’s Blackwell line and Google’s tensor processing units — two of the most powerful AI accelerators currently in use. That is a vendor claim, not an independent benchmark. The independent validation arrives with the first production deployment later in 2026.

The Competition: Google, Amazon, Microsoft, Meta
OpenAI’s Jalapeño custom AI chip arrives late. OpenAI is late to a party its biggest rivals threw years ago. Google has its TPUs, Amazon its Trainium and Graviton lines, and Microsoft its Maia accelerators. Each pairs custom silicon with Nvidia chips rather than replacing them outright. Meta’s MTIA lineup powers its recommendation engines and generative AI features. Every hyperscaler with its own data centre infrastructure has already been through the custom silicon transition.
OpenAI is the anomaly — a frontier AI company that does not own its own data centres. Unlike Google or Amazon, OpenAI doesn’t own its data centers — which makes the silicon bet bolder, not safer. By contrast, Broadcom is at the centre of multiple hyperscaler chip programmes simultaneously — Google’s TPUs, Jalapeño, and others. Broadcom quietly became the kingmaker of the post-Nvidia chip scramble, supplying the connectivity and manufacturing muscle that the AI labs lack.
The IPO: Shipping Jalapeño Before the S-1
The announcement timing aligns precisely with OpenAI‘s IPO preparation. A company heading toward a public offering at approximately $1 trillion valuation — while posting $20.92 billion operating losses — needs to demonstrate a credible path to profitability. A chip that cuts inference costs by 50% is that path made tangible. Jalapeño provides OpenAI with the opportunity to match and offset the hyperscaler advantage. By baking its software architecture directly into a proprietary processor, OpenAI has the chance to replicate, at least in part, the playbook used by Google, Amazon, Microsoft, and Meta — transitioning from a captive cloud customer into a more independent AI infrastructure provider.
TF Summary: What’s Next
First commercial deployments of Jalapeño at Microsoft data centres and other partners are targeted for late 2026. Real volume arrives in 2027. OpenAI wants its custom chips powering 10 gigawatts of compute by 2029 — roughly the output of ten nuclear reactors. The next generation targets 2028, with annual releases thereafter. An independent technical validation of the 50% cost saving claim is expected to accompany the first production deployment announcement.
MY FORECAST: OpenAI’s Jalapeño custom AI chip will deliver on the 50% inference cost saving — in the specific workloads it was designed for. By contrast, it will not reduce OpenAI‘s overall compute costs by 50%. Inference is the high-volume, cost-sensitive workload. Training is the high-capex, capability-determining workload. Jalapeño gives OpenAI the same inference economics Google and Amazon have operated at for years. It does not give OpenAI an advantage over those companies — it removes a disadvantage. That is the correct first step. By 2028, when the second-generation chip arrives, OpenAI will know whether the ASIC bet is large enough to change the company’s fundamental economics — or whether it is one cost-reduction measure in a company that needs many more.
Related Stories
- OpenAI Files Confidential IPO — the Race to List Is On
- AI Stocks Sell Off as Markets Ask Whether the Spending Will Ever Pay Off
- Microsoft Build 2026: MAI Models, Project Solara, and the AI Badge

