Two stories from the past few weeks reveal an AI ecosystem at war with itself. The chips are winning. The economics might not.
The Setup: What Just Happened
On Christmas Eve 2025, while most of the world switched off, Nvidia did the opposite.
Multiple reports say Nvidia agreed to a deal worth around $20 billion to license technology from Groq, an AI chip startup, and hire its founder Jonathan Ross along with president Sunny Madra. Not acquire — license. On paper, Groq remains an independent company under a new CEO. In practice, Nvidia now controls its most interesting ideas and its key people.
At almost the same time, a different kind of shock hit the AI world.
Ilya Sutskever, OpenAI’s co-founder and former chief scientist, declared that “pre-training as we know it will unquestionably end” because we are running out of high-quality data. Geoffrey Hinton and Yann LeCun, two of the three “Godfathers of AI,” have become increasingly blunt: simply scaling large language models (LLMs) on internet text is a dead end.
Taken together, these stories point in the same direction:
The industry is over-rotated on a single bet (infinite LLM scaling), over-exposed to a single vendor (Nvidia), and under-prepared for what happens if both assumptions fail.
Nvidia’s Groq deal is about securing the next phase of AI hardware. The scaling debate is about whether there will be software and business models worth running on it.
Nvidia’s $20 Billion Non-Acquisition
What Nvidia Actually Bought
Based on current reporting, Nvidia effectively acquired three things from Groq:
- A non-exclusive license to Groq’s LPU (Language Processing Unit) designs
- The founder and senior leadership, who are moving to Nvidia
- Time — the ability to integrate inference-optimized ideas into Nvidia’s roadmap faster than if it built them in-house
What Nvidia did not buy is Groq Inc. itself. Formally, Groq remains an independent entity with a new CEO.
Why structure the deal this way?
Nvidia already dominates the AI accelerator market. A full acquisition of a high-profile challenger would have been a flashing red light for antitrust regulators in the U.S. and Europe. A licensing deal plus key hires gives Nvidia what it wants — IP and talent — while keeping Groq technically alive.
From a regulatory perspective, it maintains the appearance of competition. From a strategic perspective, it pulls a differentiated architecture inside the Nvidia orbit before it can mature into something truly threatening.
Why Groq Was Different
Most AI accelerators — including Nvidia’s — are descendants of GPU architectures originally designed for graphics and later adapted for training large neural networks.
Groq took a different path: a deterministic, inference-first architecture.
A few of the design ideas, simplified:
- Inference-optimized: Built to run trained models at predictable, low latency rather than maximize training throughput
- Massive on-chip bandwidth: Heavy use of on-chip SRAM with extremely high bandwidth, reducing dependence on slower external memory
- Deterministic execution: The same latency, every time — critical for real-time systems such as trading, autonomous control, and interactive agents
Groq was not about taking Nvidia’s training crown. It was about owning real-time AI inference just as inference economics are starting to matter more than training.
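To see why determinism matters, consider tail latency. The sketch below uses made-up numbers, not Groq or Nvidia measurements; it simply shows that two accelerators with the same average latency can demand very different real-time budgets:

```python
import random

def simulate_latencies(mean_ms: float, jitter_ms: float, n: int = 10_000) -> list[float]:
    """Toy model of per-request latency with random jitter (illustrative only)."""
    return [max(0.0, random.gauss(mean_ms, jitter_ms)) for _ in range(n)]

def percentile(samples: list[float], p: float) -> float:
    ordered = sorted(samples)
    return ordered[int(p / 100 * (len(ordered) - 1))]

# Two hypothetical accelerators with the same average latency.
steady = simulate_latencies(mean_ms=20.0, jitter_ms=0.5)   # tight spread
jittery = simulate_latencies(mean_ms=20.0, jitter_ms=8.0)  # wide spread from queuing and contention

for name, samples in [("steady", steady), ("jittery", jittery)]:
    print(f"{name:8s} p50={percentile(samples, 50):5.1f} ms   p99={percentile(samples, 99):5.1f} ms")
# A real-time system has to budget for p99, not the mean. The wide
# distribution forces a much larger latency budget for the same average,
# which is why predictable execution is valuable for interactive agents.
```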
That is the context for the reported $20B price tag. Groq’s last private valuation was around $6.9B. Nvidia is effectively paying roughly three times that valuation to neutralize a differentiated competitor and fold its approach into the Nvidia stack.
It is not a classic acquisition. It is a pre-emptive absorption.
The Inference War Has Started
Public discussion of AI hardware tends to fixate on training: the eye-catching “it cost $100M to train this model” numbers.
But the unit economics of AI are dominated by inference:
- Training a frontier model is expensive, but it happens only a handful of times
- Running that model for millions of users is recurring and scales with usage
Estimates of OpenAI’s daily inference spend already run into the hundreds of thousands of dollars, and those figures predate the latest wave of products. As usage grows, inference will account for the majority of compute spend.
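A back-of-envelope model makes the imbalance concrete. Every number below is an illustrative assumption rather than a reported figure:

```python
# Illustrative unit-economics sketch: every number here is an assumption.
training_cost = 100e6            # one-time cost to train a frontier model, USD
cost_per_1k_tokens = 0.002       # assumed blended inference cost, USD
tokens_per_request = 1_000
requests_per_day = 50_000_000    # assumed usage at scale

daily_inference = requests_per_day * (tokens_per_request / 1_000) * cost_per_1k_tokens
annual_inference = daily_inference * 365

print(f"Daily inference spend:  ${daily_inference:,.0f}")
print(f"Annual inference spend: ${annual_inference:,.0f}")
print(f"Years of inference equal to one training run: {training_cost / annual_inference:.1f}")
# Under these assumptions the recurring inference bill matches the one-time
# training bill in under three years, and it grows with every new user.
```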
That is why inference-optimized hardware matters. It is also why every major cloud provider is building its own chips to reduce reliance on Nvidia.
How the Hyperscalers Are Responding
A high-level snapshot of the custom silicon landscape:
- Google — TPU v7 “Ironwood”: Google’s latest TPU generation advertises performance competitive with, and in some workloads slightly ahead of, Nvidia’s B-series GPUs. Anthropic has committed to using up to one million TPUs in a multi-year deal described as being worth “tens of billions of dollars.” Google also runs its own models at scale on TPUs, which creates a tight feedback loop between hardware and software.
- Amazon — Trainium 2 and 3: Trainium 2 is generally available; Trainium 3, built on a 3nm process, promises significant performance and efficiency gains. AWS claims its in-house accelerators already underpin a multibillion-dollar business. How much of that is displacement of Nvidia vs. incremental demand remains an open question.
- Microsoft — Maia: Maia 200 has reportedly slipped to 2026, with some earlier plans scaled back or reworked. Public reporting suggests friction in pinning down requirements as OpenAI’s needs evolved.
- Startups — Cerebras, SambaNova, Tenstorrent, others: Architecturally interesting, but facing long sales cycles, capital constraints, and the gravitational pull of Nvidia’s mature ecosystem (CUDA, cuDNN, libraries, tooling).
The consistent pattern: escaping Nvidia’s orbit quickly is extremely hard. The software ecosystem, tooling, and accumulated developer familiarity are formidable moats.
Nvidia understands this. The Groq transaction is best seen as a move to close a high-quality “escape hatch” before customers can migrate meaningful inference workloads through it.
The Civil War Over Scaling
Underneath all of this is a deeper conflict: does simply scaling up LLMs continue to work, or are we approaching structural limits?
There are three underlying questions:
- Do we have enough high-quality training data left to keep scaling in the same way?
- Do we still get economically meaningful returns from more compute?
- Can synthetic data (AI training on AI-generated content) safely replace human data at scale?
Different leaders are giving starkly different answers.
The “Scaling Is Hitting Limits” View
- Ilya Sutskever (co-founder and former chief scientist, OpenAI) has said: “Pre-training as we know it will unquestionably end. We have but one internet. The data is not growing.”
- Yann LeCun (long-time Meta Chief AI Scientist, now pursuing alternative architectures) describes large language models as a “dead end” for building truly intelligent systems. They can manipulate language but do not acquire robust models of the physical world from text alone.
- Geoffrey Hinton (Nobel laureate, former Google) has stressed the need for other learning paradigms — self-play, world models, richer sensory input — and assigns a non-trivial chance (10–20%) that advanced AI systems could lead to catastrophic outcomes if misaligned.
Empirical work supports the data-constraint side of this argument. Forecasts by groups like Epoch AI suggest that high-quality, human-generated text suitable for pre-training will be effectively exhausted between 2026 and 2032, depending on what one counts as “usable.”
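A rough projection shows why those forecasts cluster in a fairly narrow window despite big uncertainties. The token stock, training-run size, and growth rate below are illustrative assumptions, not Epoch AI’s published figures:

```python
import math

# Illustrative assumptions, not Epoch AI's figures.
usable_token_stock = 300e12    # assumed stock of high-quality human text, in tokens
frontier_run_tokens = 15e12    # assumed tokens consumed by a recent frontier training run
annual_dataset_growth = 2.5    # assumed year-over-year growth in training-set size

years_left = math.log(usable_token_stock / frontier_run_tokens, annual_dataset_growth)
print(f"Years until training sets reach the usable stock: {years_left:.1f}")
# Roughly three years under these assumptions. Changing either the stock or
# the growth rate by 2-3x moves the date by only a year or two, which is why
# forecasts land within a fairly narrow window despite large uncertainty.
```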
Synthetic data helps, but introduces “model collapse” risks as systems train on their own outputs and propagate their own errors.
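The mechanism is easy to demonstrate in miniature: repeatedly fit a simple model to data generated by the previous generation of itself and watch the distribution narrow and drift. In the toy sketch below, a Gaussian stands in for a full generative model, and the tiny per-generation sample size exaggerates the effect:

```python
import random
import statistics

def fit_and_resample(samples: list[float], n: int) -> list[float]:
    """Fit a Gaussian to the samples, then generate a new dataset from that fit."""
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    return [random.gauss(mu, sigma) for _ in range(n)]

# Generation 0: "human" data drawn from the true distribution N(0, 1).
data = [random.gauss(0.0, 1.0) for _ in range(20)]

for generation in range(1, 201):
    # Each generation trains only on the previous generation's output.
    data = fit_and_resample(data, n=20)
    if generation % 25 == 0:
        print(f"gen {generation:3d}: mean={statistics.mean(data):+.2f}, "
              f"std={statistics.stdev(data):.3f}")
# The spread tends to shrink and the mean drifts as estimation errors
# compound: the tails disappear first, which is the core of the
# model-collapse argument against naive synthetic-data pipelines.
```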
The “There Is No Wall” View
- Sam Altman (CEO, OpenAI) has argued that “there is no wall,” and that spending more money continues to yield predictable capability gains.
- Dario Amodei (CEO, Anthropic) is similarly bullish, arguing that synthetic data pipelines can generate effectively unlimited training data and that AI could compress multiple decades of scientific progress into a single decade.
- Demis Hassabis (CEO, Google DeepMind) advocates pushing scaling “to the maximum,” but also acknowledges that current systems lack robust reasoning, planning, and long-term memory, and will need architectural innovation on top of raw scale.
In this view, the scaling curve may bend but has not broken. You keep pushing model size, training time, and data volume, and new capabilities continue to emerge.
What the Evidence Suggests So Far
The emerging picture is mixed:
- Scaling still tends to improve performance, but with more sharply diminishing returns than in earlier waves (a stylized formula follows this list)
- Data constraints are real at frontier scale
- Synthetic data can help, but only with careful quality control and feedback mechanisms
- Safety, reliability, and robustness do not automatically improve with size
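One common way to make “diminishing returns” concrete is a Chinchilla-style scaling law, in which loss falls as a power law in both parameters and data (the functional form is an established empirical fit; the constants mentioned below are approximate):

```latex
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Here N is parameter count, D is training tokens, E is the irreducible loss, and A, B, α, β are fitted constants (the published Chinchilla fit puts α and β around 0.3). Because both terms are power laws, each additional 10x of compute removes a shrinking slice of the remaining loss, and once D is capped by the available high-quality data, only the N term is left to push on.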
If the “no wall” thesis is overstated, the implications for hardware demand, capex, and valuations are substantial.
The $500 Billion Capex Problem
Big Tech’s capital expenditure numbers are in their own category now.
Across 2025–2027, major hyperscalers (Amazon, Microsoft, Alphabet, Meta, and others) are projected to spend in the neighborhood of $1.15 trillion on AI-driven infrastructure. Year-by-year figures differ by source, but the direction is consistent: spending is accelerating sharply.
Indicative ranges based on current guidance and analyst estimates:
- Amazon: roughly $120–130 billion in 2025 capex, up around 60% year-over-year
- Microsoft: roughly $80 billion, up about 50%
- Alphabet: roughly $75–90 billion, up over 40%
- Meta: around $70 billion, up close to 80–90%
Several analyses, including work from Bain & Company, estimate that to justify this build-out, the ecosystem will need on the order of $2 trillion in annual AI-related revenue by 2030.
For comparison, the combined 2024 revenue of Amazon, Apple, Alphabet, Microsoft, Meta, and Nvidia was below that number.
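As a sanity check on that comparison, the rough arithmetic below sums approximate, rounded full-year revenues from public filings (nearest fiscal year to 2024); treat the individual figures as approximations:

```python
# Approximate, rounded full-year revenues in USD billions (nearest fiscal year to 2024).
revenue = {
    "Amazon": 638,
    "Apple": 391,
    "Alphabet": 350,
    "Microsoft": 245,
    "Meta": 165,
    "Nvidia": 130,   # fiscal year ending January 2025
}

total = sum(revenue.values())
print(f"Combined revenue: ~${total / 1000:.2f} trillion")
print(f"Gap to a $2 trillion AI-revenue target: ~${2000 - total} billion")
# Even the combined top line of the six biggest tech companies falls short of
# the AI-specific annual revenue the build-out is estimated to require by 2030.
```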
The ROI Gap
The most worrying datapoint is not on the spending side but on the return side.
An MIT study in 2025 found that roughly 95% of surveyed organizations reported no measurable ROI from their generative AI deployments, despite collectively investing tens of billions of dollars in pilots, proofs-of-concept, and tooling.
At the same time, insiders are not exactly complacent:
- Sam Altman has acknowledged that investors are likely “overexcited” about AI right now.
- Bret Taylor, OpenAI’s board chair, has said it is both true that AI will transform the economy and that we are in a bubble where “a lot of people will lose a lot of money.”
The risk is not that AI fails to matter. The risk is that the timing and distribution of returns do not match the capital being deployed in this cycle.
Circular Demand
A further complication is the circular nature of some spending flows:
- Nvidia invests in AI labs and startups
- Those labs use that capital to buy Nvidia GPUs
- Nvidia books the revenue and reinforces its growth narrative
- Higher valuations make it easier to invest in more customers
Nothing about this structure is inherently improper. But it does blur the line between true end-customer demand and ecosystem-financed demand. When credit conditions tighten or growth expectations reset, circular flows can unwind very quickly.
The US–China Compute Divide
All of this plays out inside a geopolitical race that is increasingly focused on compute capacity.
A 5:1 Compute Advantage — For Now
Export controls on high-end AI accelerators give the U.S. and its allies an estimated five-to-one advantage in frontier compute over China, at least through the mid-2020s. Some analyses put the U.S. share of advanced AI compute around 70–75%, with China at roughly 14–15%.
Huawei’s Ascend line is gradually improving:
- The Ascend 910C is estimated to reach a significant fraction of Nvidia H100 performance on some workloads
- Memory configurations now include 128GB HBM3
- Yield issues appear to be improving, though domestic fabrication still trails leading-edge nodes
China is responding with policy rather than parity:
- State-funded data centers have been instructed to phase out foreign AI accelerators
- Some cities and provinces have introduced local content requirements (e.g., 50–70% domestic chips)
- Beijing continues to pour capital into domestic GPU and accelerator efforts
Nvidia’s data center share in China has already fallen from near-total dominance to something closer to half, driven by both export controls and local substitution.
Rare Earths and Materials Risk
In late 2025, China imposed export licensing requirements on products containing Chinese rare earths, explicitly including advanced semiconductors. China currently dominates both production and processing of these materials.
The immediate consequences:
- Stockpiling and hedging by equipment makers and chip designers
- Price spikes in several critical inputs
- Renewed calls in the U.S., EU, Japan, and elsewhere to diversify supply chains
This is not just macro background noise. It directly affects whether the capex being poured into AI infrastructure runs into physical and political constraints that optimistic adoption models do not fully capture.
If your AI strategy assumes cheap, abundant, and geopolitically neutral compute, that assumption deserves scrutiny.
Three Ways This Could Play Out
These are rough probabilities, not precise forecasts, but they are a useful way to think about the range of outcomes.
1. Bull Case — The Virtuous Cycle Holds (≈ 40%)
- Scaling continues to deliver meaningful capability gains
- Synthetic data and architectural improvements offset data and performance limits
- Inference efficiency improves enough to make economics work at scale
- “Must-have” AI applications emerge that justify the infrastructure build-out
Winners: Nvidia, the hyperscalers, and a small number of AI-native companies with real moats and clear business models.
2. Base Case — The Long Plateau (≈ 45%)
- Scaling yields diminishing but still positive returns
- Many enterprises struggle to move from pilot to production and to quantify ROI
- Capex continues for a time, then is moderated as CFOs push for discipline
- AI proves transformational in some verticals, incremental in many others
- Revenue eventually catches up, but with a multi-year lag
Winners: diversified players that can absorb a messy middle period and companies that design for efficiency and domain fit rather than pure hype.
3. Bear Case — The Great Correction (≈ 15%)
- Scaling walls prove more severe than the optimists expected
- Synthetic data and new architectures do not change the trajectory quickly enough
- Capex cuts hit GPU orders hard; Nvidia’s growth narrative breaks
- Large write-downs cascade through balance sheets across the stack
- AI re-rates from “new electricity” to an over-built utility, at least for a cycle
Winners: patient capital, value investors, and whoever owns the right assets when valuations reset.
What This Means in Practice
For Builders
- Assume inference won’t be cheap by default. Design pricing and business models that work under conservative compute-cost assumptions.
- Avoid hard lock-in where you can’t afford it. Even if you standardize on Nvidia today, keep an eye on portability at the software layer.
- Think beyond “call the API.” Retrieval, distillation, compression, and task-specific models are as important as access to a frontier endpoint.
For Investors
- Separate GPU-driven growth from genuine product-market fit. “We raised a round to buy more GPUs” is not a business model on its own.
- Look at realized revenue and retention, not just headline partnerships. Multi-year “up to” commits are optionality, not cash.
- Watch for circular demand. Follow the money all the way to the end customer.
For Engineers
- Lean into efficiency. Skills in quantization, pruning, distillation, caching, and retrieval-augmented design will age well (a minimal quantization sketch follows this list).
- Expect platform volatility. Hardware, runtimes, and model providers will keep shifting. Build abstractions, but build them carefully.
- Stay grounded in fundamentals. Systems, networks, data infrastructure, and distributed computing will matter as much as model tinkering.
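To make the efficiency point concrete, here is a minimal sketch of the simplest of those techniques, symmetric per-tensor int8 weight quantization. It is illustrative only; production stacks typically use per-channel scales, calibration data, and formats such as GPTQ or AWQ:

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization: store int8 values plus one float scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# A fake weight matrix standing in for one layer of a model.
w = np.random.randn(4096, 4096).astype(np.float32)
q, scale = quantize_int8(w)

error = np.abs(w - dequantize(q, scale)).mean()
print(f"Memory: {w.nbytes / 1e6:.0f} MB -> {q.nbytes / 1e6:.0f} MB")
print(f"Mean absolute rounding error: {error:.5f}")
# 4x smaller weights, and correspondingly less memory bandwidth per generated
# token, for a small and usually tolerable loss in precision.
```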
Closing Thoughts
The AI industry is, at the same time:
- Building what may become the most important technology platform of this century
- Potentially recreating the largest over-investment cycle since the dot-com era
- Engaged in a U.S.–China compute race with no clear endpoint
- Arguing internally about whether its core scaling thesis is even correct
Nvidia’s Groq move is a defensive masterstroke. It pulls a differentiated inference architecture into Nvidia’s sphere without triggering an obvious antitrust chokepoint.
The scaling debate is not academic. If Sutskever, LeCun, and Hinton are closer to right than wrong, a meaningful share of today’s capex cycle is mispriced.
The U.S.–China split is accelerating. By 2030, we may be looking at two partially incompatible AI ecosystems with different hardware, standards, and regulatory regimes.
The $2 trillion question — whether AI revenues can justify the infrastructure being built — remains unanswered.
The next two to three years will determine whether this period looks, in hindsight, more like the early internet (massive over-building that eventually paid off) or the pure-bubble end of 1999.
The gap between those two outcomes is measured in trillions of dollars.
This analysis is independent research synthesizing public financial filings, analyst reports, and verified news sources. It is not financial advice.
References & Sources
Nvidia–Groq Deal
- CNBC — “Nvidia buying AI chip startup Groq’s assets for about $20 billion in its largest deal on record” https://www.cnbc.com/2025/12/24/nvidia-buying-ai-chip-startup-groq-for-about-20-billion-biggest-deal.html
- TechCrunch — “Nvidia to license AI chip challenger Groq’s tech and hire its CEO” https://techcrunch.com/2025/12/24/nvidia-acquires-ai-chip-challenger-groq-for-20b-report-says/
- Yahoo Finance — “Breaking down Nvidia’s unusual $20 billion deal with Groq” https://finance.yahoo.com/news/nvidia-acquire-groq-20-billion-214927907.html
- SiliconANGLE — “Nvidia to license technology from inference chip startup Groq in reported $20B deal” https://siliconangle.com/2025/12/24/nvidia-license-technology-inference-chip-startup-groq-reported-20b-deal/
Scaling Debate & AI Leaders
- Sam Altman (X/Twitter) — “there is no wall” https://x.com/sama/status/1856941766915641580
- Business Insider — “Sam Altman says ‘there is no wall’ in an apparent response to fears of an AI slowdown” https://www.businessinsider.com/sam-altman-there-is-no-wall-ai-slowdown-2024-11
- DeepNewz — “Ilya Sutskever Declares ‘Pre-Training as We Know It Will End’ at NeurIPS 2024” https://deepnewz.com/ai-modeling/ilya-sutskever-declares-pre-training-we-know-end-neurips-2024-citing-peak-data-18418711
- 36Kr — “Turing Award Winner Yann LeCun: Large Models a ‘Dead End’” https://eu.36kr.com/en/p/3571987975018880
- The Decoder — “The case against predicting tokens to build AGI” https://the-decoder.com/the-case-against-predicting-tokens-to-build-agi/
- The Information Bottleneck Podcast — EP20: Yann LeCun https://www.the-information-bottleneck.com/ep20-yann-lecun/
- Wikipedia — Geoffrey Hinton https://en.wikipedia.org/wiki/Geoffrey_Hinton
Hyperscaler Capex & AI Investment
- Goldman Sachs — “Why AI Companies May Invest More than $500 Billion in 2026” https://www.goldmansachs.com/insights/articles/why-ai-companies-may-invest-more-than-500-billion-in-2026
- CreditSights — “Technology: Hyperscaler Capex 2026 Estimates” https://know.creditsights.com/insights/technology-hyperscaler-capex-2026-estimates/
- Invezz — “Looking ahead to 2026: why hyperscalers can’t slow spending without losing the AI war” https://invezz.com/news/2025/12/26/looking-ahead-to-2026-why-hyperscalers-cant-slow-spending-without-losing-the-ai-war/
- IO Fund — “Big Tech’s $405B Bet: Why AI Stocks Are Set Up for a Strong 2026” https://io-fund.com/ai-stocks/ai-platforms/big-techs-405b-bet
- CNBC — “How the AI market could splinter in 2026” https://www.cnbc.com/2025/12/25/how-the-ai-market-could-splinter-in-2026-.html
AI Bubble and ROI Concerns
- Longbridge — “OpenAI Board Chairman: We are indeed in an ‘AI bubble’” https://longbridge.com/en/news/257277234
- DigitrendZ — “OpenAI Chair Bret Taylor: We’re in an AI Bubble (And That’s Okay)” https://digitrendz.blog/newswire/artificial-intelligence/46023/openai-chair-bret-taylor-were-in-an-ai-bubble-and-thats-okay/
- IEEE ComSoc Techblog — “AI spending boom accelerates: Big tech to invest an aggregate of $400 billion in 2025” https://techblog.comsoc.org/2025/11/01/ai-spending-boom-accelerates-big-tech-to-invest-invest-an-aggregate-of-400-billion-in-2025-more-in-2026/
- MIT / Axios / Entrepreneur coverage of enterprise AI ROI (e.g., Axios: https://www.axios.com/2025/08/21/ai-wall-street-big-tech)
Nvidia Strategy & “Virtuous Cycle”
- CNBC — “Nvidia CEO Jensen Huang says AI is in a ‘virtuous cycle’” https://www.cnbc.com/2025/10/31/nvidia-ceo-jensen-huang-says-ai-has-reached-a-virtuous-cycle.html
- Bloomberg via Yahoo Finance — “Nvidia CEO Downplays AI Bubble Fears as He Enlists New Partners” https://finance.yahoo.com/news/nvidia-ceo-rebuts-fears-ai-194608223.html
US–China AI Chip Competition
- Council on Foreign Relations — “China’s AI Chip Deficit: Why Huawei Can’t Catch Nvidia and U.S. Export Controls Should Remain” https://www.cfr.org/article/chinas-ai-chip-deficit-why-huawei-cant-catch-nvidia-and-us-export-controls-should-remain
- CSIS — “DeepSeek, Huawei, Export Controls, and the Future of the U.S.–China AI Race” https://www.csis.org/analysis/deepseek-huawei-export-controls-and-future-us-china-ai-race
- RAND — “Leashing Chinese AI Needs Smart Chip Controls” https://www.rand.org/pubs/commentary/2025/08/leashing-chinese-ai-needs-smart-chip-controls.html
- SemiAnalysis — “Huawei Ascend Production Ramp: Die Banks, TSMC Continued Production, HBM is The Bottleneck” https://semianalysis.com/2025/09/08/huawei-ascend-production-ramp/
- Georgetown CSET — “Pushing the Limits: Huawei’s AI Chip Tests U.S. Export Controls” https://cset.georgetown.edu/publication/pushing-the-limits-huaweis-ai-chip-tests-u-s-export-controls/
- Institute for Progress — “The H20 Problem: Inference, Supercomputers, and US Export Control Gaps” https://ifp.org/the-h20-problem/
AI Chip Competitors
- AWS — Trainium https://aws.amazon.com/ai/machine-learning/trainium/
- Google Cloud Blog — “3 things to know about Ironwood, Google’s latest TPU” https://blog.google/products/google-cloud/ironwood-google-tpu-things-to-know/
- TechCrunch — “Andy Jassy says Amazon’s Nvidia competitor chip is already a multibillion-dollar business” https://techcrunch.com/2025/12/03/andy-jassy-says-amazons-nvidia-competitor-chip-is-already-a-multi-billion-dollar-business/