SGNL Intelligence.

The Invisible Bottleneck: Why AI's Next Crisis Is About Light and Logic, Not GPUs

Optical Networking · CPO · CPU Shortage · Agentic AI · NVIDIA · AMD · Intel · Supply Chain

Every NVIDIA GPU sold today needs its own laser. And the world is running out of lasers.

It also needs a CPU to tell it what to do. And the world is running out of those, too.

For three years, the AI industry has been obsessed with a single bottleneck: GPUs. Who can build the most? Who can buy the most? Which architecture wins? But something strange is happening in 2026. The GPU supply chain is finally catching up — and two “invisible” infrastructure layers are breaking instead.

The wires are too slow. The conductors are too few. Welcome to the great bottleneck rotation.


1. The Problem Nobody Saw Coming

Think of an AI data center like an orchestra. The GPUs are the musicians — powerful, expensive, and the center of attention. But an orchestra also needs a concert hall with perfect acoustics (the network) and a conductor keeping everyone in sync (the CPU).

For years, we built bigger orchestras without upgrading the concert hall or hiring more conductors. It worked when 8 GPUs sat on a single server. But now we’re building clusters of 100,000 GPUs — and the acoustics are terrible and the conductor can’t keep up.

Where the Bottleneck Lives (constraint severity, relative)

  • 2023–2024, GPUs & HBM: GPU allocation wars; HBM3 supply constrained by SK Hynix capacity
  • 2025, advanced packaging: TSMC CoWoS capacity limits GB200 production ramps
  • 2026, optics + CPUs: transceiver supply 2x short; server CPUs sold out for the year

Two parallel crises are unfolding:

The Light Crisis. Copper cables physically cannot carry data fast enough between 100K GPUs. The industry needs photonic interconnects — lasers instead of electrons. But laser production is constrained by a rare material called indium phosphide (InP), and transceiver demand already exceeds supply by 2x.

The Logic Crisis. AI is shifting from chatbots to autonomous agents that plan, call APIs, search the web, write code, and coordinate with other agents. All that orchestration runs on CPUs. A Georgia Tech / Intel study found that CPUs handle 50–90% of total latency in agentic workloads. And server CPUs are sold out for the entirety of 2026.


2. The Light Revolution: Why Every GPU Needs a Laser

Here’s the number that explains the optical supercycle in a single sentence:

1 GPU = 1 optical transceiver. 100K GPUs = 100K lasers.

NVIDIA’s GB200 NVL72 rack — the standard building block of modern AI clusters — requires 72 optical transceivers, one per GPU. Each transceiver uses a tiny laser to convert electrical signals into pulses of light that travel over fiber at 800 Gbps. The previous generation ran at 400G. The next needs 1.6 Tbps.
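The scale math is simple but brutal. A quick sketch using the figures above (the 100K-GPU cluster size is an illustrative assumption):

```python
# Back-of-envelope optics math: one 800 Gbps transceiver per GPU,
# 72 GPUs per GB200 NVL72 rack. Cluster size is an illustrative
# assumption, not a figure from the article.

GPUS_PER_RACK = 72
TRANSCEIVERS_PER_GPU = 1        # the 1:1 baseline; fabrics can multiply this
LINK_RATE_GBPS = 800

cluster_gpus = 100_000
racks = -(-cluster_gpus // GPUS_PER_RACK)           # ceiling division
transceivers = cluster_gpus * TRANSCEIVERS_PER_GPU
aggregate_tbps = transceivers * LINK_RATE_GBPS / 1_000

print(f"{racks:,} racks, {transceivers:,} transceivers (lasers)")
print(f"aggregate optical bandwidth: {aggregate_tbps:,.0f} Tbps")
```

Even at the 1:1 baseline, that is 100,000 lasers and 80,000 Tbps of optical links for a single cluster; multi-tier network fabrics push the per-GPU link count higher still.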

The problem? Making these lasers requires indium phosphide — a specialized semiconductor material with limited global production. McKinsey projects 800G transceiver production will fall 40–60% short of demand through 2027, with 30–40% shortfalls in 1.6T supply through 2029.

Optical Transceiver Supply vs Demand

  • 800G transceivers (2026–2027): 40–60% shortfall vs demand
  • 1.6T transceivers (2027–2029): 30–40% shortfall vs demand

Current demand exceeds supply by 2x due to indium phosphide (InP) laser production constraints. Source: McKinsey

The industry’s answer is co-packaged optics (CPO) — integrating the laser and photonic circuitry directly onto the same package as the switch or processor. Instead of plugging in a separate transceiver module, the optics become part of the silicon itself.

Think of it this way: today’s approach is like screwing a light bulb into a lamp. CPO is embedding the LED directly into the circuit board. Fewer connections, less power, more bandwidth.

NVIDIA is going all in. In March 2026, Jensen Huang announced $4 billion in parallel investments:

  • $2B in Coherent Corp — multibillion-dollar CPO supply agreement through 2030. Shares surged 15% to an all-time high.
  • $2B in Lumentum Holdings — focused on silicon photonics and a new U.S. fab. Shares closed 12% higher.

Jensen called it “pioneering next-generation silicon photonics to enable AI infrastructure at unprecedented scale.”

The market data backs the thesis. Yole Group projects the CPO market growing from $46 million in 2024 to $8.1 billion by 2030 — a jaw-dropping 137% CAGR. The broader data center photonics TAM is on track to hit $16B by 2028. STMicroelectronics announced high-volume production of its PIC100 silicon photonics platform. NVIDIA’s Quantum-X CPO InfiniBand switches are commercially available now, with Spectrum-X Ethernet CPO following in H2 2026.
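That headline CAGR is just compound-growth arithmetic applied to Yole's two endpoints:

```python
# Checking Yole's CAGR: $46M (2024) growing to $8.1B (2030), six years.
start, end, years = 46e6, 8.1e9, 6
cagr = (end / start) ** (1 / years) - 1
print(f"CAGR: {cagr:.0%}")
```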

OFC 2026 — the optical industry’s biggest conference — expects 16,000 attendees and 700+ exhibitors. 2026 is the first year of volume 1.6 Tbps switch deployments, with 800G market share climbing from 7% to 35% in a single year — a fivefold jump.

This isn’t speculation. It’s a supply scramble.


3. The CPU Comeback: When Agents Need Conductors

Here’s the part that surprises everyone: the world’s biggest AI companies are suddenly desperate for CPUs.

Not GPUs. CPUs. The “boring” chip that’s been in every computer since the 1970s.

The reason is agentic AI. When you ask ChatGPT a simple question, the GPU does most of the work — it generates tokens. But when you ask an AI agent to “research competitors, build a spreadsheet, email the team, and schedule a follow-up” — that’s a completely different workload:

  • Plan the task decomposition (CPU)
  • Call APIs to search the web and query databases (CPU)
  • Execute tools — run code, process files, parse data (CPU)
  • Coordinate between specialized sub-agents (CPU)
  • Generate text at each step (GPU)

A Georgia Tech / Intel research paper (arXiv:2511.00739) measured this directly across five real agentic workloads. The finding: CPUs handle 50–90.6% of total latency. The GPU sits idle most of the time, waiting for the CPU to finish orchestrating.
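A toy version of that measurement makes the split concrete. The step names mirror the workload list above, but the latencies are illustrative stand-ins, not numbers from the paper:

```python
# Toy breakdown of one agent turn. Step latencies are invented
# stand-ins, not measurements from arXiv:2511.00739.

steps = [
    ("plan task decomposition",      "cpu", 180),   # ms
    ("call search/database APIs",    "cpu", 420),
    ("execute tools (code, files)",  "cpu", 650),
    ("coordinate sub-agents",        "cpu", 240),
    ("generate tokens",              "gpu", 500),
]

cpu_ms = sum(ms for _, dev, ms in steps if dev == "cpu")
total_ms = sum(ms for _, _, ms in steps)
print(f"CPU share of latency: {cpu_ms / total_ms:.0%}")
```

With these stand-in numbers the CPU carries roughly three-quarters of the turn — squarely inside the paper's measured 50–90% range — while the GPU contributes only the token-generation step.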

Server CPU Supply Crunch

  • Intel: lead times of 30+ weeks; 2026 capacity largely sold out. “The CPU has become cool again” (David Zinsner, CFO, Morgan Stanley TMT Conference, March 2026)
  • AMD: lead times of 10+ weeks; demand far exceeded forecasts. “We are in the process of catching up” (Lisa Su, CEO, Morgan Stanley TMT Conference, March 2026)
  • >50%: EPYC Turin share of AMD server CPU revenue
  • +50%: AMD cloud instance growth (YoY)
  • 10–15%: KeyBanc estimate of Intel ASP upside
  • 50–90%: share of agentic workflow latency spent on the CPU

The supply shock is real. At the Morgan Stanley TMT Conference in March 2026:

  • AMD CEO Lisa Su disclosed server CPU demand has “far exceeded” forecasts. 5th Gen EPYC Turin crossed 50% of server CPU revenue. Cloud instances grew 50%+ YoY. AMD’s lead time: 8–10 weeks.
  • Intel CFO David Zinsner said “The CPU has become cool again” — then admitted Intel “misjudged” demand and is “absolutely constrained.” Intel’s 2026 server CPU capacity is “largely sold out.” Lead time: over 30 weeks.

KeyBanc upgraded both Intel and AMD to Overweight, citing server CPUs being “mostly sold out for 2026.” They see potential for Intel to raise prices 10–15% on the shortage.

And here’s the kicker: Meta became the first tech giant to deploy standalone NVIDIA Grace CPUs — without paired GPUs — specifically for agentic AI. That’s a signal that agents are creating an entirely new category of CPU-only servers.


4. The $602 Billion Connection

These two bottlenecks aren’t independent. They’re connected by the same root cause: the largest infrastructure investment wave in computing history.

How $602B Capex Cascades Through the Stack

Top 5 hyperscaler capex (2026): $602B (+36% YoY)

  1. GPUs & accelerators: HBM, TSMC wafers, CoWoS packaging
  2. Optical networking: 3–5 transceivers per GPU, InP lasers, CPO modules
  3. Server CPUs: ~1:1 CPU-to-GPU for agentic workloads, orchestration
  4. Power & cooling: up to 600 kW per rack, grid upgrades, liquid cooling

Every GPU purchased creates multiplicative downstream demand. The constrained layer captures the most incremental value.

The top 5 hyperscalers are deploying $602 billion in capex during 2026 — up 36% year-over-year — with roughly $450B going to AI infrastructure. Every dollar triggers a cascade:

  1. Buy GPUs — creates demand for HBM, TSMC wafers, advanced packaging
  2. Every GPU needs transceivers — 3–5 optical links per chip — creates demand for lasers, InP, silicon photonics
  3. Every GPU needs a conductor — agentic workloads need ~1:1 CPU-to-GPU ratios — creates demand for EPYC, Xeon, Grace
  4. Every rack needs power — up to 600 kW per AI rack — creates demand for grid upgrades, cooling
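The cascade above can be priced out per cluster. Plugging the article's own multipliers into a hypothetical 100K-GPU buildout (the cluster size is an assumption; the per-GPU ratios come from the list above):

```python
# Downstream demand cascade for a hypothetical 100K-GPU cluster,
# using the per-GPU multipliers listed above.

gpus = 100_000
links_low, links_high = 3 * gpus, 5 * gpus   # 3-5 optical links per GPU
cpus = gpus                                  # ~1:1 CPU-to-GPU for agents
racks = -(-gpus // 72)                       # GB200 NVL72: 72 GPUs per rack
power_mw = racks * 600 / 1_000               # up to 600 kW per rack

print(f"optical links: {links_low:,}-{links_high:,}")
print(f"orchestration CPUs: ~{cpus:,}")
print(f"power envelope: ~{power_mw:,.0f} MW across {racks:,} racks")
```

One cluster, by these multipliers, means hundreds of thousands of lasers, a hundred thousand server CPUs, and nearly a gigawatt of rack power — which is why the constraint keeps surfacing somewhere other than the GPU itself.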

The bottleneck rotates through this stack. In 2023–2024, it was GPUs and HBM. In 2025, it was TSMC advanced packaging. In 2026, it’s optical networking and CPUs. Each layer takes turns being the constraint.

Inference is accelerating the rotation. Deloitte projects inference at two-thirds of all AI compute in 2026 (up from one-third in 2023), heading toward an eventual 80/20 inference-to-training steady state. Inference is bursty, latency-sensitive, and increasingly agent-driven — exactly the workload profile that stresses both networking bandwidth and CPU orchestration simultaneously.


5. The Signal

Three high-conviction patterns emerge from the data:

The optical supercycle is supply-constrained, not demand-constrained. Transceiver demand exceeds supply by 2x today. McKinsey sees 40–60% gaps through 2027. The CPO market is growing at 137% CAGR. NVIDIA’s $4B investment, STMicro’s volume production, and 700+ OFC exhibitors all point to the same thing: this isn’t speculative demand — it’s a supply scramble for a technology the industry physically cannot build fast enough.

Server CPUs are the “forgotten chip” no more. Intel and AMD are both sold out for 2026. Lead times have ballooned to 30+ weeks at Intel. Meta is buying standalone CPUs for agents. The Georgia Tech data showing 50–90% CPU latency share gives this a rigorous technical foundation. Agentic AI is structurally CPU-hungry, and the industry underestimated it.

The bottleneck rotation is the investment thesis. Every GPU sold creates multiplicative demand downstream: 3–5 optical links, roughly one orchestration CPU, and a share of a rack drawing up to 600 kW. As the industry scales from 7.3M GPU-equivalents toward tens of millions, the companies that supply the “invisible” layers — Coherent, Lumentum, STMicro, AMD, Intel, and the power infrastructure players — may capture more incremental value than the GPU makers themselves. The $602B wave lifts the entire stack, but it lifts the constrained layers the most.

Analysis powered by GIKE (General Iterative Knowledge Engine). Market data sourced from McKinsey (transceiver supply gaps), Yole Group (CPO TAM at 137% CAGR), Dell’Oro (optical transport market), Deloitte (inference compute split), Goldman Sachs (hyperscaler capex), and KeyBanc Capital Markets (CPU supply analysis). Company data from Morgan Stanley TMT Conference (AMD CEO Lisa Su, Intel CFO David Zinsner), NVIDIA newsroom (Coherent/Lumentum investments, Meta Grace deployment), and Georgia Tech/Intel research (arXiv:2511.00739). Claims cross-referenced across 15 independent sources.
