They're Playing Different Games
When people talk about Apple vs NVIDIA in AI hardware, they often frame it as a direct competition. It isn't — at least not yet. NVIDIA's business is selling chips to cloud providers, hyperscalers, and enterprises building AI infrastructure. Apple's business is selling devices to people, and making those devices smart enough that people keep buying them.
But 2025 blurred those lines. NVIDIA moved deeper into inference (running AI models, not just training them). Apple started building its own AI server chips for the first time. Both companies are now competing for the same long-term prize: who controls where AI actually runs.
The core tension
Training AI models requires massive GPU clusters — NVIDIA's turf. Running AI models (inference) can happen anywhere: in the cloud, on a server, or on your phone. The fastest-growing part of the AI chip market is inference, and Apple has been very good at it for years without anyone noticing.
NVIDIA's Blackwell Era
NVIDIA shipped its Blackwell GPU generation through 2025, and it was a significant jump. The B200 GPU delivers around 20 petaFLOPS of compute in FP4, a 4-bit number format that the previous H100 generation didn't accelerate in hardware; at comparable precisions, Blackwell roughly doubles Hopper's throughput. More importantly for inference, Blackwell can churn through language model tokens at many times the rate of earlier chips.
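Why does a 4-bit format matter so much for inference? Mostly memory: halving the bits per weight halves how much memory a model occupies and how much data must move per token. A back-of-envelope sketch (the 70B-parameter model size is an illustrative assumption, not a figure from this article):

```python
# Weight memory for a language model at different numeric precisions.
# The 70B parameter count is an illustrative assumption.
PARAMS = 70e9  # 70 billion parameters

BYTES_PER_PARAM = {
    "FP16": 2.0,   # 16 bits per weight
    "FP8":  1.0,   # 8 bits
    "FP4":  0.5,   # 4 bits -- the format Blackwell accelerates natively
}

for fmt, nbytes in BYTES_PER_PARAM.items():
    gb = PARAMS * nbytes / 1e9
    print(f"{fmt}: {gb:.0f} GB of weights")
```

At FP4 the same model needs a quarter of the FP16 footprint (35 GB vs 140 GB here), which is a big part of why the inference-throughput gains outpace the raw FLOPS gains.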
The demand was extraordinary. NVIDIA reportedly sold out its entire 2025 Blackwell production before the year even started. Microsoft, Google, Meta, and Amazon collectively spent hundreds of billions acquiring these chips to power their AI services.
NVIDIA Blackwell — key facts
- B200 GPU: ~20 petaFLOPS in FP4, 192 GB HBM3e memory
- Roughly 2x throughput vs previous H100/H200 generation
- Sold out before shipping — demand exceeded supply all year
- Data center revenue hit $51 billion in a single quarter (Q3 2025)
- Next generation (Rubin) already announced for 2026, targeting 50 petaFLOPS
With Rubin due in 2026 and promising another major performance jump, NVIDIA is shipping a new major GPU generation roughly every 12 months. That's an aggressive roadmap and a hard thing to compete with.
Apple's M5 and the On-Device Push
Apple released the M5 chip line in late 2025. The headline number: the Neural Engine runs at 38 trillion operations per second (38 TOPS), more than double the M3. For scale, NVIDIA's B200 claims around 20 petaFLOPS, which is 20,000 trillion operations per second, so the raw compute gap is roughly 500x. But that comparison misses the point entirely.
The M5's strength is doing a lot with a little. Running a local language model on an M5 Max MacBook Pro costs a fraction of what the equivalent cloud API calls would. There's no latency from a network round-trip. There's no privacy concern about sending your data to a server. And it works offline.
Apple M5 — key facts
- Neural Engine: 38 TOPS — more than 2x the M3
- M5 Max: up to 614 GB/s memory bandwidth, 128 GB unified memory
- LLM inference: up to 4x faster time-to-first-token compared to M4
- Each GPU core includes a dedicated Neural Accelerator — a first for Apple
- Built on TSMC's 3nm N3P process for strong efficiency
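The memory bandwidth figure in that list is the one that governs local LLM speed: generating a token typically requires reading roughly every model weight once, so decode speed is bounded by bandwidth divided by model size. A rough ceiling, using the 614 GB/s figure quoted above and an assumed 8B-parameter model quantised to 4 bits (both the model size and the quantisation are illustrative assumptions):

```python
# Rough ceiling on local LLM decode speed: each generated token reads
# (approximately) every model weight once, so the token rate is bounded
# by memory bandwidth / model size. The model choice is an assumption;
# the 614 GB/s figure is the M5 Max bandwidth quoted above.
BANDWIDTH_GB_S = 614        # M5 Max unified memory bandwidth
MODEL_PARAMS = 8e9          # assumed 8B-parameter local model
BYTES_PER_WEIGHT = 0.5      # assumed 4-bit quantised weights

model_gb = MODEL_PARAMS * BYTES_PER_WEIGHT / 1e9   # 4 GB of weights
tokens_per_s = BANDWIDTH_GB_S / model_gb           # bandwidth-bound ceiling
print(f"~{tokens_per_s:.0f} tokens/s upper bound")  # prints ~154 tokens/s
```

Real-world numbers land below this ceiling (attention caches and overheads eat into it), but the estimate shows why unified memory bandwidth, not TOPS, is the spec to watch for on-device inference.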
Apple also opened up on-device AI to developers in June 2025 with the Foundation Models API. This lets third-party apps run small language models locally without any server costs or internet connection. It's a quiet but meaningful move — Apple is turning its hardware edge into a developer platform.
Apple Intelligence
Apple's consumer AI story runs through Apple Intelligence, which launched in late 2024 and expanded significantly through 2025. It uses a hybrid model: simple tasks (summarising a text, writing suggestions) happen entirely on the device. More complex requests get routed to Apple's Private Cloud Compute servers, where they're handled and then deleted — Apple's answer to privacy concerns around cloud AI.
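The routing idea behind that hybrid model can be sketched in a few lines. Everything here is invented for illustration: the task names, the token threshold, and the heuristic are not Apple's actual implementation, just the shape of the decision.

```python
# Toy sketch of hybrid on-device / cloud routing: small, latency-sensitive
# tasks run locally; heavier requests go to server-side compute.
# All names and thresholds are invented for illustration.
from dataclasses import dataclass

@dataclass
class Request:
    task: str     # e.g. "summarise", "write", "complex_query"
    tokens: int   # rough size of the input

ON_DEVICE_TASKS = {"summarise", "suggest_reply", "classify"}
ON_DEVICE_TOKEN_LIMIT = 4096   # invented threshold

def route(req: Request) -> str:
    """Return where this request should run."""
    if req.task in ON_DEVICE_TASKS and req.tokens <= ON_DEVICE_TOKEN_LIMIT:
        return "on_device"          # local model, no network round-trip
    return "private_cloud_compute"  # processed server-side, then deleted

print(route(Request("summarise", 800)))       # on_device
print(route(Request("complex_query", 2000)))  # private_cloud_compute
```

The design point is that the default path is local: the cloud is the fallback for requests the on-device model can't handle, which is the opposite of how most AI assistants are built.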
Most users will never think about any of this. They just notice that Siri got better, their photos are easier to search, and emails now draft themselves. That's the Apple approach — hide the hardware wins inside product experiences.
Apple Quietly Enters the Data Center
This is the part of the story that gets less attention. Apple has been building its own AI server infrastructure, and in 2025 it started getting serious about it.
A 250,000 square foot manufacturing facility in Houston, Texas began shipping Apple-built AI servers ahead of schedule. These servers power Private Cloud Compute — the cloud side of Apple Intelligence. Apple describes them as "incredibly energy-efficient," which matters for both cost and their 2030 carbon-neutral goal.
Beyond that, Apple is working with Broadcom on a custom AI chip codenamed "Baltra" aimed at larger-scale data center use. It's expected to go into mass production around 2026–2027. This would be Apple's first chip designed specifically for AI inference at scale — not for a Mac or an iPhone, but for a server rack.
Why this matters
Apple doesn't need to sell AI chips to compete with NVIDIA; it just needs to stop buying, or renting, NVIDIA-powered compute. Every Apple Intelligence request that runs on Apple's own servers instead of rented cloud GPU capacity is one fewer GPU sale for someone else. If Apple's custom server chips are as efficient as its consumer chips, that becomes a real cost advantage.
The Market Cap Tug of War
For a brief stretch in 2025, NVIDIA became the world's most valuable company — the first to hit a $4 trillion market cap. It was a remarkable milestone for a chip company that most people outside the tech industry had never heard of a decade ago.
Apple wasn't far behind. By late 2025, Apple's market cap was within striking distance of NVIDIA's, hovering around $4.1 trillion. The two companies have been trading places as the world's most valuable company throughout the year.
| Factor | NVIDIA | Apple |
|---|---|---|
| Primary AI market | Data center training & cloud inference | On-device & edge inference |
| Latest AI chip | Blackwell B200 (20 petaFLOPS FP4) | M5 / M5 Max (38 TOPS Neural Engine) |
| Revenue driver | Data center GPU sales ($51B in one quarter) | Consumer devices (AI baked into products) |
| Market share | ~92% of discrete AI training GPUs | Unchallenged in consumer edge AI |
| Software ecosystem | CUDA — dominant, deeply entrenched | MLX, Foundation Models API — growing |
| Data center plans | Already the dominant supplier | Custom "Baltra" chip, mass production ~2027 |
| Market cap (late 2025) | ~$4.2 trillion | ~$4.1 trillion |
Who Actually Wins?
The honest answer is: they're probably both going to win, just in different places.
NVIDIA's position in AI training is close to unassailable in the near term. Building a chip that can compete with Blackwell for training large models is a years-long engineering effort, and NVIDIA is already working on what comes after Rubin. The CUDA software ecosystem, built up over 15 years, is another enormous moat. Companies writing AI code write it for CUDA first.
But inference is different. Running a model is a less specialised task than training one, and the market for inference chips is much broader. On-device inference — running AI on phones, laptops, and embedded devices — is an area where Apple has a genuine structural advantage. Their chips, tightly integrated with their operating system and hardware, are more efficient per watt than anything NVIDIA makes for that use case.
The interesting question is what happens as AI models get smaller and more capable. Right now, the most powerful AI requires massive NVIDIA clusters. But the trend is toward smaller, more efficient models that can run closer to the edge. If that trend continues, Apple's decade of investment in Neural Engines starts to look less like a consumer feature and more like a very long head start.
For now, NVIDIA prints money on the AI training boom. Apple prints money on the device upgrade cycle, with AI becoming a stronger reason to upgrade every year. Watch what happens when Apple's data center ambitions get serious — that's when this rivalry gets genuinely competitive.