Nobody buys software from Intel

February 05, 2026 · ai, llm, opinion

When did you last care which chip was in your laptop? Not which laptop, or which operating system, but which actual silicon processor was doing the work. Unless you’re a gamer or someone running heavy compute workloads, the answer is probably never. You bought the machine because of what it could do, not because of what was inside it. Intel and AMD are invisible.

I’ve been thinking about this a lot recently, because I think it’s where LLM providers are heading. The companies pouring billions into training frontier models are building the CPUs of the AI era. They employ brilliant people, burn staggering amounts of capital, and produce genuinely remarkable technology. And if history is any guide, that’s not the comfortable position it sounds like.

The substrate

Intel employs over 100,000 people. They spend north of $15 billion a year on research and development. The engineering required to design and fabricate modern processors is among the most complex work humans have ever attempted. The same is true of AMD, Qualcomm, Apple’s chip division, and the handful of other companies pushing the boundaries of what silicon can do.

And yet, nobody buys software from Intel. Outside of very specific workloads, very few users pick an application because of which chip it runs on. When you open a browser or launch an IDE, the processor underneath is an implementation detail. It’s infrastructure. It’s the substrate on which everything else grows.

This wasn’t always the case. There was a time when “Intel Inside” was a genuine selling point, when clock speed was the metric that mattered, when consumers actually cared about the difference between a Pentium III and a Pentium 4. But the value migrated upward. Operating systems, applications, services, platforms. Silicon became a commodity. Still essential, still incredibly sophisticated, but invisible.

LLMs are starting to follow the same path. The difference between GPT-4o and Claude and Gemini, for most practical tasks, is shrinking. A year ago, model choice felt like it mattered enormously. Today, I still have preferences, but I’d struggle to articulate why in terms that would survive a blind test. Ask me why I prefer Claude over GPT for coding work, and I’ll give you an answer that’s more vibes than evidence. The models are converging, and the gap narrows with every release cycle.

What Anthropic is actually selling

Claude Code is the most interesting thing Anthropic makes right now, and it isn’t a model. It’s a product. A tool you install, configure, learn to use, build habits around. The underlying model matters, obviously, but what keeps me opening my terminal every morning is the workflow, not the weights. I’ve spent thousands of dollars on Claude Code at this point. I didn’t spend that money because of the model’s benchmark scores. I spent it because the tool makes me more productive, and the experience of using it keeps getting better.

Think about what Anthropic gets from Claude Code that they don’t get from API access alone. They’re watching how people actually use the tool in real working environments. They see which workflows succeed and which ones frustrate. They see where people get stuck, where they lose trust, where they give up. That telemetry is extraordinarily valuable, and it’s driving development decisions as much as (possibly more than) the underlying model research. When Anthropic decides what to improve next, they’re not just looking at abstract capability evaluations. They’re looking at what real users do in real sessions.

Consider the trajectory. Claude Code launched as a fairly basic CLI tool. Within months it had hooks, custom commands, MCP server integrations, project-level configuration. Those features didn’t come from model improvements. They came from watching people use the product and understanding what was missing. Product development, not research.

The feedback loop here is the real competitive advantage: real usage generates insight, insight improves the product, the improved product attracts more usage. This is a product flywheel, not a model flywheel. Benchmark scores don’t capture it. Training compute doesn’t explain it. It’s the accumulated understanding of how humans and AI tools actually collaborate on real work.

OpenAI is doing the same thing with Codex and ChatGPT. Google with Jules and Gemini’s integrations. Cursor has built an entire company on the premise that the product layer is where the fight happens. The model underneath is important in the way that a good engine is important in a car. Sure, there are enthusiasts who care very much about an engine being a V8, or supercharged; but the average car buyer never test-drives an engine.

Models, tools, product

If you think of the stack as three layers (models at the bottom, tools in the middle, products at the top), each layer depends on the one below it. The model layer is where the raw capability lives. The tool layer is how that capability connects to the real world: file systems, APIs, databases, code execution. The product layer is what the user actually touches.

The model layer is converging, as I said. The tool layer is where the most interesting work is happening right now. Context management, agentic workflows, knowing when to ask the user a question versus when to just act. These are the problems that separate “impressive demo” from “useful daily driver.” I’ve been using agentic coding tools for long enough now to know that the gap between those two things is enormous, and it lives almost entirely in the tool layer.

When Claude Code decides to run my test suite after making a change, that’s a tool-layer decision. When it reads my project’s Makefile to understand how I’ve configured things, that’s tool-layer intelligence. When it stops and asks me a clarifying question instead of guessing, that’s tool-layer judgement. None of these things are about the model being smarter in some abstract sense. They’re about the product being better at the practical work of collaborating with a human.
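
To make the distinction concrete, here’s a rough sketch of what a tool-layer decision loop might look like. Everything in it is hypothetical and heavily simplified: the names and heuristics are mine, for illustration, and are not how Claude Code or any other product actually works.

```python
# A hypothetical sketch of a tool-layer decision loop. The names and
# heuristics are illustrative only; they are not any real product's logic.

import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectContext:
    has_makefile: bool
    test_command: Optional[str]


def read_project_context(root: str) -> ProjectContext:
    """Tool-layer work: inspect the repository before acting."""
    has_makefile = os.path.exists(os.path.join(root, "Makefile"))
    # A real tool would also parse lockfiles, CI config, the README, etc.
    test_command = "make test" if has_makefile else None
    return ProjectContext(has_makefile, test_command)


def next_step_after_edit(ctx: ProjectContext, change: str) -> str:
    """Decide what happens after the model proposes a change."""
    if ctx.test_command is None:
        # Judgement call: ask the user rather than guess at the test setup.
        return "ask_user: how do you run the tests for this project?"
    # Verify the change against the project's own test command.
    return f"run: {ctx.test_command}  # verify '{change}'"
```

None of the branches in that sketch need a smarter model. They need the layer around the model to know the project.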

This is also where the gap between providers becomes most visible, at least for now. I’ve tested Codex, Jules, and Claude Code against the same tasks, and the differences in outcome had almost nothing to do with the underlying model’s raw capability. They came down to practical things: did the agent read the README? Did it understand how to install the dependencies? Did it run the full test suite or just a subset? These are tool-layer failures, not model-layer failures. The model was smart enough in every case. The tooling around it was what determined success or failure.

Diminishing returns

The model race will cool off. It has to.

Each generation of improvement costs more and delivers less visible gain for most users. This is the same trajectory as CPU clock speeds in the 2000s. Remember when Intel and AMD were in a raw performance war, pushing clock speeds higher and higher? The chips kept getting faster, but the improvements stopped mattering to most people. A 3GHz chip and a 3.5GHz chip felt identical for email and web browsing. The manufacturers pivoted to efficiency, power consumption, and integrated features instead.

LLMs are approaching a similar threshold. The difference between a model that scores 90% on a coding benchmark and one that scores 93% is meaningful to researchers but largely invisible to someone using the tool to build a Django application. I can feel this in my own usage. Six months ago, a new model release would change how I worked. Today, I upgrade, notice it’s a bit faster or a bit better at long context, and carry on with my day. The improvements are real. They just don’t change my behaviour anymore.

When that happens across the board, the companies that invested in the layers above the model will be the ones left standing. The ones still chasing benchmark points will be selling a commodity. A very sophisticated commodity, sure. But a commodity nonetheless. And the margins on commodities are thin, no matter how clever the engineering underneath.

Or maybe they won’t let it happen

The CPU analogy has a flaw, and it’s worth being honest about it.

Intel and AMD grew up in a less aggressive era of technology capitalism. They largely accepted their role as component suppliers. They let the value migrate upward without fighting particularly hard to capture it. Intel tried to move up the stack a few times (remember their phone chips? The TV ambitions?) but never with the conviction needed to succeed.

LLM vendors have the benefit of hindsight. They can see what happened to chipmakers and choose differently. OpenAI, Anthropic, and Google are already vertically integrating. They’re building the models AND the tools AND the products. They’re not content to be substrate. They want to own the full stack.

This is a rational strategy, and they might pull it off. The current generation of AI companies are far more aggressive about capturing value across the entire chain than Intel ever was. They have the talent, the capital, and the motivation.

But history is full of companies that tried to own everything and failed. Vertical integration is hard to sustain when the layers above you move faster than you can. Microsoft tried to own the browser, the search engine, the phone, the social network, the music player. They succeeded at some and failed spectacularly at others. Being the best at one layer doesn’t guarantee competence at the layers above. The skills required to train a frontier model are entirely different from the skills required to build a product that people love using every day. Great research labs don’t automatically produce great products. Ask Google about that one.

The question is whether these companies can simultaneously be the best model provider, the best tool builder, and the best product company. That’s a lot of battles to fight at once. Startups like Cursor are already demonstrating that a focused team building exclusively at the product layer can compete with, and sometimes outperform, the vertically integrated giants. If the model layer truly commoditises, there will be more companies like Cursor, not fewer. Small teams with good product instincts, plugging into whichever model is cheapest or best for their use case, iterating faster than the big labs can. That’s the world Intel accidentally enabled for PC software. It might be the world that OpenAI and Anthropic accidentally enable for AI tools.
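
That “plug into whichever model is cheapest or best” world has a simple technical shape: the product talks to models through a thin, swappable interface. Here’s a minimal sketch of what that could look like; the class and provider names are entirely hypothetical, not any vendor’s actual SDK.

```python
# A minimal sketch of a provider-agnostic model interface. All names are
# hypothetical; a real product would wrap each vendor's actual SDK behind
# something like this.

from abc import ABC, abstractmethod


class ChatModel(ABC):
    """The only contract the product layer depends on."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...


class VendorAModel(ChatModel):
    def complete(self, prompt: str) -> str:
        # Call vendor A's API here; stubbed for the sketch.
        return f"[vendor A] {prompt}"


class VendorBModel(ChatModel):
    def complete(self, prompt: str) -> str:
        # Call vendor B's API here; stubbed for the sketch.
        return f"[vendor B] {prompt}"


def pick_model(task: str, cost_ceiling_per_call: float) -> ChatModel:
    """Route by price or fit. The product above never notices the swap."""
    if task == "code" and cost_ceiling_per_call >= 0.01:
        return VendorBModel()
    return VendorAModel()
```

When the interface is that thin, switching providers is a routing decision rather than a rewrite, which is what commoditisation looks like from the product side.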

I don’t know which way this goes. Anyone who tells you they do is selling something. But I keep coming back to the CPU analogy because the structural forces feel so similar: brilliant engineering becoming invisible infrastructure, value migrating upward, the consumer caring less and less about what’s underneath.

When I open my terminal tomorrow morning, I won’t be thinking about which model is running. I’ll be thinking about whether the tool helps me get my work done. That instinct, multiplied across millions of users, is what turns a technology into a substrate. The LLM providers can see it coming. Whether they can avoid it is another question entirely.