2012-04-18

Capacity Is Not Performance

A storage-era lesson that still applies to Kubernetes, cloud platforms, and modern infrastructure design.

Storage, Performance, Architecture

One of the most useful infrastructure lessons I learned early in my career was this:

Available capacity does not mean available performance.

That sounds obvious now, but it was not always obvious to customers staring at storage arrays with plenty of free space left.

The capacity trap

As enterprise disks got larger, it became easy to look at a storage system and think:

We still have free terabytes.
Why not put more workloads there?

From a capacity standpoint, that looked reasonable.

From a performance standpoint, it could be a disaster.

The limiting factor was often not the amount of space left on the drives. It was how many I/O operations per second (IOPS) the system could sustain, along with the latency profile of the workload, controller limits, cache behavior, queue depth, and the combined workload mix.
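To make the gap concrete, here is a minimal Python sketch that tracks capacity and I/O as two separate budgets. The array size, the IOPS budget, and the workload figures are all illustrative assumptions, not measurements from any real system.

```python
from dataclasses import dataclass


@dataclass
class Array:
    """Hypothetical storage array described by two independent budgets."""
    capacity_tb: float  # usable capacity (assumed figure)
    iops_budget: int    # IOPS the backend can sustain (assumed figure)


@dataclass
class Workload:
    name: str
    size_tb: float
    iops: int


def utilization(array: Array, workloads: list[Workload]) -> tuple[float, float]:
    """Return (fraction of capacity used, fraction of IOPS budget used)."""
    used_tb = sum(w.size_tb for w in workloads)
    used_iops = sum(w.iops for w in workloads)
    return used_tb / array.capacity_tb, used_iops / array.iops_budget


array = Array(capacity_tb=100.0, iops_budget=20_000)
workloads = [
    Workload("file-archive", size_tb=30.0, iops=1_000),
    Workload("mail-database", size_tb=10.0, iops=15_000),
]

cap, perf = utilization(array, workloads)
print(f"capacity used: {cap:.0%}, IOPS used: {perf:.0%}")
# prints: capacity used: 40%, IOPS used: 80%
```

In this made-up example the array looks 60% empty, yet only a fifth of its I/O budget remains. The small mail database, not the large archive, is what consumes the system.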

The Exchange example

Microsoft Exchange was a great example because it forced people to think in terms of workload behavior instead of raw capacity.

Two deployments could consume the same amount of storage but behave completely differently. One might be mostly sequential and predictable. Another might generate a heavy random I/O pattern that pushed the array much harder.

The right design question was not only:

How much data do we need to store?

It was also:

How many IOPS do we need?
What latency can the application tolerate?
What else is sharing the same backend?
What happens during backup, replication, failover, or maintenance?
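Those questions can be turned into a simple check, sketched here in Python. The steady-state numbers, the extra load attributed to each operational event, and the backend budget are hypothetical values chosen only to show the shape of the reasoning.

```python
# Steady-state IOPS for everything sharing the backend (assumed numbers).
steady_iops = {"mail-database": 9_000, "file-shares": 3_000}

# Extra IOPS added while an operational event is running (also assumed).
event_iops = {"nightly-backup": 6_000, "replication-resync": 4_000}

# What the array can sustain at acceptable latency (assumed).
BACKEND_IOPS_BUDGET = 20_000


def fits(events: list[str]) -> bool:
    """True if steady load plus the given concurrent events stays within budget."""
    demand = sum(steady_iops.values()) + sum(event_iops[e] for e in events)
    return demand <= BACKEND_IOPS_BUDGET


print(fits([]))                                        # True  (12,000 IOPS)
print(fits(["nightly-backup"]))                        # True  (18,000 IOPS)
print(fits(["nightly-backup", "replication-resync"]))  # False (22,000 IOPS)
```

The design looks fine on a quiet day and during backup alone. It only breaks when two events overlap, which is exactly the case a capacity-only review never examines.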

Flash changed the bottleneck, but did not remove it

When enterprise flash entered the picture, a lot of disk-level pain disappeared. But the bottleneck did not magically go away. It moved.

Instead of waiting on spinning disks, systems started to expose other constraints:

storage processors
controllers
interconnects
software paths
replication overhead
metadata operations
application-level contention

That pattern still shows up everywhere.

The same lesson applies to Kubernetes

Modern platforms have their own version of the capacity trap.

A Kubernetes cluster can show available CPU, memory, or GPU capacity and still be the wrong place for a workload.

Why?

Because scheduling is not just arithmetic.

You also have to consider:

node affinity
topology
storage locality
network behavior
noisy neighbors
DaemonSet overhead
reserved capacity
GPU model compatibility
disruption budgets
upgrade windows
real workload behavior

A cluster with free allocatable resources can still be unable to safely run the next workload.
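Even before any of the softer factors come into play, plain resource fit shows why summing free capacity misleads: a pod has to fit on a single node. The Python sketch below uses hypothetical node names and CPU figures, and it models only the per-node fit idea, not the real Kubernetes scheduler.

```python
# Free CPU cores per node (allocatable minus already requested); hypothetical values.
free_cpu_by_node = {"node-a": 3.0, "node-b": 3.0, "node-c": 2.0}

# CPU request of the next pod we want to place (assumed).
pod_cpu_request = 4.0

cluster_free = sum(free_cpu_by_node.values())
fits_in_aggregate = pod_cpu_request <= cluster_free

# A pod cannot be split across nodes, so check each node on its own.
candidate_nodes = [
    name for name, free in free_cpu_by_node.items() if free >= pod_cpu_request
]

print(f"cluster free CPU: {cluster_free}")                # 8.0
print(f"fits in aggregate: {fits_in_aggregate}")          # True
print(f"nodes that can host the pod: {candidate_nodes}")  # []
```

The dashboard would report 8 free cores, twice what the pod asks for, and the pod would still stay pending. Affinity rules, topology constraints, and GPU model requirements only shrink the candidate list further.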

The platform engineering takeaway

Good infrastructure design is not just about measuring what is free.

It is about understanding what is safe to consume.

That means engineers need to look past simple capacity numbers and ask better questions:

What is the bottleneck?
What happens under failure?
What happens during maintenance?
What happens when multiple workloads spike at the same time?
What does the application actually need?
What are we reserving for the platform itself?
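One way to express the difference between "free" and "safe to consume" is sketched below in Python. The function name, the per-node platform reserve, and the policy of surviving the loss of the largest node are all assumptions made for illustration. A real platform would pick its own policy.

```python
def safe_to_consume(
    node_cpu: list[float],
    platform_reserve_per_node: float,
    requested: float,
) -> float:
    """CPU still safe to hand out if the largest node can fail (assumed policy)."""
    # Subtract what the platform itself needs on every node.
    usable = [max(cpu - platform_reserve_per_node, 0.0) for cpu in node_cpu]
    # Plan for losing the largest node.
    surviving = sum(usable) - max(usable)
    return surviving - requested


nodes = [16.0, 16.0, 16.0, 16.0]  # CPU cores per node (hypothetical)
already_requested = 40.0          # CPU already promised to workloads (hypothetical)

naive_free = sum(nodes) - already_requested
safe_free = safe_to_consume(
    nodes, platform_reserve_per_node=2.0, requested=already_requested
)

print(f"naive free CPU: {naive_free}")  # 24.0
print(f"safe free CPU:  {safe_free}")   # 2.0
```

With these numbers the naive view offers 24 cores and the safe view offers 2. Neither number is wrong. They answer different questions.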

The lesson I learned in storage still shapes how I think about infrastructure today.

Capacity is important.

But performance, behavior, and operational safety are what determine whether a system actually works.