When Code Gets Cheap, Architecture Gets Expensive

Published or Updated on May 31, 2026

The cost of software was never in the typing.

I was hacking on a side project, just me, no team, and asked the agent to fix a small bug. What I got back was a subsystem: a config system, structured logging, a short README, and a unit test suite that passed on the first run. It handled edge cases I'd never hit. By every local measure it was better than the four lines I had in mind. It was also completely unnecessary, and keeping it meant that the next time I wandered back into this corner, months later, with no memory of any of it, I'd have to understand a config system just to change a default.

It wrote the tests. It added the logging. It passed CI. And the correct answer was still: delete the branch, change one line.

On a solo project there's no reviewer to catch this. No pull request, no second pair of eyes, just you and an agent happy to build infrastructure you never asked for. So you either notice in the moment or you inherit it. If you've used a coding agent in anger, you've lived this. The interesting question isn't whether it happens. It's why, and what the answer does to the way you should build. This piece is about that question, starting with one bug fix and working up to how you should structure everything you ship now.

When code gets cheap, comprehension becomes the bottleneck, and the boundary, not the service, becomes the unit of architecture.

The Governor

The subsystem wasn't a malfunction. It was a structural bias at work, and it's worth being precise about the mechanism.

Before agents, the four-line fix and the full subsystem were both options, but they cost wildly different amounts: one was a minute of typing, the other an afternoon. That gap did quiet work. The friction of writing the big thing selected for the small solution most of the time.

Agents collapse the gap. The subsystem is now as cheap to produce as the one-liner, so you get the subsystem, not because it's correct, but because it's plausible. A language model leans toward the most likely continuation, and the training corpus is full of complete-looking, production-shaped code: config layers, retries, logging, tests. So the model regresses to the verbose mean. It builds the cathedral because cathedrals are what "good code" looks like in the data, not because the problem called for one. It solves problems you don't have, in a scope you never set, and hands back code whose existence you can't justify.

This is just Fred Brooks, updated. In 1986 he split software complexity into two kinds. Accidental complexity is the stuff that comes from our tools, like syntax and boilerplate and glue. Essential complexity is understanding the problem and specifying its conceptual structure. His argument, the one everyone remembers as "no silver bullet," was that no single tool delivers an order-of-magnitude gain, because tools can only attack the accidental, and even in 1986 the accidental was already the smaller half.

Agents are the most powerful accidental-complexity killer ever built. They've all but eliminated the cost of typing. They look like the silver bullet. But Brooks already explained why they aren't: killing the accidental doesn't shrink the cost of software, it relocates it. By demolishing the accidental, agents push essential complexity from "most of the job" to "the entire job." And essential complexity has always lived in one place: a human head, trying to hold a working model of the system.

The Bottleneck

So measure the right thing. The scarce resource in software was never how fast you can produce code. It's how fast a person can understand it: validate it, integrate it, keep a model of it in their head while making the next safe change. Agents made the first number explode and left the second almost exactly where it was.

This isn't a thought experiment anymore. The 2025 DORA report, drawing on nearly 5,000 engineers, found that AI adoption tracks with higher throughput and, at the same time, greater delivery instability. The researchers' read is blunt: teams have tuned for speed, but the systems underneath them haven't changed enough to absorb that speed safely. The production layer got faster, the comprehension layer didn't, and the difference is escaping as instability.

None of this is new in kind. Manny Lehman observed in the 1970s that a living software system's complexity rises over time unless work is explicitly spent reducing it, a law he himself suspected was an analogue of the second law of thermodynamics. Disorder is the default. Order is the thing you pay for, continuously. Agents didn't repeal that law. They raised the temperature.

Figure: two quantities over time. One line is how much code exists; it explodes when agents arrive. The other is how much a single engineer can hold in their head, and it barely moves. The shaded wedge between them, code that exists but nobody is holding, is everything we file under "AI technical debt." The wedge is the whole problem. Curves are illustrative, not measured.

Once you accept that comprehension is the binding constraint, the rest of your architecture stops being a matter of taste and starts being a matter of arithmetic.

The Boundary

If comprehension is the binding constraint, you don't relieve it by deploying more services. You relieve it by capping the surface area any one person has to hold in their head at once. That sounds soft until you look at what connects to what.

Take a system of components that can all reach each other. The number of possible pairwise interactions between them is n(n−1)/2, which grows quadratically with n. Configuration is worse: each independent boolean flag doubles the reachable state space, so k flags give 2^k configurations. This is the real reason verbose agent output is dangerous, and it has nothing to do with line count. A generated subsystem doesn't just add code linearly. It adds components and flags, and every flag raises the exponent. It looks locally correct (its tests pass) while quietly multiplying the number of global states the system can be in.
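To put numbers on that, here is a minimal sketch. The function names are mine, invented for illustration, and it assumes every flag is an independent boolean, which real systems only approximate.

```python
def pairwise_interactions(n: int) -> int:
    """Possible links among n components that can all reach each other."""
    return n * (n - 1) // 2


def config_states(flags: int) -> int:
    """Reachable configurations, assuming `flags` independent boolean flags."""
    return 2 ** flags


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(f"{n} components: {pairwise_interactions(n)} possible interactions")
    for k in (3, 4, 10):
        print(f"{k} flags: {config_states(k)} configurations")
```

Doubling the component count from 8 to 16 takes the possible interactions from 28 to 120. Going from 3 flags to 4 takes the configurations from 8 to 16, and nothing about the diff tells you that happened.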

That distinction is the one agents are worst at:

Local correctness: this function does what its tests say. Agents are excellent at this.
Global maintainability: this addition doesn't explode the system's possible behaviors. Agents have no idea, because they were never holding the whole system in mind to begin with.

An agent can prove the brick is square. It can't tell you whether the wall needed another brick.

The defense is a hard cap on the number of things that can touch each other, and the way you cap it is older than any of this. David Parnas argued in 1972 that a module should hide its design decisions behind a stable interface, so the rest of the system can ignore everything inside it. A narrow interface means the outside world sees only a few things, no matter how much hides behind it. That fifty-year-old idea just became the most important tool you have, because it's the only thing that lets an agent over-build inside a box without the over-building leaking into the global state you have to reason about.
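Here is roughly what that looks like in practice, as a sketch rather than a prescription. The rate limiter, its class names, and `handle_request` are hypothetical examples of mine, not anything from Parnas; the point is only that callers see one method.

```python
from typing import Protocol


class RateLimiter(Protocol):
    """The entire surface the rest of the system is allowed to see."""

    def allow(self, key: str) -> bool: ...


class CountingLimiter:
    """One possible implementation. Everything in here is hidden detail."""

    def __init__(self, limit: int) -> None:
        self._limit = limit
        self._counts: dict[str, int] = {}

    def allow(self, key: str) -> bool:
        seen = self._counts.get(key, 0)
        if seen >= self._limit:
            return False
        self._counts[key] = seen + 1
        return True


def handle_request(limiter: RateLimiter, user: str) -> str:
    # The caller depends on a single method, not on how counting works.
    return "ok" if limiter.allow(user) else "rejected"
```

An agent can rebuild the inside of `CountingLimiter` with sliding windows, persistence, or anything else it finds plausible. As long as `allow` keeps its shape, the rest of the system has nothing new to learn.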

Figure: the same components, wired two ways. As a monolith, everything can reach everything: a hairball where you can't trace a single connection by eye. As bounded modules, the cross-cluster wiring collapses to a handful of legible edges, because each interface caps what the rest of the system can see. Thin surface area isn't taste. It's a cap on the exponent.
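The two wirings can be counted. This is a back-of-the-envelope sketch under assumptions of my own: equal-sized clusters, and in the bounded case each cluster exposes exactly one interface that may talk to every other cluster's interface.

```python
def cross_cluster_edges_monolith(clusters: int, size: int) -> int:
    """Links that cross cluster lines when every component can reach every other."""
    n = clusters * size
    total = n * (n - 1) // 2
    inside = clusters * (size * (size - 1) // 2)
    return total - inside


def cross_cluster_edges_bounded(clusters: int) -> int:
    """Links that cross cluster lines when only the interfaces can see each other."""
    return clusters * (clusters - 1) // 2


if __name__ == "__main__":
    print(cross_cluster_edges_monolith(clusters=4, size=5))
    print(cross_cluster_edges_bounded(clusters=4))
```

With four clusters of five components, that is 150 possible cross-cluster links in the monolith against 6 between bounded modules. The code inside each cluster is the same size either way.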

Now, the word "microservices" is right there, and I want to head it off, because it conjures Kubernetes sprawl, network hops, and distributed-tracing misery, and that's not the point. The point is surface-area control, and you can get almost all of it without a single network call: tiny libraries, narrow packages, bounded modules inside a perfectly ordinary monolith, one-off internal tools with a single clear input and output. The unit that matters isn't the deployment artifact.

The AI-era unit of architecture is not the service. It is the boundary.

The Compressor

Pull the threads together and they all run through the same place: the human.

An agent is a decompressor. It takes a compact intent (say, "fix this bug" or "add telemetry") and expands it into verbose implementation, complete with the costume of engineering maturity: abstractions, options, defensive layers, docs, tests. The engineer's job is the inverse function. Compress the generated sprawl back down to the smallest thing that preserves the behavior you actually need. And here's the non-negotiable part: keep a working model of the result in your head. The moment you say "I don't know how this part works, but the agent made it pass," the rot has already won, because you've given up the one thing that was scarce.

This reframes seniority. The old leverage was "I can implement this." The new leverage is "I can decide what should exist, constrain it, verify it, and delete the rest." Deletion becomes a senior skill. Saying "no, we don't need this" becomes more valuable than producing the thing, because producing the thing is free now and producing the right small thing is not.

It also gives you a discipline for the flood of cheap software agents make possible, which is real and worth embracing, as long as you're honest about what each artifact is. Sort generated code into three buckets:

Ephemeral: generated, used once, deleted. A migration, a data cleanup, a throwaway benchmark. Perfectly good. The only rule is that it actually gets deleted.
Tool: small, tested, owned, documented. A narrow utility with a clear contract and no hidden state.
Product: reviewed, architected, monitored, maintained. The stuff you've committed to keeping alive.

The failure mode, the one behind every "nobody knows how this works" horror story, is ephemeral code silently graduating to product code without anyone deciding it should. A one-off script that quietly becomes load-bearing infrastructure isn't a tool. It's a fossil. Give generated artifacts an owner, an expiration, or an explicit promotion path, and the flood becomes an asset instead of a slow-motion landslide.
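One way to make the buckets enforceable is to write them down where a script can check them. The sketch below is a hypothetical registry of my own design, not an existing tool: `Artifact`, `Bucket`, and `problems` are invented names, and the rules are the simplest ones that match the three buckets above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Bucket(Enum):
    EPHEMERAL = "ephemeral"
    TOOL = "tool"
    PRODUCT = "product"


@dataclass(frozen=True)
class Artifact:
    path: str
    bucket: Bucket
    owner: Optional[str] = None
    expires: Optional[date] = None


def problems(artifact: Artifact, today: date) -> list[str]:
    """Return the reasons this artifact needs a human decision, if any."""
    found: list[str] = []
    if artifact.bucket is Bucket.EPHEMERAL:
        if artifact.expires is None:
            found.append("ephemeral code has no expiration date")
        elif artifact.expires < today:
            found.append("expired: delete it or promote it explicitly")
    elif artifact.owner is None:
        # Tools and products are kept, so someone has to own them.
        found.append(f"{artifact.bucket.value} code has no owner")
    return found
```

Run something like this in CI and a one-off script can't drift into production unnoticed. It either gets deleted on schedule or someone changes its bucket on purpose, with their name attached.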

The Cache

There's a habit worth examining. Engineers get attached to implementations. A module accumulates history (refactors, bug fixes, hard-won optimizations) and the impulse to protect it is understandable. That attachment made sense when code was expensive to produce. It makes considerably less sense when code is nearly free.

Lehman drew a second line, beyond the entropy law: the difference between programs derivable from a specification and programs embedded in the messy real world that must keep changing. The second kind rots. Almost all real software is the second kind, and the traditional defense is maintenance and good architecture.

But if code is cheap to regenerate from a spec and a test suite, the durable artifacts stop being the implementation and become the boundary and the tests. The implementation becomes a cache: something you can blow away and re-derive when the spec changes, when a better model arrives, or when you want to try a different approach without committing to it. Agents quietly push the rotting kind of code back toward the derivable end.

The right posture is detachment. Swap implementations freely. Treat code as provisional. The thing worth protecting isn't the specific lines; it's the contract those lines fulfill and the model in your head of what the system does. Those are the things that are hard to regenerate. The more behavior you can pin behind contracts and tests, the less precious any one implementation becomes. The rest is cached derivation.
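A small sketch of what "implementation as cache" can look like. The slug function and both implementations are hypothetical examples of mine. The part meant to last is `check_contract`; either implementation could be thrown away and regenerated.

```python
import re
from typing import Callable

# The contract: a title goes in, a URL slug comes out.
Slugify = Callable[[str], str]


def check_contract(slugify: Slugify) -> None:
    """The durable artifact: behavior any implementation must preserve."""
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Trim  me ") == "trim-me"
    assert slugify("already-a-slug") == "already-a-slug"


def slugify_v1(title: str) -> str:
    # First implementation: split on whitespace and rejoin with hyphens.
    return "-".join(title.lower().split())


def slugify_v2(title: str) -> str:
    # A regenerated implementation that uses a regular expression instead.
    return re.sub(r"\s+", "-", title.strip().lower())


# Both implementations are interchangeable as far as the contract is concerned.
for implementation in (slugify_v1, slugify_v2):
    check_contract(implementation)
```

If the contract misses a behavior you care about, swapping implementations will eventually break it. That is the signal to add a case to the contract, not to start protecting the implementation.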

The thing you maintain is no longer the code. It's the contract the code is generated from.

The agent wrote the code. You still own the state space.