Once it was simple: your entire application ran in a single process. One codebase, one deployment, everything in the same memory. The monolith. Not sexy, but manageable.
Microservices promised to do better. Separate services. Independent deployments. API gateways. Kubernetes. Service mesh. Everything neatly split up. Teams autonomous.
Until you look at runtime behavior.
Then you find that a simple user action depends on fourteen services, three databases, two queues, an external identity provider, and a caching layer that quietly makes the difference between 200ms and 12 seconds.
The monolith hasn’t disappeared. It’s just now distributed across the network. That only makes the coupling harder to see.
The illusion of independence
Microservices are sold as independently deployable components. In theory that’s true. And yes — many teams use message brokers and event-driven patterns to decouple services. That helps.
But look at the path a user actually hits: the API call that expects a response. There, async is rarely an option. Async messaging reduces runtime coupling, but doesn’t remove business coupling. A user placing an order wants a confirmation — not “we’ll let you know later.”
And it is exactly on those paths that I see the same thing emerge in virtually every project: Service A waits synchronously on B for authorization. B checks with C for pricing. C fetches inventory from D. Meanwhile the user is still waiting.
On paper those are four separate services. At runtime it’s one chain.
And chains have an unpleasant property: they are only as strong as their weakest link.
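To put a rough number on that weakest-link effect, here is a minimal sketch. It assumes the services fail independently and that a request succeeds only if every hop in the chain succeeds. The 99.9% figure is illustrative, not a measurement from any real system.

```python
from math import prod

def chain_availability(availabilities):
    """Availability of a synchronous chain, assuming independent failures."""
    return prod(availabilities)

# Hypothetical numbers: services A, B, C and D, each 99.9% available.
four_hops = chain_availability([0.999] * 4)
print(f"4 hops:  {four_hops:.4%}")

# The fourteen-service path from the introduction, same per-service figure.
fourteen_hops = chain_availability([0.999] * 14)
print(f"14 hops: {fourteen_hops:.4%}")
```

Under these assumptions the four-service chain lands at roughly 99.6% and the fourteen-service path at roughly 98.6%, even though every individual service reports 99.9%.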
A network call is not a function call
This difference is systematically underestimated.
A function call within a monolith is predictable: memory is local, latency is microseconds, hardly any external factors. A network call introduces a completely different failure model: latency, timeouts, retries, DNS issues, TLS handshakes, connection pooling, transient failures.
What cost microseconds locally suddenly becomes thread exhaustion and cascading failures under load. Not because the code is bad, but because the network doesn’t behave like memory, and your architecture acts as if it does.
And the insidious part: that degradation is rarely linear. It creates queues, retries, resource contention, and backpressure in every system that depends on it. The math behind it is simple but unrelenting: from queueing theory we know that wait times grow non-linearly with load, and steeply so as a service approaches saturation. A service at 50% utilization has a wait time equal to its processing time. At 90% it’s nine times as long. At 95% it’s nineteen times.
That means a service that becomes 300ms slower doesn’t just cause 300ms of extra latency. It pushes itself and everything that depends on it past a tipping point. Queues fill up, retries pile up, threads get exhausted. One slow link can drag down ten healthy services. And the actual source is almost never where the alarm goes off.
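The utilization figures above match the textbook M/M/1 queue (one server, random arrivals, random service times), where the average time spent waiting in the queue is ρ / (1 − ρ) times the processing time. The sketch below assumes that model. Real services with thread pools and bursty traffic differ in the details, but the curve has the same shape. The 100 ms processing time is a made-up number.

```python
def mm1_queue_wait(service_time_ms, utilization):
    """Mean time spent waiting in queue for an M/M/1 system.

    Wq = rho / (1 - rho) * service_time, valid only for 0 <= rho < 1.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization) * service_time_ms

SERVICE_TIME_MS = 100  # hypothetical average processing time

for rho in (0.50, 0.90, 0.95, 0.99):
    wait = mm1_queue_wait(SERVICE_TIME_MS, rho)
    print(f"utilization {rho:.0%}: ~{wait:.0f} ms waiting, "
          f"{wait / SERVICE_TIME_MS:.0f}x the processing time")
```

The last row is the one to watch: at 99% utilization the wait is 99 times the processing time, which is why a service that was fine yesterday can fall over after a modest slowdown.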
That is exactly what the Fallacies of Distributed Computing are about: “the network is reliable”, “latency is zero”, and the rest. And yet I see them come back every time, wrapped in a clean architecture diagram.
How a distributed monolith emerges
It rarely comes from bad engineers. It emerges because teams split functionality along logical lines — users, orders, inventory, payments, notifications — while the underlying business processes remain tightly coupled.
Placing an order still requires: authentication, inventory check, price calculation, payment, and notification. Only those dependencies now run over HTTP or gRPC instead of in-process.
The coupling doesn’t disappear. It becomes harder to see, harder to debug, and more sensitive to latency.
Observability is no longer a luxury
In a monolith you can still debug linearly. A stack trace tells you reasonably well where things go wrong. With microservices that luxury doesn’t exist.
An error message in Service A can be caused by a timeout in C, slow storage in D, backpressure in a queue, or a dependency that is “half broken” — just working enough not to trigger an alarm, but too slow to keep the system healthy.
Distributed tracing, correlation IDs, and structured logging are not optional here. They are the difference between “we understand our own system” and “we deploy and hope for the best.”
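As an illustration of what correlation IDs and structured logging look like in code, here is a minimal sketch using only the Python standard library. The header name X-Correlation-ID, the service name "orders", and the function names are assumptions for the example, not a standard. A real setup would more likely lean on a tracing library than on hand-rolled plumbing.

```python
import contextvars
import json
import logging
import uuid

# Holds the correlation ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "service": record.name,
            "correlation_id": correlation_id.get(),
            "message": record.getMessage(),
        })

def make_logger(service_name):
    """Build a logger that emits JSON lines for one (hypothetical) service."""
    logger = logging.getLogger(service_name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger

def handle_request(incoming_headers):
    """Reuse the caller's ID if present, otherwise start a new one."""
    cid = incoming_headers.get("X-Correlation-ID") or str(uuid.uuid4())
    correlation_id.set(cid)
    make_logger("orders").info("order received")
    # Send the same ID on every outgoing call so downstream logs line up.
    return {"X-Correlation-ID": cid}

outgoing_headers = handle_request({"X-Correlation-ID": "req-123"})
```

Because every service logs the same ID in a machine-readable field, a single search on that ID reconstructs the path of one request across A, B, C, and D.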
More services ≠ better architecture
Microservices solve real problems: team autonomy, independent deployments, targeted scalability, fault isolation.
But they introduce at least as much complexity: network behavior, versioning, deployment coordination, operational dependencies, and runtime instability.
That trade-off is too rarely made explicit. The question shouldn’t be “how do we split up the application?”, but “does our scale and team size justify this operational complexity?” For many organizations the honest answer is: no.
What you can do about it
Recognizing a distributed monolith is step one. But recognizing it without acting only produces cynicism. A few patterns that I see working effectively in practice:
Make dependencies asynchronous where possible. Not every request needs to be handled synchronously. Event-driven patterns with a message broker decouple services in time — and therefore in availability. An order doesn’t need to wait until the notification has been sent.
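The order-and-notification example can be sketched as follows. This is a toy under stated assumptions: an in-process queue stands in for the message broker, and the names place_order and notification_worker are made up. A real broker adds durability, delivery guarantees, and its own failure modes.

```python
import queue
import threading

# Stand-in for a broker topic; in production this would be an external
# message broker, not an in-process queue.
order_events = queue.Queue()
sent_notifications = []

def place_order(order_id):
    """Synchronous path: confirm the order, publish an event, return."""
    order_events.put({"type": "order_placed", "order_id": order_id})
    return {"order_id": order_id, "status": "confirmed"}

def notification_worker():
    """Asynchronous path: consumes events at its own pace."""
    while True:
        event = order_events.get()
        if event is None:  # sentinel: shut the worker down
            order_events.task_done()
            break
        sent_notifications.append(f"email for order {event['order_id']}")
        order_events.task_done()

worker = threading.Thread(target=notification_worker, daemon=True)
worker.start()

confirmation = place_order(42)  # returns without waiting for the email
order_events.join()             # only here so the demo can inspect results
order_events.put(None)
worker.join()
```

The point of the design is that place_order never calls the notification code. If the worker is slow or down, events wait in the queue and the user still gets a confirmation.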
Design for failure, not just for success. Circuit breakers, bulkheads, timeouts, and fallbacks are not optimizations. They are basic architecture the moment you communicate over a network.
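A circuit breaker with a fallback can be sketched in a few dozen lines. This is a simplified, single-threaded illustration, not a production implementation: the class, its parameters, and fetch_price are hypothetical, and it does not enforce timeouts itself, so the wrapped call is assumed to carry its own timeout.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the dependency."""

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, fallback=None, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                # Open: fail fast instead of tying up a thread.
                if fallback is not None:
                    return fallback()
                raise CircuitOpenError("dependency temporarily disabled")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            if fallback is not None:
                return fallback()
            raise
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result

attempts = []

def fetch_price(sku):
    """Simulated pricing call that always times out."""
    attempts.append(sku)
    raise TimeoutError("pricing service timed out")

breaker = CircuitBreaker(max_failures=2, reset_after_s=30.0)
for _ in range(3):
    price = breaker.call(fetch_price, "sku-1", fallback=lambda: "cached price")
```

After two failures the breaker opens, so the third request never reaches the pricing service and is answered from the fallback straight away. That is what keeps one slow dependency from exhausting the caller's threads.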
Dare to redraw boundaries. If two services always have to deploy together, always fail together, and always wait on each other synchronously — then they are not two services. Then it’s one service with a network call in between. Draw the conclusion and merge them.
Invest in observability before you need it. Adding distributed tracing and structured logging after the fact to a landscape of thirty services is a nightmare. Start with service two.
The real distinction
Good distributed architecture doesn’t distinguish itself when everything works. It distinguishes itself when parts of the system no longer do.
The question is not whether something fails. At sufficient scale something always fails. The question is whether your system is designed to handle that — or whether one slow database brings the entire landscape down.
That is the difference between microservices as an architectural choice and microservices as hype. And the honest conversation about that starts with daring to look at runtime behavior instead of architecture diagrams.