Chapter 16  ·  Part IV — Consistency and Correctness

The Consistency Model Landscape

What does it actually mean for a distributed system to be "consistent"? The word is used five different ways in five different conversations. This chapter gives each meaning a precise name — and tells you what each one costs.

Key Learnings — Read This If You're Short on Time

Linearizable

The gold standard. Behaves as if the whole system were a single machine. Once a write completes, every subsequent read anywhere in the system returns that new value. Very expensive — requires synchronous coordination across replicas. Most databases do not give you this by default.

Sequential

Weaker than linearizable. All operations appear to happen in some sequential order, and each client's operations appear in the order they were issued. But two different clients can see each other's operations in a different order. Cheaper than linearizable — no global wall-clock ordering required.

Causal

The sweet spot. If operation A causally influenced operation B, then everyone sees A before B. Unrelated (concurrent) operations can be seen in any order. Implementable without cross-datacenter round trips. Most applications — including most social products — only actually need this.

Eventual

The weakest guarantee that's still useful. If writes stop, all replicas will converge to the same value — eventually. Makes no promises about when, or what you read in the meantime. Enables maximum availability and partition tolerance. DynamoDB, Cassandra, DNS all use this model.

Session

The practical middle ground. Four session guarantees — read-your-writes, monotonic reads, monotonic writes, writes-follow-reads — can be layered on top of eventual consistency to give each individual user a sensible experience, without the cost of global strong consistency.

Key Insight

Most teams pick a model by accident, not by design. They use their database's default and hope for the best. The right approach is to identify the anomalies your application cannot tolerate, then choose the weakest model that prevents those anomalies.

The Word "Consistent" Is Broken

If you ask ten engineers what "consistency" means, you will get several different answers — and most of them will be right, just about different things. The word is used in at least four distinct ways in distributed systems:

In this chapter, we are talking about the last three — the read/write consistency models for distributed storage systems. These are the models that determine what a client is allowed to see when it reads from a distributed system. When engineers argue about whether their system is "consistent," they are almost always arguing at cross-purposes because they haven't agreed which model they're talking about.

The first step to choosing the right consistency model is having a precise vocabulary. So let's build one.

The Consistency Spectrum — Strongest to Weakest

Linearizable  →  Sequential  →  Causal  →  Eventual

← Stronger: more guarantees, higher cost, lower availability under network partitions
→ Weaker: fewer guarantees, lower cost, higher availability under network partitions

Moving left on this spectrum buys you more guarantees — your application is easier to reason about. Moving right buys you performance and availability. Neither end is universally better. The right place on the spectrum depends entirely on what anomalies your application can and cannot tolerate.

Linearizability — The Gold Standard

Linearizability is the strongest consistency model we commonly talk about. The idea is simple to state: a linearizable system behaves as if there is only one copy of the data, and all operations happen atomically at a single point in time.

More precisely: if a write completes before a read starts, that read must see the written value (or something even newer). There is no situation where a client reads a value that was overwritten before the read began.

Linearizability
Cost: High  ·  Also called: Atomic Consistency
What it guarantees
  • Once a write completes, all reads everywhere see it
  • Operations appear to happen at a single instant in time
  • The system looks like one machine to every client
Real-world uses
  • Distributed locks and leader election
  • Unique username / seat reservation systems
  • Incrementing counters without double-counting
  • ZooKeeper, etcd, Consul

A useful mental image: imagine all operations in the system are placed on a single timeline, like beads on a string. Each operation occupies exactly one point on that string. Reads always return the value of the most recent write to the left of them on that string. That's linearizability.

The Recency Guarantee

Another way to think about it: linearizability gives you a recency guarantee. Once a write is acknowledged, any client anywhere in the world that reads the same key will see that write. Not eventually — immediately. This is the "single machine" feeling.

Example — Seat Reservation

An airline seat is available. Two users click "book" at nearly the same time on different servers. With linearizability, exactly one of them gets the seat. The other gets a "seat no longer available" error. Without it, both could read "available," both could write "booked," and you oversell the flight.
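The seat-booking scenario can be made concrete with a toy single-copy register that exposes an atomic compare-and-set. This is a sketch, not a real implementation: in a distributed system the atomicity would come from consensus (e.g. Raft in etcd), while here a local lock merely stands in for it. All names are illustrative.

```python
import threading

class LinearizableSeat:
    """Toy single-copy register with an atomic check-and-book step.
    A local lock models the atomicity that a real system would get
    from consensus across replicas."""
    def __init__(self):
        self._state = "available"
        self._lock = threading.Lock()

    def book(self, user):
        # Atomically: read the state and update it as one indivisible step.
        with self._lock:
            if self._state == "available":
                self._state = f"booked:{user}"
                return True
            return False

seat = LinearizableSeat()
results = {}
threads = [threading.Thread(target=lambda u=u: results.update({u: seat.book(u)}))
           for u in ("alice", "bob")]
for t in threads: t.start()
for t in threads: t.join()

# Exactly one booking succeeds, regardless of how the two attempts interleave.
assert sorted(results.values()) == [False, True]
```

Without the atomic step (read, then write, as two separate operations), both clients could observe "available" and both writes could succeed, which is exactly the oversold-flight anomaly.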

Why Linearizability Is Expensive

Achieving linearizability in a distributed system requires coordination. When a write happens on replica A, replica B must know about it before it can serve a read. This means synchronous cross-replica communication before acknowledging writes.

In a single data center, this coordination takes microseconds to low milliseconds — usually acceptable. But in a geo-distributed system (say, US-East and US-West, or London and Singapore), a synchronous round trip can take 80–200ms. If your write must wait for acknowledgment from all replicas before returning, your write latency floor becomes your inter-datacenter round-trip time. This is often completely unacceptable.
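A back-of-envelope calculation makes that latency floor visible. The sketch below assumes a leader that must collect a majority of acknowledgments before returning; the round-trip times are illustrative numbers, not measurements.

```python
# Leader in one region replicates synchronously to three followers and must
# hold a majority (its own ack plus enough follower acks) before a
# linearizable write can be acknowledged.
follower_rtt_ms = sorted([65, 80, 200])   # illustrative RTTs to each follower
replicas = 1 + len(follower_rtt_ms)       # 4 replicas, leader included
majority = replicas // 2 + 1              # 3 acks needed in total
follower_acks_needed = majority - 1       # the leader's own ack is free

# The write cannot return before the slowest of the needed follower acks:
write_latency_floor_ms = follower_rtt_ms[follower_acks_needed - 1]
assert write_latency_floor_ms == 80
```

Note that the 200 ms replica never blocks writes here; quorum systems pay for the median replica, not the slowest one, which is why replica placement matters so much.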

The CAP Connection

CAP theorem says: under a network partition, you must choose between Consistency (linearizability) and Availability. A linearizable system that can't reach its other replicas must refuse to serve reads and writes — otherwise it risks returning stale data, which would violate linearizability. This is why ZooKeeper and etcd are unavailable when they lose quorum. That's a deliberate trade-off, not a bug.

What Databases Actually Give You

Most databases do not provide linearizability by default, even if their marketing says otherwise. Here's a reality check:

Sequential Consistency — A Useful Step Down

Sequential consistency is slightly weaker than linearizability. The difference is subtle but important.

Sequential Consistency
Cost: Medium
What it guarantees
  • All operations appear to happen in some sequential order
  • Each client's operations appear in the order they issued them
  • But no promise about real-time ordering across clients
Where you see this
  • CPU memory models (sequential consistency is the textbook baseline; real x86 hardware provides the slightly weaker TSO)
  • Some database "snapshot isolation" modes
  • Certain distributed queue implementations

With linearizability, the global order of operations must match real time — if write W completes before read R starts on the wall clock, R must see W. Sequential consistency drops this real-time requirement. The operations must appear in some consistent sequential order that all nodes agree on, and each client's operations must appear in program order — but that order doesn't have to match the actual timestamps.

The Practical Difference

Imagine Client A writes X=1, then Client B reads X. With linearizability, B must see 1. With sequential consistency, B might see 0 — as long as every node agrees on the same order of all operations. The global order is consistent, but it may not match the wall clock. This is confusing in practice, which is why sequential consistency is discussed more in academic literature than in production systems design.
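The distinction can be checked mechanically on a tiny history. The brute-force checker below is a toy (it enumerates every possible ordering, so it only works for a handful of operations); the history encodes the example above: A writes X=1, the write completes, and only then does B read X and get 0.

```python
from itertools import permutations

# Each op: (client, kind, value, start_time, end_time) on one register x (initially 0).
history = [
    ("A", "write", 1, 0, 1),   # A's write finishes at t=1
    ("B", "read",  0, 2, 3),   # B's read starts at t=2 and returns 0
]

def legal(order):
    """Every read must return the latest preceding write (or the initial 0)."""
    val = 0
    for _, kind, v, *_ in order:
        if kind == "write":
            val = v
        elif v != val:
            return False
    return True

def program_order_ok(order):
    """Each client's ops appear in the order that client issued them."""
    pos = {id(op): i for i, op in enumerate(order)}
    for c in {op[0] for op in history}:
        ops = sorted((op for op in history if op[0] == c), key=lambda op: op[3])
        if [pos[id(o)] for o in ops] != sorted(pos[id(o)] for o in ops):
            return False
    return True

def realtime_ok(order):
    """If op a ended before op b started, a must come first (the extra
    wall-clock requirement that linearizability adds)."""
    pos = {id(op): i for i, op in enumerate(order)}
    return all(pos[id(a)] < pos[id(b)]
               for a in history for b in history
               if a is not b and a[4] < b[3])

sequential  = any(legal(p) and program_order_ok(p) for p in permutations(history))
linearizable = any(legal(p) and program_order_ok(p) and realtime_ok(p)
                   for p in permutations(history))
print(sequential, linearizable)  # True False
```

The ordering "B reads, then A writes" explains the history, so it is sequentially consistent; but no ordering can both respect wall-clock time and let B read 0, so it is not linearizable.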

Causal Consistency — The Sweet Spot

Causal consistency is the model that most application developers actually want when they ask for "consistency." It is weaker than sequential consistency, but it captures the thing humans actually care about: if one thing caused another, everyone should see them in that order.

Causal Consistency
Cost: Low–Medium  ·  Best practical trade-off
What it guarantees
  • Causally related operations appear in the same order everywhere
  • "If you saw A, and then did B, everyone who sees B has also seen A"
  • Concurrent (unrelated) operations can be seen in any order
Real-world analogy
  • You can't see a reply to a comment before you see the comment
  • You can't see "account overdrawn" before seeing the withdrawal
  • Two independent posts can appear in any order

What "Causally Related" Means

Two operations are causally related if one could have influenced the other. Specifically:

  • They were issued by the same client, one after the other
  • One of them read the value the other wrote
  • They are linked by a chain of such steps (causality is transitive)

This is tracked using vector clocks or version vectors: a data structure in which each node keeps a counter for itself and for every other node it has heard from. When a write is sent, the vector clock travels with it, so a replica can tell whether it has seen all of an operation's causal predecessors before applying it.
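A minimal vector-clock sketch, assuming clocks are plain dicts mapping node name to counter (names are illustrative):

```python
def vc_merge(a, b):
    """Pointwise maximum: the clock of an event that has seen both a and b."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def vc_leq(a, b):
    """True if b has seen everything a has (a is a causal predecessor or equal)."""
    return all(count <= b.get(node, 0) for node, count in a.items())

def relation(a, b):
    if a == b:
        return "equal"
    if vc_leq(a, b):
        return "happened-before"
    if vc_leq(b, a):
        return "happened-after"
    return "concurrent"

alice_comment = {"alice": 1}
# Bob read Alice's comment before replying, so his reply's clock merges hers in:
bob_reply = vc_merge(alice_comment, {"bob": 1})
carol_post = {"carol": 1}              # written without seeing either

assert relation(alice_comment, bob_reply) == "happened-before"
assert relation(bob_reply, carol_post) == "concurrent"
```

A replica holding `bob_reply` can check `vc_leq(alice_comment, local_clock)` before applying it, which is exactly the "apply causal predecessors first" rule.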

Example — Comment Thread

Alice posts a comment. Bob reads Alice's comment, then replies to it. With causal consistency: anyone who sees Bob's reply has already seen Alice's comment. Without it: you could see Bob's reply ("Great point!") before you see what he was replying to. Instagram, Twitter, and most comment systems only need this — not full linearizability.

Why Causal Consistency Is Cheaper

Causal consistency does not require global coordination. A replica can serve reads immediately from its local state, as long as it has applied all causally preceding writes. Two replicas on different continents can process writes independently as long as causally related writes are applied in the right order. This means no synchronous cross-datacenter round trips — reads and writes can be served locally, with low latency.

The COPS Paper

A 2011 paper called COPS (Clusters of Order-Preserving Servers), from Princeton and CMU, showed that you could build a key-value store providing causal+ consistency (causal consistency plus convergent conflict handling) while keeping read and write latency local to each datacenter. This is the theoretical foundation for why causal consistency is achievable at global scale in a way that linearizability isn't.

Eventual Consistency — The Most Misunderstood Model

Eventual consistency is often described as "the database will catch up eventually" and left at that. This creates a false impression that it's almost-as-good-as-strong consistency, just with a bit of lag. That is wrong. Eventual consistency makes very few promises.

Eventual Consistency
Cost: Lowest  ·  Maximum availability
What it guarantees
  • If no new writes occur, all replicas will eventually converge to the same value
  • No timing promise on "eventually"
  • No promise about what you read in the meantime
Where it's appropriate
  • DNS record propagation
  • Shopping cart contents (merge, don't overwrite)
  • User preference/settings (last write wins is usually fine)
  • Aggregate counters (view counts, like counts)

The Anomalies Eventual Consistency Allows

This is the part that gets teams into trouble. Because eventual consistency makes almost no ordering guarantees, a client reading from an eventually consistent store can observe some deeply counterintuitive things:

  • Stale reads: a read returns a value that was overwritten long ago
  • Non-monotonic reads: consecutive reads flip between new and old values as they land on different replicas
  • Read-your-writes violations: you save something, refresh, and your own change is gone
  • Causally inconsistent reads: a reply becomes visible before the message it answers

The "Eventually" Problem

"Eventually" has no time bound. In a healthy system with low replication lag, replicas may converge in milliseconds. But under heavy write load, network issues, or a slow replica, "eventually" can be seconds, minutes, or longer. Systems that depend on eventual consistency must be designed to function correctly during divergence, not just after convergence.

Conflict Resolution: You Must Think About This

When two replicas accept concurrent writes to the same key, they will eventually need to reconcile them. Eventual consistency does not tell you how; that's up to you or the database. Common strategies:

  • Last write wins (LWW): keep the write with the highest timestamp and discard the rest; simple, but it silently loses concurrent updates
  • Application-level merge: keep both versions and let the application reconcile them (the shopping-cart approach: merge, don't overwrite)
  • Conflict-free replicated data types (CRDTs): data types whose merge is built in and order-independent (counters, sets, registers)
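Two common reconciliation strategies can be sketched in a few lines. The timestamps and item names below are made up for illustration; real systems attach the metadata automatically.

```python
# Last write wins: each value carries a timestamp; the highest one survives.
def lww_merge(a, b):
    """a and b are (timestamp, value) pairs from two diverged replicas."""
    return a if a[0] >= b[0] else b

# Merge-based resolution: reconcile by set union so no concurrent add is lost.
def cart_merge(a, b):
    return a | b

theme_r1 = (1000, "dark")
theme_r2 = (1005, "light")              # concurrent write with a later timestamp
assert lww_merge(theme_r1, theme_r2) == (1005, "light")   # r1's write is silently dropped

cart_r1 = {"book", "lamp"}
cart_r2 = {"book", "mug"}               # concurrent add on another replica
assert cart_merge(cart_r1, cart_r2) == {"book", "lamp", "mug"}  # nothing lost
```

The contrast is the point: LWW is fine for single-writer settings data, where losing one side of a conflict is harmless, but the union merge is what keeps a concurrently edited shopping cart intact.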

Session Guarantees — The Practical Layer

Here's the practical reality: most applications can't live with the full weakness of eventual consistency (read-your-writes violations are especially jarring for users), but they also don't need the full strength of linearizability. They need something in between — but not causal consistency over all clients globally, just sensible behavior for one user's session.

Session guarantees are exactly that. They are four properties that can be layered on top of an eventually consistent store to give each individual client session a coherent view, without the global coordination cost of strong consistency.

Session Guarantees
Cost: Low  ·  Per-client coherence
Read Your Writes

After you perform a write, any subsequent read in your session will return that value or something newer. You will never see a stale version of data you yourself just wrote. Implemented by routing all reads to the replica that accepted your write, or tracking a "write token" and waiting until the read replica has caught up.

Monotonic Reads

If you read a value at time T, any subsequent read will return a value at least as recent. Time never goes backwards for a given client. Implemented by tagging reads with a version, and routing subsequent reads to replicas that are at least as up-to-date.

Monotonic Writes

Writes from the same client are applied in the order they were issued. If you write A then B, no replica will apply B before A. Implemented by sequencing writes from each client.

Writes Follow Reads

If you read a value and then write, your write is causally after the value you read. Any replica that has your write has also applied the read that preceded it. This is the weakest form of causal tracking.

Read-your-writes is the most important session guarantee in practice. Users who just saved a form and then refreshed the page to find their change gone are having a bad experience. Most major web frameworks and cloud databases (DynamoDB with consistent read options, MongoDB sessions, Cosmos DB session tokens) provide read-your-writes within a session.

How Read-Your-Writes Is Implemented in Practice

After a write, the database returns a "write token" — essentially a version number or replica ID. The client sends this token with subsequent reads. If the read lands on a replica that hasn't applied that write yet, it either (a) waits until it catches up, (b) forwards the read to a replica that has, or (c) serves a consistent read from the primary. This adds a little overhead but avoids the jarring experience of reading stale data you just wrote.
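The token mechanism can be sketched as follows. This is a simplified model, not a real database's API: the "token" is just the primary's version counter, and the fallback shown is option (c), reading from the primary.

```python
class Replica:
    """A follower that applies replicated writes, possibly with lag."""
    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, key, value, version):
        self.data[key] = value
        self.version = version

class Primary(Replica):
    def write(self, key, value):
        self.version += 1
        self.apply(key, value, self.version)
        return self.version          # the "write token" handed to the client

def session_read(replica, primary, key, token):
    """Serve locally if the replica has caught up to the client's token;
    otherwise fall back to a consistent read from the primary."""
    if replica.version >= token:
        return replica.data.get(key)
    return primary.data.get(key)

primary, follower = Primary(), Replica()
token = primary.write("avatar", "new_url")        # follower hasn't replicated yet
assert session_read(follower, primary, "avatar", token) == "new_url"  # no stale read
follower.apply("avatar", "new_url", token)        # replication catches up
assert session_read(follower, primary, "avatar", token) == "new_url"  # served locally
```

A client without a token (or another user's session) can still read the lagging follower and see the old value; read-your-writes only protects the session that holds the token.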

Comparing the Models Side by Side

The table below shows which anomalies each model prevents. A tick means the anomaly is prevented (the model is safe from it). A cross means the anomaly is possible. An application that cannot tolerate a given anomaly needs a model with a tick for that row.

Anomaly                              Linearizable   Sequential   Causal   Session (RYW)   Eventual
Stale reads                          ✓              ~            ✗        ~               ✗
Read-your-writes violation           ✓              ✓            ✓        ✓               ✗
Causally inconsistent reads          ✓              ✓            ✓        ✗               ✗
Lost updates (concurrent writes)     ✓              ✗            ✗        ✗               ✗
Write skew                           ✓              ✗            ✗        ✗               ✗
Monotonic read violations            ✓              ✓            ✓        ~               ✗

✓ anomaly prevented  ·  ✗ anomaly possible  ·  ~ depends on scope (per-client vs. global)

Notice that lost updates and write skew are only prevented by linearizability. This is the key insight: if your application has concurrent writers that need to coordinate (think seat booking, inventory management, bank transfers), you need linearizability — or a database-level transaction that provides it. A weaker consistency model will not save you.

A Concrete Example: The User Profile Service

Let's ground these abstractions in a real scenario. Imagine a user updates their profile picture on a social platform. Several things need to happen:

// User profile update flow
1. User uploads new photo → stored in object storage
2. Profile record updated: { user_id: 123, avatar_url: "new_url" }
3. User is redirected back to their profile page
4. Profile page fetches user record and renders avatar
// The question: what does the read at step 4 return?

With linearizability: the read at step 4 always returns the new avatar. Guaranteed.

With eventual consistency: the read at step 4 might return the old avatar. The user just sees their profile looking unchanged. They wonder if the upload worked. They upload it again. Now you have two copies and a confused user.

With read-your-writes (session guarantee): the user always sees their own new avatar, because their session is pinned to a replica that has their write. Other users might briefly see the old avatar — that's fine, nobody else cares as much as the person who just changed it.

For this scenario, read-your-writes is the right answer. Full linearizability would add unnecessary cost. Pure eventual consistency would be user-hostile. The session guarantee is the minimum viable consistency for a good user experience.

How to Choose: A Decision Guide

The way to pick a consistency model is not to start with "what does our database offer?" — it's to start with "what anomalies can we not tolerate?"

  • Need a distributed lock or leader election (two processes must not both believe they hold the lock) → Linearizable
  • Unique constraint across distributed nodes (unique usernames, booking the last seat, unique order IDs) → Linearizable
  • Thread replies must appear after their parent post (social feeds, comment threads, message history) → Causal
  • User always sees their own writes (profile updates, settings changes, form saves) → Session (RYW)
  • Aggregate counters for views, likes, plays (exact real-time accuracy not required) → Eventual
  • DNS record propagation (best-effort; convergence over minutes is fine) → Eventual
  • User preferences and settings with a single writer per user (no concurrent write conflict possible) → Session
  • Financial account balance or inventory count (concurrent writers; cannot lose an update) → Linearizable

The Key Insight

Most web applications have a mix of requirements. The user's own profile needs read-your-writes. The activity feed needs causal consistency. The like counter is fine with eventual. The "last one available" hotel room needs linearizability. You don't have to pick one model for your entire system — you can and should use different models for different data, at the cost of added complexity in your data layer.

The Mistake Most Teams Make

The single most common mistake is picking a consistency model by default — using whatever the database ships with — and only discovering the anomalies when they hit production.

Cassandra defaults to eventual consistency. Engineers who use it as a drop-in replacement for a relational database start seeing weird behavior: counters not incrementing correctly, deletes coming back to life (the "zombie read" problem), profile updates reverting. These are not Cassandra bugs — they are the exact anomalies that eventual consistency permits. The mistake was applying a weakly consistent store to a use case that needed stronger guarantees.

The reverse mistake also happens: reaching for linearizability (or a fully serializable SQL transaction) for something that only needs session-level guarantees. This works correctly but adds unnecessary latency and creates a coordination bottleneck that limits how much the system can scale.

The discipline is to always explicitly state: for this piece of data, under this set of operations, what anomalies would break the application? Then choose the weakest model that prevents those anomalies. "Weakest sufficient" is the goal.

The Principle

Choose the weakest consistency model that prevents the anomalies your application cannot tolerate — not the strongest one your database supports.

The Most Common Mistake

Using a database's default consistency mode without examining whether it actually prevents the anomalies your application depends on — and discovering the gap in production rather than in design.

Three Questions for Your Next Design Review

  • For each key type of data in this system: which anomalies from the table above would cause visible bugs or data loss if they occurred?
  • What consistency model does our chosen database actually provide by default — and have we verified this against the official documentation, not just marketing copy?
  • Where we need linearizability: have we measured the latency cost of synchronous replication under realistic load, especially across availability zones?