Idempotency
The Superpower
Networks fail. Servers crash. Clients retry. In a distributed system, you will never fully prevent a request from being sent more than once. What you can do is make sure that sending it twice has the same effect as sending it once. That property is called idempotency, and it is the single most important tool you have for building correct distributed systems.
What's Coming in This Chapter
- Why retries are unavoidable, and what goes wrong without idempotency
- A precise definition of idempotency — not the loose one you've heard before
- The three delivery guarantees: at-most-once, at-least-once, exactly-once
- Why exactly-once is not what most people think it is
- Idempotency keys: how to design and store them correctly
- The deduplication window and what happens when it closes
- How to make naturally non-idempotent operations safe to retry
- Concrete patterns for payments, emails, database writes, and message queues
- The most common mistakes teams make, and how to avoid them
Key Learnings
Short on time? Read this section. It captures the most important ideas in the chapter.
Idempotency means: calling an operation multiple times produces the same result as calling it once. It does not mean the operation has no side effects. It means repeated calls do not produce additional side effects beyond the first.
Exactly-once delivery does not exist at the network level. The network can never guarantee a message was delivered exactly once. What you can achieve is at-least-once delivery + idempotent processing, which gives you the same practical result.
Idempotency keys are the primary implementation tool. The caller generates a unique key per logical operation. The server stores that key and uses it to detect and discard duplicate requests, returning the same response as the first call.
The deduplication window is finite. You cannot store idempotency keys forever. When a key expires, old retries are treated as new requests. Your deduplication window must be longer than your client's longest retry cycle.
Generate idempotency keys as early as possible — at the edge, not in the middle of a call stack. Once an operation is partially complete, it is too late to assign an idempotency key. The key must travel with the request from the very first attempt.
Database upserts, conditional writes, and natural keys are your friends. Many operations can be made idempotent by design, without an explicit idempotency key mechanism. INSERT ... ON CONFLICT DO NOTHING is often all you need.
At-most-once is not the safe default. Teams often choose at-most-once to avoid duplicates, but it means operations silently disappear when the network fails. For most systems, at-least-once + idempotency is the correct tradeoff.
Idempotency and atomicity are different problems. Idempotency protects against duplicate execution. Atomicity protects against partial execution. You usually need both. A retry-safe operation can still leave your system in a half-updated state if it's not atomic.
The Problem With Retries
Picture a simple scenario. A mobile app sends a request to charge a customer $50. The server receives the request, charges the card successfully, but then the server crashes before it can send back a response. The app never gets a reply. What does it do?
If the app does nothing, the customer is charged but never sees a confirmation. Bad. If the app retries the request, the customer might be charged twice. Also bad. There is no way to know, from the client's perspective, whether the first request succeeded.
This is not an edge case. It is the fundamental problem with remote calls. A network request has three possible outcomes: success, failure, or unknown. The third outcome — where the request was sent but no reply came back — is the one that causes real problems. And it happens constantly, at any scale.
The instinct is to avoid retries. But that instinct is wrong. Retries are not the problem — retries without idempotency are the problem. In a distributed system, some form of retry is unavoidable. Load balancers retry. Message queues deliver at least once. Client libraries retry on timeout. If your server is not designed to handle duplicate requests safely, you are building on a broken foundation.
Even if your application code does not retry, your infrastructure might. An HTTP load balancer may replay a request when a backend node goes down. A Kubernetes pod restart may cause an in-flight request to be resent. A TCP connection reset may trigger a reconnect-and-resend. Idempotency protects you from all of these, not just the retries you know about.
What can go wrong without idempotency
Let's make the failure modes concrete. Suppose your service processes payments, sends order confirmation emails, and creates records in a database. Without idempotency:
- Payments: A customer gets charged twice for the same order. The first charge succeeded but the response was lost. The client retried. Now you have a very unhappy customer and a support ticket.
- Emails: A user gets five welcome emails because the message queue delivered the same event five times before your consumer acknowledged it.
- Database writes: A duplicate row is created for the same order. Your "unique" order ID appears twice. Your reporting is wrong. Your inventory count is off.
None of these failures require a dramatic outage. They happen under normal operating conditions — a slow network, a brief server hiccup, a consumer restart. Without idempotency, the system is subtly wrong in ways that are hard to detect and expensive to fix.
What Idempotency Actually Means
The word comes from mathematics. An operation f is idempotent if applying it multiple times has the same effect as applying it once: f(f(x)) = f(x). In computer science, we use it slightly differently.
An operation is idempotent if performing it multiple times has the same observable effect as performing it exactly once.
The key word is observable effect. The operation may execute many times internally. The point is that the state of the system — the thing the caller cares about — is the same regardless of how many times the call was made.
Some operations are naturally idempotent:
- Setting a value: SET price = 50 can run ten times; the price is always 50.
- Deleting a record: the first call removes it; every later call is a no-op.
- Fetching data: a GET request never changes state.
Other operations are naturally not idempotent:
- Incrementing a counter: counter += 1 run three times gives a different result than run once.
- Appending to a list: you get duplicate entries.
- Charging a credit card: the bank runs the charge each time.
- Sending an email: the recipient gets one per call.
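The distinction in the two lists above can be made concrete in a few lines. This is a toy sketch, not production code: setting a value satisfies f(f(x)) = f(x), incrementing does not.

```javascript
// Idempotent: applying it twice has the same effect as applying it once.
const setPrice = (account) => ({ ...account, price: 50 });

// Not idempotent: every application changes the result again.
const increment = (counter) => counter + 1;

const once = setPrice({ price: 10 });
const twice = setPrice(setPrice({ price: 10 }));
// once.price and twice.price are both 50 — the retry was harmless.
// increment(increment(0)) differs from increment(0) — the retry changed state.
```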
The goal of this chapter is to show you how to take the naturally non-idempotent operations and make them safe to retry.
In HTTP terms, a "safe" method is one that has no side effects at all (GET, HEAD). An "idempotent" method may have side effects, but repeating it has no additional effect (PUT, DELETE). POST is neither safe nor idempotent by default — which is why POST is the HTTP method most in need of an idempotency strategy.
Delivery Guarantees: The Three Options
When a message or request is sent across a network, there are three guarantees a system can offer about how many times a receiver will process it. Understanding these precisely is essential — because the right design choice depends on knowing exactly what you're trading.
At-Most-Once: Fire and Forget
The sender sends the message. It does not retry. If the message is lost, it is lost. The receiver processes it zero or one times.
At-most-once is simple. The downside is that operations can silently disappear. The system has no way to tell the caller whether anything happened. For most business-critical operations — payment, order creation, state mutation — silent failure is unacceptable.
At-most-once is appropriate for situations where the cost of processing twice is catastrophically higher than the cost of not processing at all, and where the caller does not need confirmation. Think of metrics counters or non-critical analytics events. You would rather drop a few data points than double-charge a customer.
At-Least-Once: Retry Until Acknowledged
The sender retries until it gets a confirmation. The receiver may process the same message more than once. The operation runs one or more times.
At-least-once is the delivery guarantee used by almost every message queue (Kafka, SQS, RabbitMQ) and by most HTTP retry logic. It ensures operations are not silently dropped. But it means you must design your processing to be idempotent — because the same message will eventually arrive more than once.
At-least-once + idempotent processing is the combination that powers the vast majority of reliable distributed systems.
Exactly-Once: The Illusion and the Reality
Exactly-once sounds like the obvious answer. Send the message, process it once, done. The problem is that exactly-once delivery is impossible to guarantee at the network level.
Here is why. When the server processes a message and the acknowledgment gets lost, the client has no way to know whether processing happened. From the client's point of view, the request vanished. The client must choose: retry (risk processing twice) or give up (risk not processing at all). There is no third option that guarantees exactly one execution without cooperation from both sides.
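The dilemma can be modeled in a few lines. In this toy sketch (all names illustrative), the acknowledgment for the first attempt is lost, so the client retries; without deduplication the side effect runs twice, with deduplication it runs once.

```javascript
// Toy model of the lost-acknowledgment problem.
function makeServer({ dedup }) {
  const seen = new Set(); // idempotency keys we have already handled
  let executions = 0;
  return {
    handle(key) {
      if (dedup && seen.has(key)) return 'duplicate-ignored';
      seen.add(key);
      executions++; // the side effect, e.g. charging a card
      return 'ok';
    },
    executions: () => executions,
  };
}

// Without deduplication: the ack is lost, the client retries,
// and the side effect runs twice.
const naive = makeServer({ dedup: false });
naive.handle('op-1');
naive.handle('op-1');

// With deduplication: the retry is detected and discarded.
const safe = makeServer({ dedup: true });
safe.handle('op-1');
safe.handle('op-1');
```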
When a vendor says their message queue delivers "exactly-once," read the fine print. What they mean is: the system makes it appear as though exactly-once delivery happened, by using at-least-once delivery plus idempotent deduplication on the consumer side. The network is still unreliable. The magic happens in the deduplication layer.
Kafka's "exactly-once semantics" (introduced in version 0.11) is a good example. Under the hood, it uses a producer ID and sequence number. The broker detects duplicate writes using these identifiers and discards them. The consumer processes each logical message once because duplicates never reach it. But the network still retries. The "exactly-once" is an abstraction built on top of at-least-once.
The practical lesson: do not chase exactly-once delivery. Build at-least-once delivery and make your operations idempotent. The end result is the same, and it works across every transport layer — HTTP, queues, RPC frameworks, all of them.
| Guarantee | Possible Executions | Use When | Risk |
|---|---|---|---|
| At-most-once | 0 or 1 | Non-critical events, metrics, best-effort analytics | Silent data loss |
| At-least-once | 1 or more | Critical operations with idempotent processing | Duplicate processing (handled by idempotency) |
| Exactly-once | Exactly 1 | Not achievable purely at transport layer | Achieved via at-least-once + deduplication |
Idempotency Keys
The most general-purpose tool for making an operation idempotent is the idempotency key. The idea is simple: the caller attaches a unique identifier to each logical operation. The server stores that identifier. If the same identifier arrives again, the server returns the result of the original execution instead of running the operation again.
An idempotency key answers the question: "Have I seen this exact logical request before?" It is distinct from a request ID (which identifies a single HTTP connection) or a transaction ID (which identifies a database transaction). An idempotency key identifies a logical operation that the caller wants to happen exactly once, regardless of how many times the physical request is sent.
Return to the payment scenario from the start of the chapter, now with a key attached: the first attempt charges the card, the response is lost, and the client retries with the same key. From the client's perspective, both attempts returned the same result. The customer was charged once. The idempotency key made the retry safe.
Where to Generate Idempotency Keys
The key must be generated by the caller, before the first attempt. This is a rule worth memorizing.
If the key is generated by the server, it is useless — the client cannot include it in the first request, so the first request is already unprotected. And if the first request is unprotected, you have no safety against the scenario where the first response is lost.
Idempotency keys are almost always UUIDs (v4). They should be:
- Globally unique — across all clients and all time, not just within one session.
- Generated once per logical operation — not regenerated on retry. If the client regenerates the key on each retry, it defeats the purpose entirely.
- Stored by the client until the operation is confirmed — so that if the client itself crashes and restarts, it can resume with the same key.
This is by far the most common implementation error. The developer generates a UUID, sends the request, gets a timeout, generates a new UUID for the retry. Now the server sees two different keys and processes both. You get double charges, duplicate records, or duplicate emails — exactly the problem you were trying to prevent.
In practice, this means idempotency key storage needs to survive client restarts. For a mobile app, store it in local persistent storage before making the first call. For a backend service, store it in your database as part of the same transaction that initiates the logical operation.
// WRONG: new key on every attempt
async function chargeWithRetry(amount) {
  for (let attempt = 0; attempt < 3; attempt++) {
    const key = generateUUID(); // BUG: a fresh key each time — the server sees three distinct operations
    await charge(amount, key);
  }
}
// CORRECT: same key for all attempts of one logical operation
async function chargeWithRetry(amount) {
  const key = generateUUID(); // generated once, before any attempts
  for (let attempt = 0; attempt < 3; attempt++) {
    try {
      return await charge(amount, key); // same key on every retry
    } catch (err) {
      if (attempt === 2) throw err; // out of retries
    }
  }
}
How to Store Idempotency Keys on the Server
The server needs to check whether it has seen a key before and store the result atomically. There are two challenges here: the check-and-store must be atomic, and you need to handle the "in-flight" case — where a request with a given key is currently being processed.
The Phases of an Idempotent Request
When a request with an idempotency key arrives at your server, it goes through three phases:
- Not seen before: Insert the key into storage with a "processing" status and execute the operation.
- In flight: The key exists but its status is "processing." Another thread is currently working on it. Return a "retry later" response, or wait for the original execution to finish.
- Already completed: The key exists with a "completed" status and a stored response. Return the stored response immediately without re-executing.
-- Idempotency keys table
CREATE TABLE idempotency_keys (
key TEXT PRIMARY KEY,
status TEXT NOT NULL, -- 'processing' | 'completed' | 'failed'
request_hash TEXT, -- hash of request body (for mismatch detection)
response JSONB, -- stored response to return on replay
created_at TIMESTAMPTZ NOT NULL,
expires_at TIMESTAMPTZ NOT NULL -- when this key can be garbage collected
);
The critical detail: the initial insert of the key with "processing" status must happen atomically, before any side effects are executed. This way, if the operation fails halfway through, the key is still in "processing" state — preventing a concurrent retry from starting a parallel execution.
async function handlePayment(idempotencyKey, amount) {
  // Step 1: Try to claim the key atomically.
  // ON CONFLICT DO NOTHING + RETURNING yields a row only if WE inserted it.
  const claimed = await db.query(
    `INSERT INTO idempotency_keys (key, status, created_at, expires_at)
     VALUES ($1, 'processing', NOW(), NOW() + INTERVAL '24 hours')
     ON CONFLICT (key) DO NOTHING
     RETURNING *`,
    [idempotencyKey]
  );
  // Step 2: Handle each phase
  if (claimed.rows.length === 0) {
    // The key already existed — this request is a replay.
    const { rows: [existing] } = await db.query(
      `SELECT status, response FROM idempotency_keys WHERE key = $1`,
      [idempotencyKey]
    );
    if (existing.status === 'completed') {
      return existing.response; // return the stored result, do nothing
    }
    // 'processing' (another attempt still owns the key) or 'failed'
    throw new ConflictError('Request in progress, retry shortly');
  }
  // Step 3: We own the key — execute the operation
  try {
    const charge = await paymentProvider.charge(amount);
    // Step 4: Store the result under the key
    await db.query(
      `UPDATE idempotency_keys
       SET status = 'completed', response = $1
       WHERE key = $2`,
      [{ chargeId: charge.id }, idempotencyKey]
    );
    return { chargeId: charge.id };
  } catch (err) {
    await db.query(
      `UPDATE idempotency_keys SET status = 'failed' WHERE key = $1`,
      [idempotencyKey]
    );
    throw err;
  }
}
What if a client sends two requests with the same idempotency key but different bodies — say, $50 the first time and $100 the second time? This is a client bug, but you should handle it gracefully. The best practice is to hash the request body and store it with the key. If a new request arrives with the same key but a different hash, return a 422 error rather than silently using the old result or running the new request.
The Deduplication Window
You cannot store idempotency keys forever. Storage costs money. More importantly, your keys table would grow without bound. Every real implementation has a deduplication window — a period of time after which an idempotency key is deleted and a new request with that key is treated as a fresh operation.
This is not a flaw in the design. It is an explicit, documented contract between the server and its clients.
The question is: how long should the deduplication window be?
The answer is determined by your client's retry behavior. If your client retries for up to 1 hour with exponential backoff, your deduplication window must be longer than 1 hour — otherwise a late retry could be treated as a new request. A common rule of thumb is to set the window to at least 2-3x your client's maximum retry duration.
Stripe, for example, uses a 24-hour deduplication window. Most payment systems use between 24 hours and 7 days. Longer windows consume more storage but provide more safety.
When the deduplication window expires, the behavior depends on what the operation does. For operations that are idempotent by design (like database upserts), expiry doesn't matter — the retry is a no-op. For operations like external API calls (charging a card), expiry of the deduplication window means a late retry could result in a real duplicate action.
This is why your client must not retry indefinitely. Set an explicit maximum retry duration, and make sure it is shorter than the server's deduplication window. Document both values. If they diverge over time (client team extends retry window, server team reduces TTL), you get silent bugs.
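The two values can be checked against each other mechanically. This sketch (all numbers illustrative) derives the client's longest possible retry cycle from a capped exponential-backoff policy and sizes the window from it:

```javascript
// Total time a client can keep retrying under capped exponential backoff.
function maxRetryDurationMs({ baseDelayMs, maxAttempts, maxDelayMs }) {
  let total = 0;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    total += Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
  }
  return total;
}

const clientRetryMs = maxRetryDurationMs({
  baseDelayMs: 1000,
  maxAttempts: 8,
  maxDelayMs: 60_000,
});
// Rule of thumb from the text: window >= 2-3x the longest retry cycle.
const dedupWindowMs = 3 * clientRetryMs;
```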
Making Non-Idempotent Operations Idempotent
Idempotency keys handle the general case. But there are simpler, lower-cost approaches that work well for specific types of operations. Before reaching for an idempotency key, check whether the operation can be made inherently idempotent by changing its form.
Use absolute values instead of deltas
Increments are not idempotent. But setting an absolute value is.
-- NOT idempotent: if retried, balance is wrong
UPDATE accounts SET balance = balance - 50 WHERE id = 42;
-- Idempotent: always produces the same result
UPDATE accounts SET balance = 950 WHERE id = 42 AND balance = 1000;
The second form is a conditional write — it only applies if the current balance matches what we expect. If it doesn't match (because the operation already ran), it is a no-op. This is the same technique used by optimistic locking, but here we are using it for idempotency.
Use upserts with natural keys
If your operation creates or updates a record, use the natural identity of the record as the deduplication key, rather than a separate idempotency key system.
-- Creates a row if it doesn't exist, does nothing if it does
INSERT INTO orders (order_id, user_id, amount, status)
VALUES ('ord_abc123', 42, 50, 'pending')
ON CONFLICT (order_id) DO NOTHING;
-- Or update only specific fields if already exists
INSERT INTO orders (order_id, user_id, amount, status)
VALUES ('ord_abc123', 42, 50, 'pending')
ON CONFLICT (order_id) DO UPDATE SET
updated_at = NOW() -- only safe to update fields that don't change meaning
WHERE orders.status = 'pending';
This is often all you need for database operations. The natural key (order ID, event ID, payment reference) is the idempotency key. No separate mechanism required.
Example: Payment Processing
Payment systems are where idempotency mistakes have the most visible consequences. Let's walk through a complete design.
The flow: a user clicks "Pay." Your backend calls a payment provider (Stripe, Braintree, etc.) to charge the card. The payment provider charges the card. Your backend records the result. Your backend sends an email receipt.
There are four places where a failure can cause a retry:
- Between your backend and the payment provider — you call the provider, the call times out, you don't know if the charge happened.
- Between the provider's processing and the provider's response — the charge happened, but the response was lost in transit.
- Between your backend recording the result and the caller's confirmation — you updated your database, but your response to the frontend was lost.
- The entire request, retried by the client from scratch.
The correct design handles all four:
- Frontend generates a payment intent key before calling the backend. Stores it locally. Uses the same key if it retries.
- Backend passes the same key to the payment provider. Stripe, Braintree, and most payment APIs accept an idempotency key header. If the same key arrives twice, they return the result of the original charge — the card is not charged again.
- Backend records the result with the key. If the backend crashes after the charge succeeds but before it records, the next retry replays the same key to the payment provider, gets the same charge ID back, and records it.
- Email sending is a separate, independent idempotent operation. It has its own key. Sending the email twice is undesirable but non-catastrophic compared to charging twice.
Most production payment APIs have idempotency key support built in. Stripe has had it since 2013. When you pass the same idempotency key to Stripe's API, it returns the same response — whether the original succeeded, failed, or is still in progress. This means your idempotency problem for the external call is largely solved by your payment provider. Your job is to make sure your own service handles retries correctly around that call.
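A sketch of the backend half of step 2. The provider client is injected so the flow is testable; real SDKs accept an idempotency key per request, though the exact option name varies by provider, and `chargeOrder` and `order.paymentKey` are names invented here for illustration:

```javascript
// Pass the caller-generated key through to the payment provider unchanged.
async function chargeOrder(provider, order) {
  // order.paymentKey was generated by the frontend before the first attempt
  // and is resent unchanged on every retry.
  return provider.charge(
    { amount: order.amount, currency: 'usd' },
    { idempotencyKey: order.paymentKey }
  );
}
```

Because the same key reaches the provider on every attempt, a retry after a lost response returns the original charge rather than creating a second one.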
Example: Email Sending
Email is a good example of an operation where idempotency is harder than it looks, but where partial idempotency is often acceptable.
A fully idempotent email system needs to store which emails have been sent, keyed by the logical event that triggered them, and check that store before sending. If the store says "sent," skip the send.
-- Store per email send attempt
CREATE TABLE email_log (
event_id TEXT NOT NULL, -- e.g., 'order_confirm:ord_abc123'
email_type TEXT NOT NULL,
sent_at TIMESTAMPTZ,
PRIMARY KEY (event_id, email_type)
);
-- Before sending, check the log
INSERT INTO email_log (event_id, email_type, sent_at)
VALUES ('order_confirm:ord_abc123', 'receipt', NOW())
ON CONFLICT (event_id, email_type) DO NOTHING
RETURNING *;
-- If RETURNING is empty, this event was already sent — skip the send
In practice, many teams accept a small risk of duplicate emails for non-financial cases (password reset, welcome email), because the storage cost of a deduplication table and the complexity of the check may outweigh the occasional duplicate. The important thing is to make a deliberate decision — not to forget the problem entirely.
Idempotency in Message Queues
Message queues deserve special attention because they make at-least-once delivery explicit in their contract. Every major queue — SQS, Kafka, Pub/Sub, RabbitMQ — guarantees at-least-once delivery by default. Your consumer will receive the same message more than once. Designing for this is not optional.
The good news is that message queues give you natural idempotency keys for free: the message ID. Every message in every major queue has a unique, stable identifier. Store that ID in your processing log and use it for deduplication.
async function processMessage(message) {
  // Claim the message ID and do the work in one transaction, so that a
  // crash mid-processing rolls back the claim and the message is retried.
  await db.transaction(async (tx) => {
    const claimed = await tx.query(
      `INSERT INTO processed_messages (message_id, processed_at)
       VALUES ($1, NOW())
       ON CONFLICT (message_id) DO NOTHING
       RETURNING message_id`,
      [message.id]
    );
    if (claimed.rows.length === 0) {
      return; // already handled — skip
    }
    await doActualWork(tx, message.body); // side effects share the transaction
  });
}
A few important nuances with message queues:
- SQS does not guarantee message ID uniqueness across all time — in rare cases, the same logical message can have different IDs on different deliveries. A more robust approach is to include a business-level identifier in the message body itself.
- Kafka's "at-least-once" applies per partition. If your consumer group rebalances, a message that was being processed but not yet committed may be redelivered. Your processing must be idempotent at the application level, not just the Kafka level.
- The deduplication store must be durable. Storing processed message IDs in memory defeats the purpose — a consumer restart will lose the list and reprocess everything.
Idempotency handles duplicates. It does not handle out-of-order delivery. If message B arrives before message A, and both are processed, making the processing idempotent does not help — you may process B first and then A overwrites something it shouldn't. Ordering is a separate concern, handled by Kafka's partition ordering guarantee or by storing version numbers and rejecting stale updates.
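The version-number defense mentioned above can be sketched in a few lines (the record and update shapes here are illustrative): an update is applied only if its version is newer, which makes processing both idempotent (replaying the same version is a no-op) and order-safe (a late, stale message cannot overwrite newer state).

```javascript
// Apply an update only if it is newer than what we already have.
function applyUpdate(record, update) {
  if (update.version <= record.version) {
    return record; // duplicate or stale — ignore
  }
  return { ...record, ...update.fields, version: update.version };
}
```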
Database-Level Idempotency Patterns
Beyond the patterns we've seen, there are a few database-level techniques worth calling out explicitly.
Conditional updates with version numbers
This is the optimistic locking pattern applied to idempotency. Every row has a version number. Updates only succeed if the version matches what you expect. This prevents stale retries from overwriting newer state.
UPDATE orders
SET status = 'shipped', version = version + 1
WHERE id = 42
AND version = 3; -- only applies if we're working from the right version
-- If 0 rows updated: either the row doesn't exist,
-- or the version has moved on (someone else updated it).
-- Either way, this is safe to retry without side effects.
Insert with natural uniqueness constraint
If the business domain has a natural identity for the record, declare it as a unique constraint and use ON CONFLICT DO NOTHING. The database enforces the idempotency for you.
-- One row per user per day — natural constraint
CREATE UNIQUE INDEX ON daily_summaries (user_id, summary_date);
INSERT INTO daily_summaries (user_id, summary_date, total_spent)
VALUES (42, '2024-01-15', 150)
ON CONFLICT (user_id, summary_date) DO NOTHING;
Atomic check-and-set
For state transitions — like moving an order from "pending" to "processing" — you can make the transition idempotent by only allowing it to happen from the specific prior state.
-- Only transition if currently in the expected state
UPDATE orders
SET status = 'processing'
WHERE id = 'ord_abc123'
AND status = 'pending'; -- no-op if already processing
-- Check the result:
-- 1 row updated = transition happened
-- 0 rows updated = already transitioned (idempotent no-op)
-- OR order doesn't exist (error to handle separately)
Common Pitfalls
Storing the idempotency key after the side effect
The key must be stored before the operation executes, not after. If you execute the operation first and then store the key, a crash between those two steps leaves the operation executed but the key unrecorded. The next retry executes again. This is the most dangerous implementation order.
Idempotency without atomicity
Idempotency and atomicity are related but different. An idempotent operation that is not atomic can still leave your system in a bad state if it fails halfway through. For example, suppose a payment flow involves: (1) charge card, (2) create order record, (3) decrement inventory. If step 2 fails after step 1 succeeds, retrying the whole flow with the same idempotency key will not charge the card again — but steps 2 and 3 will run again, potentially creating duplicate order records.
The solution is to wrap all the side effects in a database transaction where possible, or use the Saga pattern (covered in Chapter 13) for operations that span multiple services.
Different services, same key space
If multiple services share the same idempotency key store, a key generated by Service A might collide with a key expected by Service B. Always namespace your keys: prefix them with the service name, operation type, or tenant identifier. payment:charge:abc-123 is far safer than just abc-123.
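A one-line helper makes the convention hard to forget (the namespace scheme is this chapter's convention, not a standard):

```javascript
// service:operation:key — collisions across services become impossible.
const namespacedKey = (service, operation, key) => `${service}:${operation}:${key}`;
```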
Returning different responses on replay
When you replay a response from an idempotency key store, return the exact same response body and status code as the original. Do not re-compute. Do not re-query the database for fresh data and return that instead. The client made two requests that it believes are the same logical operation — it should get the same answer both times. Returning a different status code on replay (e.g., 200 first, then 409 on replay) will confuse well-written clients.
Forgetting that failures are also idempotency key results
What happens when the original request fails — the card was declined, the inventory was zero? You should store the failure result under the idempotency key just like a success. If the client retries, return the same failure response. Do not retry the operation on the client's behalf just because it failed the first time. The client knows it got a failure; it is retrying because it wasn't sure whether the failure was real or a network error. Your job is to tell it: yes, this really failed.
Bringing It Together
Idempotency is not one thing. It is a collection of techniques that share the same goal: making it safe for an operation to execute more than once without producing incorrect results. Some operations are naturally idempotent. Others need to be made idempotent through idempotency keys, conditional writes, upserts, or careful state machines.
The pattern that handles the widest range of cases is: generate a unique key at the caller, pass it with every attempt, store it on the server before executing, and record the result when done. This gives you a deduplication mechanism that works across service restarts, network failures, and concurrent retries.
The mental shift required is subtle but important. Instead of asking "how do I prevent retries?", ask "how do I make retries safe?" Retries are a feature of reliable systems, not a bug. The goal is not to eliminate them — it is to ensure that they are harmless.
Combined with at-least-once delivery, idempotency gives you the practical equivalent of exactly-once semantics without needing a fundamentally different transport layer. It is the single highest-leverage correctness technique available to you as a distributed systems designer.