Idempotency in Distributed Systems

The "Did the Request Get Through?" Problem

You're a payment service. A client sends "charge $100 from Alice." You charge the card. Right before you send the success response, the network drops.

The client doesn't know if the charge happened. So they retry. You receive the request again. Do you charge $100 again?

Without idempotency, yes. Alice gets charged twice. Lawsuit-level problem.

With idempotency: no. The retried request returns the same result as the first one without re-executing. Alice is charged exactly once.

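This is one of the most important problems in distributed systems because a network can fail at any point in any request: before the server sees it, while it is being processed, or after the work is done but before the response arrives. The client cannot tell these cases apart, so retrying is the only sane response. But retries only work safely if operations are idempotent.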

What Idempotency Actually Means

An operation is idempotent if performing it once has the same effect as performing it many times. set x = 5 is idempotent (the result is the same after one or ten executions). increment x is not (running it twice produces a different result than once).
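A minimal sketch of that difference. The function names and the dict-based state are illustrative assumptions, not part of any real API:

```python
def set_x(state: dict, value: int) -> dict:
    """Idempotent: the final state does not depend on how many times it runs."""
    state["x"] = value
    return state


def increment_x(state: dict) -> dict:
    """Not idempotent: every extra call changes the state again."""
    state["x"] += 1
    return state


once = set_x({"x": 0}, 5)
twice = set_x(set_x({"x": 0}, 5), 5)
print(once == twice)  # True

inc_once = increment_x({"x": 0})
inc_twice = increment_x(increment_x({"x": 0}))
print(inc_once == inc_twice)  # False
```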

Standard HTTP methods come with idempotency expectations:

GET: idempotent. Reading data many times is fine.
PUT: idempotent. "Set this resource to this state" gives the same result every time.
DELETE: idempotent. Deleting an already-deleted thing leaves the server in the same state (the second response may be a 404, but nothing changes).
POST: NOT idempotent by default. "Create a new order" creates a new one each time you call it.

The whole problem revolves around making POST-like operations safe to retry.

The Idempotency Key Pattern

The standard way to make operations idempotent: the client generates a unique ID per logical operation and sends it with the request. The server stores the result keyed by that ID. If the same ID arrives again, return the stored result instead of re-executing.

Idempotency Key Flow

Client generates a unique key (UUID)

Per logical operation. Same key for retries of the same operation.

Client sends key in header: Idempotency-Key: abc-123

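A de facto convention rather than a formal standard: Stripe and other payment APIs use it, and an IETF draft proposes standardizing it. Some APIs, including several AWS services, carry the same idea in a request parameter (a client token) instead of a header.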

Server checks: have I seen this key before?

Lookup in fast storage (Redis or a dedicated table).

If new: execute, store result, return it

Result includes status code and response body.

If seen: return the stored result without re-executing

Same exact response as the first time. Safe.
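The flow above can be sketched in a few lines. This is a single-process illustration under stated assumptions: IdempotentHandler and charge are hypothetical names, a dict stands in for Redis or a table, and there is no locking, so it does not handle concurrent retries.

```python
import uuid
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]  # (status code, response body)


class IdempotentHandler:
    """Hypothetical in-memory handler; a real service would use Redis or a table."""

    def __init__(self, execute: Callable[[dict], Response]) -> None:
        self._execute = execute
        self._results: Dict[str, Response] = {}

    def handle(self, idempotency_key: str, request: dict) -> Response:
        if idempotency_key in self._results:
            # Seen before: replay the stored response, do not re-execute.
            return self._results[idempotency_key]
        response = self._execute(request)
        self._results[idempotency_key] = response
        return response


charges = []  # stands in for the real side effect (charging a card)


def charge(request: dict) -> Response:
    charges.append(request["amount"])
    return 201, {"charge_number": len(charges), "amount": request["amount"]}


handler = IdempotentHandler(charge)
key = str(uuid.uuid4())  # one key per logical operation, reused on retry
first = handler.handle(key, {"amount": 100})
retry = handler.handle(key, {"amount": 100})
print(first == retry, len(charges))  # True 1
```

A request with a fresh key is a new logical operation and executes normally.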

Stripe's API as a Reference

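Stripe popularized this pattern. A request in that style looks like this (illustrative: the JSON body is shown for readability, while Stripe's API itself takes form-encoded parameters):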

POST /v1/charges
Idempotency-Key: 3a8f04ec-9e6c-4c7d-8c2f-5e1d4e2b3a8f
{
  "amount": 1000,
  "currency": "usd",
  "source": "tok_xxx"
}

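Stripe stores the first response for a given Idempotency-Key and replays it on retries. Its documentation says keys may be pruned once they are at least 24 hours old, so a retry within that window gets the same response, while a retry after it may be treated as a new request.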
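The retention window can be illustrated with a small expiring store. This is only a sketch of the idea, not how Stripe implements it: ExpiringResultStore is a hypothetical helper, and the clock is injectable so the example can simulate time passing without sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringResultStore:
    """Keeps responses for ttl_seconds, then forgets them (hypothetical helper)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def put(self, key: str, response: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, response)

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # window elapsed: the key is treated as new
            return None
        return response


now = [0.0]  # fake clock, in seconds
store = ExpiringResultStore(ttl_seconds=24 * 3600, clock=lambda: now[0])
store.put("abc-123", {"status": "succeeded"})

now[0] = 3600.0  # one hour later: still inside the window
print(store.get("abc-123"))  # {'status': 'succeeded'}

now[0] = 25 * 3600.0  # 25 hours later: expired
print(store.get("abc-123"))  # None
```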

Implementation Tactics

1. Database Constraint

Add a unique constraint on the idempotency key:

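CREATE TABLE charges (
    id UUID PRIMARY KEY,
    -- One row per logical operation; UNIQUE makes a duplicate insert fail.
    idempotency_key VARCHAR(255) UNIQUE NOT NULL,
    -- Explicit precision and scale: a bare DECIMAL has scale 0 in some databases.
    amount DECIMAL(12, 2) NOT NULL,
    status VARCHAR(50),
    created_at TIMESTAMP
);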

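Try to insert with the new key. If you get a unique violation, look up the existing row and return its result. The database enforces uniqueness atomically, so two concurrent inserts with the same key cannot both succeed. Do the insert before the side effect (or in the same transaction); inserting afterwards protects nothing.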
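A runnable sketch of the insert-then-lookup approach, using SQLite from the Python standard library so it needs no server. The create_charge function is a hypothetical name, and the simplified schema (TEXT ids, amounts as integer cents) is an assumption made to keep the example small.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE charges (
        id TEXT PRIMARY KEY,
        idempotency_key TEXT UNIQUE NOT NULL,
        amount_cents INTEGER NOT NULL,
        status TEXT NOT NULL
    )
    """
)


def create_charge(idempotency_key: str, amount_cents: int) -> dict:
    """Insert a charge, or return the existing row if the key was already used."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO charges (id, idempotency_key, amount_cents, status) "
                "VALUES (?, ?, ?, ?)",
                (str(uuid.uuid4()), idempotency_key, amount_cents, "succeeded"),
            )
    except sqlite3.IntegrityError:
        pass  # unique violation: this key was already processed
    row = conn.execute(
        "SELECT id, amount_cents, status FROM charges WHERE idempotency_key = ?",
        (idempotency_key,),
    ).fetchone()
    return {"id": row[0], "amount_cents": row[1], "status": row[2]}


first = create_charge("abc-123", 10000)
retry = create_charge("abc-123", 10000)
count = conn.execute("SELECT COUNT(*) FROM charges").fetchone()[0]
print(first == retry, count)  # True 1
```

The sketch only covers the database row. In a real payment flow the call to the card processor is a separate side effect and needs its own protection.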

2. Redis with TTL

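Faster, but with more moving parts: the lookup and the execution are separate steps, so you need a lock to keep concurrent retries from both executing. Store the key-to-result mapping in Redis with a 24-hour TTL.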

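import json

import redis

r = redis.Redis()  # assumes a reachable Redis server; adjust host/port as needed

def handle_request(idem_key, request):
    # execute() and wait_for_result() are application-specific and not shown here.
    cached = r.get(f"idem:{idem_key}")
    if cached:
        return json.loads(cached)

    # SET with NX acts as a per-key lock so only one attempt executes at a time.
    # The 60-second expiry must exceed the longest expected execution time.
    lock = r.set(f"idem-lock:{idem_key}", "1", nx=True, ex=60)
    if not lock:
        # Another retry of the same key is in flight
        return wait_for_result(idem_key)

    try:
        # Re-check: another attempt may have finished between the first
        # lookup and acquiring the lock.
        cached = r.get(f"idem:{idem_key}")
        if cached:
            return json.loads(cached)
        result = execute(request)
        r.set(f"idem:{idem_key}", json.dumps(result), ex=86400)  # keep 24 hours
        return result
    finally:
        r.delete(f"idem-lock:{idem_key}")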
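The snippet above calls wait_for_result without showing it. One possible shape is a simple poll with a deadline. This is a sketch under assumptions: the fetch parameter is a hypothetical lookup function (for example a thin wrapper around a Redis GET), injected here so the example runs without a Redis server, and the timeout and interval values are arbitrary.

```python
import json
import time
from typing import Callable, Optional


def wait_for_result(
    idem_key: str,
    fetch: Callable[[str], Optional[str]],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> dict:
    """Poll for the stored result of an in-flight request with the same key."""
    deadline = time.monotonic() + timeout
    while True:
        cached = fetch(f"idem:{idem_key}")
        if cached is not None:
            return json.loads(cached)
        if time.monotonic() >= deadline:
            # The caller can map this to an "in progress, retry later" response.
            raise TimeoutError(f"no result for key {idem_key!r} after {timeout}s")
        time.sleep(interval)


# Usage with a fake lookup standing in for Redis: the result appears on poll 3.
polls = {"count": 0}


def fake_fetch(key: str) -> Optional[str]:
    polls["count"] += 1
    return json.dumps({"status": "succeeded"}) if polls["count"] >= 3 else None


print(wait_for_result("abc-123", fake_fetch, interval=0.01))  # {'status': 'succeeded'}
```

Polling is the simplest option. Returning an "in progress" response immediately and letting the client retry is an equally valid design.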

3. Natural Idempotency Keys

Sometimes the data itself uniquely identifies the operation. An order with order_id "ord_abc" can use that ID as the idempotency key, with no separate header needed. Webhooks work this way: Stripe and Shopify deliveries each carry an event identifier that the receiver can deduplicate on.

The Edge Cases

Concurrent retries: the client sends the request, times out, retries before the original completes. Two requests with the same key are in flight simultaneously. Solution: a distributed lock per key. Only one runs; the other waits or returns "in progress."
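A sketch of the per-key lock idea, with a loud caveat: it uses in-process threading locks, so it only illustrates the control flow. Across multiple servers the lock has to live in shared storage (Redis, a database row, and so on). All names here are hypothetical, and the lock table is never pruned.

```python
import threading
import time
from collections import defaultdict
from typing import Callable, Dict

_guard = threading.Lock()  # protects the lock table itself
_key_locks: Dict[str, threading.Lock] = defaultdict(threading.Lock)
_results: Dict[str, dict] = {}


def handle(idem_key: str, execute: Callable[[], dict]) -> dict:
    with _guard:
        key_lock = _key_locks[idem_key]
    with key_lock:  # a second request with the same key blocks here
        if idem_key not in _results:  # re-check after acquiring the lock
            _results[idem_key] = execute()
        return _results[idem_key]


executions = []


def slow_charge() -> dict:
    time.sleep(0.1)  # simulate a slow downstream call
    executions.append(1)
    return {"status": "succeeded"}


threads = [
    threading.Thread(target=handle, args=("abc-123", slow_charge))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(executions))  # 1
```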

Different request body, same key: a buggy client reuses the key for different operations. Solution: hash the request body and store it with the key. If a retry comes with the same key but different body, return an error.
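A sketch of the body-hash check. The fingerprint and handle functions are hypothetical, and so is the choice of 422 as the error status. The body is serialized canonically (sorted keys) so that the same fields in a different order still match.

```python
import hashlib
import json
from typing import Dict, Tuple


def fingerprint(body: dict) -> str:
    """Hash a canonical JSON encoding so key order does not change the digest."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# key -> (fingerprint of the first request, stored response)
_seen: Dict[str, Tuple[str, dict]] = {}


def handle(idem_key: str, body: dict) -> Tuple[int, dict]:
    digest = fingerprint(body)
    if idem_key in _seen:
        stored_digest, stored_response = _seen[idem_key]
        if stored_digest != digest:
            # Same key, different payload: refuse rather than guess.
            # 422 is an assumed status code, chosen for illustration.
            return 422, {"error": "idempotency key reused with a different body"}
        return 200, stored_response
    response = {"charged": body["amount"]}  # placeholder for the real operation
    _seen[idem_key] = (digest, response)
    return 200, response


print(handle("abc-123", {"amount": 100, "currency": "usd"}))  # (200, {'charged': 100})
print(handle("abc-123", {"currency": "usd", "amount": 100}))  # (200, {'charged': 100})
print(handle("abc-123", {"amount": 999, "currency": "usd"})[0])  # 422
```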

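Storage failure during execution: you executed the operation but couldn't store the result, so the next retry executes again. To narrow this window, record an "in-progress" status before execution and "complete" after it; a retry that finds "in-progress" then knows an earlier attempt may already have taken effect. Where possible, make the execution itself idempotent too, for example by passing the same key on to the downstream service.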
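A sketch of the two-phase record, with a dict standing in for durable storage. The names and the "unknown" response are hypothetical; what a real service does with a stuck in-progress record (reconcile with the downstream system, alert, expire it) is a design decision this example does not settle.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class Status(Enum):
    IN_PROGRESS = "in_progress"
    COMPLETE = "complete"


# key -> (status, result); a dict stands in for durable storage here
_records: Dict[str, Tuple[Status, Optional[dict]]] = {}


def handle(idem_key: str, execute: Callable[[], dict]) -> dict:
    record = _records.get(idem_key)
    if record is not None:
        status, result = record
        if status is Status.COMPLETE and result is not None:
            return result
        # An earlier attempt started but never recorded completion. Do not
        # blindly re-execute; surface it so it can be reconciled.
        return {"status": "unknown", "detail": "previous attempt did not finish"}
    _records[idem_key] = (Status.IN_PROGRESS, None)  # written BEFORE executing
    result = execute()
    _records[idem_key] = (Status.COMPLETE, result)  # written AFTER executing
    return result


side_effects = []


def charge_then_crash() -> dict:
    side_effects.append("charged")
    raise ConnectionError("lost connection before the result was stored")


try:
    handle("abc-123", charge_then_crash)
except ConnectionError:
    pass

print(handle("abc-123", charge_then_crash)["status"])  # unknown
print(len(side_effects))  # 1
```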

Long-running operations: a charge takes 5 seconds and the client retries after 3, so two attempts overlap. This is the concurrent-retry case again, and the per-key lock handles it, provided the lock's expiry is longer than the operation can take.

Operations That Are Naturally Idempotent

Some operations don't need idempotency keys because they're already safe to repeat:

UPSERT operations: "set user X to email Y." Running twice produces the same state.
Conditional updates: "set order to shipped IF currently in pending state." Second call is a no-op.
State machines: only allow valid transitions. "Mark as paid" only works once because the state changes.
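The conditional-update case can be sketched with SQLite from the standard library. The orders table and the mark_shipped function are illustrative assumptions. The WHERE clause carries the precondition, so a repeated call matches no rows and changes nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO orders VALUES ('ord_abc', 'pending')")
conn.commit()


def mark_shipped(order_id: str) -> bool:
    """Move pending -> shipped. Returns True only if this call made the change."""
    with conn:  # commits on success, rolls back on exception
        cursor = conn.execute(
            "UPDATE orders SET status = 'shipped' "
            "WHERE id = ? AND status = 'pending'",
            (order_id,),
        )
    return cursor.rowcount == 1


print(mark_shipped("ord_abc"))  # True
print(mark_shipped("ord_abc"))  # False
```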

If you can design your operations to be naturally idempotent, you don't need keys. But for operations that genuinely create new things (charges, orders, account creations), keys are the way.

Idempotency in Event Processing

The same problem appears in messaging. Kafka, RabbitMQ, and SQS standard queues are typically run with "at least once" delivery, meaning your consumer might see the same event twice. To avoid double-processing:

Each event has a unique ID.
Consumer tracks which event IDs it has processed.
Skip events whose IDs have already been seen.

This makes the consumer effectively idempotent even though the message bus is not.
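A sketch of such a consumer. DedupingConsumer is a hypothetical class and the in-memory set is an assumption for brevity: a production consumer would keep processed IDs in durable storage, ideally updated in the same transaction as the processing itself.

```python
from typing import Callable, Iterable, Set


class DedupingConsumer:
    def __init__(self, process: Callable[[dict], None]) -> None:
        self._process = process
        self._seen_ids: Set[str] = set()  # in production: a durable store

    def consume(self, events: Iterable[dict]) -> int:
        """Process each new event once; return how many were processed."""
        handled = 0
        for event in events:
            if event["id"] in self._seen_ids:
                continue  # redelivery: already processed, skip
            self._process(event)
            self._seen_ids.add(event["id"])  # mark only after processing succeeds
            handled += 1
        return handled


balance = {"alice": 0}


def apply_credit(event: dict) -> None:
    balance[event["user"]] += event["amount"]


consumer = DedupingConsumer(apply_credit)
delivered = [
    {"id": "evt_1", "user": "alice", "amount": 10},
    {"id": "evt_2", "user": "alice", "amount": 5},
    {"id": "evt_1", "user": "alice", "amount": 10},  # redelivery
]
print(consumer.consume(delivered), balance["alice"])  # 2 15
```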

The One Thing to Remember

In a distributed system, the answer to "what if this request gets sent twice?" must always be: nothing bad. Idempotency is the property that makes that true. Idempotency keys are the standard tool for state-changing operations that aren't naturally idempotent. Bake them into your API design from day one. Retrofitting them is much harder than designing them in.