Why You Might Need One

You have multiple instances of a service running, and only one of them should perform some operation at a time. Examples:

A scheduled job that should run once across the cluster, not once per node.
Updating a shared resource where two simultaneous updates would corrupt it.
Coordinating expensive work so it isn't duplicated by parallel workers.
Ensuring a single leader for a task at any given time.

Inside a single process, you'd use a mutex. Across many machines, you need a distributed lock.

The Surprise: They're Much Harder Than They Look

Distributed locks are one of the most overengineered and underdesigned things in software. People reach for them constantly, often as the wrong solution. Even when they are the right solution, getting them right is genuinely difficult, because you have to handle network partitions, process pauses (GC stalls!), clock drift, and the possibility that the lock holder dies without ever releasing the lock.

Naive Approach: Set a Key in Redis

# Assumes `redis` is a connected redis-py client and node_id uniquely
# identifies this process (e.g. a UUID generated at startup).

# Acquire: nx=True sets the key only if it doesn't already exist,
# ex=30 makes it expire after 30 seconds.
acquired = redis.set("my-lock", node_id, nx=True, ex=30)

if acquired:
    do_work()
    redis.delete("my-lock")  # release

This looks reasonable. NX means "set only if not exists." EX 30 means "auto-expire after 30 seconds" so a dead holder doesn't block forever.

It works most of the time. But it has subtle bugs that can cause real problems.

Bug 1: Wrong Holder Releases the Lock

Process A acquires the lock. A's GC pauses for 35 seconds. The lock expires. Process B acquires it. A wakes up, finishes its work, calls redis.delete("my-lock"). It just deleted B's lock. Now C can acquire it while B still thinks it holds it.

Fix: only delete the lock if YOU still hold it. Use a Lua script for atomic check-and-delete:

-- release.lua
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end

Pass your unique node_id as ARGV[1]. Redis executes Lua scripts atomically, so the check and the delete can't be interleaved with another client's commands.
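
With redis-py, invoking it might look like this (a sketch; register_script returns a callable that loads the script once and calls it by hash afterwards):

release = redis.register_script(RELEASE_LUA)  # RELEASE_LUA = the script above, as a string

if acquired:
    try:
        do_work()
    finally:
        release(keys=["my-lock"], args=[node_id])  # returns 1 only if we still held it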

Bug 2: The Lock Expires Mid-Operation

Process A acquires the lock with a 30-second TTL. The work takes 35 seconds. The lock expires. B acquires it. Now A and B are both doing the work concurrently, which is exactly what the lock was supposed to prevent.

Fix options:

Watchdog / lease renewal: a background thread renews the TTL every 10 seconds while you hold the lock, and stops renewing when the work is done (a minimal sketch follows this list).
Make work shorter than the TTL: ensure the protected operation always finishes well within the lock duration.
Fencing tokens (next section): tolerate the case where two processes briefly hold the lock by ensuring downstream systems reject the older holder.
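
A minimal watchdog sketch, assuming the redis-py client and node_id from the earlier examples. The renewal script extends the TTL only if we still hold the lock, so it can never prolong a lock that has passed to a new holder:

import threading

RENEW_LUA = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("expire", KEYS[1], ARGV[2])
end
return 0
"""

def start_watchdog(client, node_id, ttl=30, interval=10):
    renew = client.register_script(RENEW_LUA)
    stop = threading.Event()
    lost = threading.Event()  # set if renewal ever fails: abort your work

    def loop():
        while not stop.wait(interval):
            if renew(keys=["my-lock"], args=[node_id, ttl]) == 0:
                lost.set()  # lock expired or was taken over
                return

    threading.Thread(target=loop, daemon=True).start()
    return stop, lost  # call stop.set() when the work is done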

Fencing Tokens: The Key Idea

Martin Kleppmann famously argued that distributed locks alone are not enough for correctness. The lock can't prevent process pauses; you also need the storage layer to enforce ordering.

Solution: every time the lock is acquired, the lock service issues a monotonically increasing fencing token. The protected resource (database, file system, whatever) checks the token on every operation and rejects writes from older tokens.
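
Sticking with the Redis example, one way to issue tokens is to fold an INCR of a counter into the acquire script, so the lock grant and the token come back together. A sketch, with made-up key names:

ACQUIRE_LUA = """
if redis.call("set", KEYS[1], ARGV[1], "NX", "EX", ARGV[2]) then
    return redis.call("incr", KEYS[2])
end
return nil
"""

acquire_with_token = redis.register_script(ACQUIRE_LUA)

# Returns a monotonically increasing token on success, None otherwise.
token = acquire_with_token(keys=["my-lock", "my-lock:token"], args=[node_id, 30])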

The failure sequence, step by step:

1. A acquires the lock and receives token 33.
2. A pauses (GC); the lock expires.
3. B acquires the lock and receives token 34.
4. B writes with token 34: accepted.
5. A wakes up and tries to write with token 33.
6. A's write is rejected, because the storage has already seen token 34.

This solves the GC-pause problem at the storage layer, where it can actually be enforced atomically. But it requires the storage to participate in the protocol.
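
To make the storage-side check concrete, here is a toy in-memory store that rejects stale tokens (FencedStore and RejectedWrite are invented for the sketch; a real system would perform the same compare atomically in the database or file server):

import threading

class RejectedWrite(Exception):
    pass

class FencedStore:
    def __init__(self):
        self._mutex = threading.Lock()  # stands in for the store's own atomicity
        self._max_token_seen = 0
        self._data = {}

    def write(self, token, key, value):
        with self._mutex:
            if token < self._max_token_seen:
                raise RejectedWrite(f"stale token {token} < {self._max_token_seen}")
            self._max_token_seen = token
            self._data[key] = value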

Redlock: Redis's Multi-Node Approach

To handle the case where Redis itself fails, Antirez (the Redis author) proposed Redlock: instead of one Redis node, deploy five independent ones. To acquire the lock, you must set the key on a majority (3 of 5) within a time budget smaller than the TTL; to release, you delete the key on every node.
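
A rough sketch of the acquire phase, assuming five independent redis-py clients in `nodes`. A real implementation also retries with jitter and subtracts a clock-drift margin; see the Redlock spec for the details:

import time, uuid

RELEASE_LUA = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""

def redlock_acquire(nodes, key, ttl_ms=30_000):
    token = str(uuid.uuid4())

    def try_set(n):
        try:
            return bool(n.set(key, token, nx=True, px=ttl_ms))
        except Exception:
            return False  # an unreachable node counts as "not acquired"

    start = time.monotonic()
    wins = sum(try_set(n) for n in nodes)
    elapsed_ms = (time.monotonic() - start) * 1000
    if wins >= len(nodes) // 2 + 1 and elapsed_ms < ttl_ms:
        return token  # majority acquired, with time to spare
    for n in nodes:  # failed: release wherever we did win
        try:
            n.register_script(RELEASE_LUA)(keys=[key], args=[token])
        except Exception:
            pass  # node may be down; the TTL will clean up
    return None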

Redlock has been controversial. Kleppmann argued it's not safe under certain failure scenarios. Antirez pushed back. The truth: Redlock works most of the time but has edge cases. For most use cases (where occasional double-execution is annoying but not catastrophic), it's good enough. For correctness-critical scenarios, use a real consensus system (etcd, ZooKeeper) instead.

ZooKeeper / etcd / Consul

The "right" way to do distributed locks. These systems use real consensus (Zab, Raft) and provide ephemeral nodes that auto-release on session loss.

The pattern with ZooKeeper:

1. Try to create an ephemeral sequential znode under /locks/my-resource/.
2. List all children. If yours is the lowest sequence number, you have the lock.
3. If not, watch the next-lower one. When it disappears, check again.
4. Hold the lock as long as your session is alive. Disconnect = automatic release.

This avoids the GC-pause problem because losing your ZooKeeper session releases the lock instantly. But your code still needs to handle "I lost my session" by aborting the work.
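
For illustration, roughly what this looks like with kazoo, a widely used Python ZooKeeper client whose Lock recipe implements the steps above (the host address and the do_work/abort_work helpers are placeholders):

from kazoo.client import KazooClient, KazooState

zk = KazooClient(hosts="127.0.0.1:2181")

def on_state_change(state):
    if state == KazooState.LOST:
        # Session expired: the server has already deleted our ephemeral
        # znode, so the work must be aborted on this side too.
        abort_work()

zk.add_listener(on_state_change)
zk.start()

lock = zk.Lock("/locks/my-resource", "node-42")
with lock:     # blocks until our znode has the lowest sequence number
    do_work()  # must keep checking for session loss and bail out on it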

When You Don't Actually Need a Distributed Lock

Many problems people try to solve with distributed locks have better solutions:

Idempotent operations: if your operation is idempotent (safe to retry), you don't need a lock. Just retry. See the idempotency article.
Optimistic concurrency: use compare-and-swap. Read a version number, do work, write back only if the version is unchanged; the database handles the concurrency (sketched after this list).
Atomic database operations: if the database supports the operation atomically (UPDATE with WHERE clause, INCR, etc.), use that instead.
Single-instance scheduling: use a job scheduler with built-in leader election (Kubernetes CronJob with concurrency policy, Sidekiq with unique jobs, etc.).
Leader election service: if you need "only one of these does the work," elect a leader and skip the lock entirely.
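
As one concrete shape of the optimistic-concurrency option, a version-checked UPDATE using Python's built-in sqlite3 (the accounts table and its columns are made up for the sketch):

import sqlite3

def update_balance(conn: sqlite3.Connection, account_id: int, delta: int) -> bool:
    balance, version = conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (balance + delta, account_id, version),
    )
    conn.commit()
    return cur.rowcount == 1  # False: someone else won the race; re-read and retry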

The One Thing to Remember

Distributed locks are a tool, not a solution. They protect the rare case where you really need mutual exclusion across machines and there's no alternative. For most concurrency problems, the right answer is to redesign the operation to be idempotent or use the database's built-in atomic operations. When you do need a real distributed lock, use ZooKeeper or etcd, add fencing tokens at the storage layer, and assume the worst about timing.