When One Database Is Not Enough

A single database server can hold a few terabytes and serve maybe 100,000 queries per second on great hardware. That sounds like a lot, but at scale it runs out fast: Twitter, Facebook, Stripe all have datasets in the petabytes.

You have two options when one machine isn't enough:

Vertical scaling: get a bigger machine. Costs grow exponentially. There's a hard ceiling. Easy to do, until it isn't.
Horizontal scaling: split data across many smaller machines. Costs grow linearly. No ceiling. Hard to do well.

Sharding is horizontal scaling for databases. Each "shard" is a regular database instance holding part of the data. Together they form one logical dataset.

Partitioning vs Sharding

These terms get used interchangeably but have a subtle difference:

Partitioning splits a table into pieces, often inside a single database (Postgres table partitions). The data still lives on one machine.
Sharding splits data across multiple separate database instances on different machines.

Partitioning helps query performance. Sharding helps both query performance AND total capacity. Most large systems do both.

Sharding Strategies

Strategy 1: Range-Based Sharding

Divide the keyspace into contiguous ranges. Each shard owns one range.

Range Sharding by User ID

Shard      Range                 Data
Shard 1    user_id 0 to 1M       Oldest users
Shard 2    user_id 1M to 2M      Mid-cohort
Shard 3    user_id 2M to 3M      Newer users

Pros: simple. Range queries (e.g., "all users created last month") are fast because data is contiguous.
Cons: hot shards. New users are usually more active. Shard 3 gets all the traffic while Shard 1 is idle. Same problem with time-based ranges.
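
To make the mechanics concrete, here is a minimal sketch of range-based routing in Python. The boundaries and shard names are illustrative, not pulled from any real deployment; a production system would load them from configuration.

    import bisect

    # Upper bound (exclusive) of each range, and the shard that owns it.
    RANGE_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
    SHARDS = ["shard_1", "shard_2", "shard_3"]

    def shard_for(user_id: int) -> str:
        """Return the shard that owns this user_id under range-based sharding."""
        idx = bisect.bisect_right(RANGE_BOUNDS, user_id)
        if idx >= len(SHARDS):
            raise ValueError(f"user_id {user_id} is outside all configured ranges")
        return SHARDS[idx]

    # Contiguity is what makes range scans cheap: a whole ID range
    # maps to one (or a few) shards instead of all of them.
    assert shard_for(42) == "shard_1"
    assert shard_for(1_500_000) == "shard_2"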

Strategy 2: Hash-Based Sharding

Hash the shard key and use modulo (or consistent hashing) to pick a shard.

shard_index = hash(user_id) % number_of_shards

Pros: uniform distribution. New and highly active users spread across shards naturally, so no shard runs hot just because it holds the latest data.
Cons: range queries are expensive (must hit every shard). Adding a new shard is painful with simple modulo (almost everything moves), so use consistent hashing instead.

This is the most common strategy in practice.
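
Here is a minimal sketch of hash-modulo routing, plus a quick demonstration of why resizing with plain modulo is painful. md5 is used only because Python's built-in hash() is not stable across processes; any stable hash function works.

    import hashlib

    def shard_index(user_id: int, num_shards: int) -> int:
        """Map a user_id to a shard using a stable hash and modulo."""
        digest = hashlib.md5(str(user_id).encode()).hexdigest()
        return int(digest, 16) % num_shards

    # The resizing problem: going from 4 to 5 shards remaps most keys.
    moved = sum(
        1 for uid in range(100_000)
        if shard_index(uid, 4) != shard_index(uid, 5)
    )
    print(f"{moved / 100_000:.0%} of keys would have to move")  # roughly 80%

That remapping is exactly what consistent hashing (or the pre-sharding trick described later) is designed to avoid.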

Strategy 3: Geographic Sharding

Shard by location. EU users on EU shard, US users on US shard, Asia users on Asia shard.

Pros: low latency (data is close to users). Compliance (EU data stays in EU per GDPR).
Cons: uneven load (most users might be in one region). Cross-region queries are expensive.
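
The routing itself is just a lookup once you know the user's region. The mapping below is made up for illustration; a real system would derive the region from signup data or a geo-IP service and keep the table in configuration.

    # Hypothetical region-to-shard mapping, for illustration only.
    REGION_SHARDS = {"EU": "shard_eu", "US": "shard_us", "APAC": "shard_apac"}
    DEFAULT_SHARD = "shard_us"

    def shard_for_region(region: str) -> str:
        return REGION_SHARDS.get(region, DEFAULT_SHARD)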

Strategy 4: Directory-Based Sharding

A separate lookup service maps each key to a specific shard. The lookup table can be updated to rebalance without changing keys.

Pros: maximum flexibility. Can move data between shards without changing keys.
Cons: the directory itself becomes a bottleneck and a single point of failure. Adds an extra hop on every query.
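
A minimal sketch of the idea, with a plain dict standing in for the directory service. In production the directory lives in its own highly available store and is aggressively cached by clients; the names here are illustrative.

    # The directory maps each key to the shard that currently holds it.
    directory = {
        "user:1001": "shard_a",
        "user:1002": "shard_b",
    }

    def lookup_shard(key: str) -> str:
        return directory[key]              # the extra hop on every query

    def move(key: str, new_shard: str) -> None:
        # Copy the row to new_shard first, then flip the pointer.
        directory[key] = new_shard

    move("user:1001", "shard_c")           # rebalance without changing the key
    assert lookup_shard("user:1001") == "shard_c"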

Choosing the Shard Key

The shard key is the column (or set of columns) whose value determines which shard a row lives on. Choosing it is the most important decision in your design. A bad shard key cripples the entire system.

Good shard keys have these properties:

High cardinality (many distinct values, so data spreads evenly).
Even access distribution (no single value gets disproportionate traffic).
Aligns with your most common query patterns (so single-shard queries are common).
Stable (doesn't change over time, since changing the shard key means moving the row).

Bad shard keys:

Boolean fields (only 2 values, terrible distribution).
Status fields where 99% of rows are "active."
Auto-incrementing IDs combined with range sharding (always hits the latest shard).
Customer ID for B2B SaaS where one customer represents 90% of traffic.
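
A quick, illustrative simulation of why cardinality matters: hashing a near-boolean field piles every row onto a couple of shards, while a high-cardinality ID spreads them evenly. The numbers are synthetic.

    import hashlib
    from collections import Counter

    def shard_of(value, num_shards=8):
        digest = hashlib.md5(str(value).encode()).hexdigest()
        return int(digest, 16) % num_shards

    rows = range(100_000)
    by_flag = Counter(shard_of(uid % 2) for uid in rows)   # 2-value "key"
    by_id = Counter(shard_of(uid) for uid in rows)         # high-cardinality key

    print(sorted(by_flag.values(), reverse=True))  # two shards hold everything
    print(sorted(by_id.values(), reverse=True))    # ~12,500 rows on each of 8 shards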

The Hotspot Problem

Even with a good strategy, hotspots happen. A celebrity user has 100M followers and their tweets get queried 1000x more than average. Their shard melts under load.

Mitigations:

Replication: read replicas of hot shards distribute read load.
Caching: hot user data in Redis takes pressure off the shard (sketched after this list).
Sub-sharding: if one shard is consistently hot, split it further.
Special handling for VIPs: celebrity accounts get dedicated infrastructure outside the shard system.
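
As one example, the caching mitigation typically takes a cache-aside shape. The cache and shard_db objects below are hypothetical stand-ins for a Redis client and the shard's database driver, not any specific library's API.

    import json

    def get_user(user_id, cache, shard_db, ttl_seconds=60):
        key = f"user:{user_id}"
        cached = cache.get(key)                       # assumed get(key) -> str | None
        if cached is not None:
            return json.loads(cached)                 # hot users served from cache
        row = shard_db.query_user(user_id)            # cache miss: hit the shard
        cache.set(key, json.dumps(row), ttl_seconds)  # assumed set(key, value, ttl)
        return row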

Cross-Shard Queries Are Painful

Once data is sharded, certain queries become hard:

Joins across shards: if "users" and "orders" are sharded by different keys, joining them means scatter-gather across all shards.
Transactions across shards: two-phase commit is slow and complex. Most sharded systems avoid distributed transactions entirely.
Global ordering: "give me the 100 most recent orders" requires touching every shard.
Analytics: aggregations that span shards. Often solved by replicating data into a separate analytics database.

Design your access patterns first, then pick the shard key that makes the dominant queries single-shard. Cross-shard queries should be the exception, not the rule.
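
For the global-ordering case above, scatter-gather looks roughly like this. The shards list and its recent_orders() method are hypothetical stand-ins for per-shard connections, just to show the fan-out-and-merge shape.

    import heapq

    def most_recent_orders(shards, limit=100):
        partial = []
        for shard in shards:                             # fan out (in practice, in parallel)
            partial.extend(shard.recent_orders(limit))   # each shard returns its own top N
        # Merge across shards and keep only the newest `limit` overall.
        return heapq.nlargest(limit, partial, key=lambda o: o["created_at"])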

Resharding: The Hardest Part

Eventually you outgrow your initial shard count. Going from 4 shards to 8 means moving roughly half your data. While the system stays online. Without dropping queries.

Strategies:

Pre-sharding: create many "logical shards" (say 1024) and map them onto fewer physical machines. To scale up, reassign some logical shards to new machines; no keys ever need rehashing (see the sketch after this list).
Consistent hashing: minimizes data movement when a node is added (only roughly 1/N of the keys move).
Live migration: dual-write to old and new shards during the transition, then cut over reads. Complex but maintains uptime.
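
A minimal sketch of the pre-sharding idea: keys hash onto a fixed number of logical shards, and a separate, editable map assigns logical shards to physical machines. Scaling out only edits the map, so keys never rehash. The counts and node names are illustrative.

    import hashlib

    NUM_LOGICAL_SHARDS = 1024   # fixed forever, chosen generously up front

    def logical_shard(user_id: int) -> int:
        digest = hashlib.md5(str(user_id).encode()).hexdigest()
        return int(digest, 16) % NUM_LOGICAL_SHARDS

    # Illustrative assignment: 1024 logical shards spread over 4 physical nodes.
    physical_nodes = ["db1", "db2", "db3", "db4"]
    assignment = {ls: physical_nodes[ls % len(physical_nodes)]
                  for ls in range(NUM_LOGICAL_SHARDS)}

    def node_for(user_id: int) -> str:
        return assignment[logical_shard(user_id)]

    # To scale to 8 nodes, reassign some logical shards in `assignment` to the
    # new machines and copy their data; logical_shard() itself never changes.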

Many teams underestimate resharding. It's the kind of project that takes months. Plan your initial shard count generously to delay it.

When Not to Shard

Sharding adds enormous complexity. Only do it when you have to:

Your dataset is too big to fit on one machine.
Your write throughput exceeds what one machine can handle.
You've already exhausted vertical scaling and read replicas.

If you can solve your problem with a bigger machine, more replicas, or better caching, do that first. Sharding is a one-way door. Once you cross it, you cannot easily go back.

The One Thing to Remember

Sharding is the price you pay to scale beyond a single machine. The hard part isn't the splitting itself; it's picking a shard key that makes 95% of your queries single-shard, and having a plan for the 5% that aren't. Get the key right and sharding feels invisible. Get it wrong and you'll be debugging hot shards forever.