First, What Even Is a Cache?
Imagine you work at a library. Every time someone asks for a book, you walk all the way to the back warehouse (the database), find the book, walk back, and hand it to them. That takes time.
Now imagine you keep a small shelf right at the front desk (the cache). When someone asks for a popular book, you check the front shelf first. If it is there, great, hand it over instantly. If not, you walk to the warehouse, grab it, and also put a copy on the front shelf so next time it is faster.
That is caching. A small, fast storage layer that sits between the application and the slow database.
Now, the question is: when do you put things on that shelf, and when do you update them? That is where these 5 strategies come in.
1. Cache Aside (also called "Lazy Loading")
- Update logic lives in the application, so it is easy to implement
- Cache only contains what the application actually requests
- Each cache miss costs three trips (check cache, read DB, write cache)
- Data may be stale if the DB is updated directly
The application is in charge of everything. The cache is just a dumb shelf. It does not know anything about the database. The application manages both.
Step by step:
1. App needs some data. It checks the cache first.
2. Cache hit? Data is there. Return it. Done. Fast.
3. Cache miss? Data is NOT there. Now the app goes to the database itself, gets the data, returns it to the user, and also stores a copy in the cache so next time it will be a hit.
The key thing: the application does all the work. The cache does not talk to the database. The app talks to both, separately.
When does the cache get updated on writes?
It usually does not, directly. When someone updates data in the database, the cache still has the old version. The app typically either invalidates (deletes) the cache entry so the next read fetches fresh data, or just lets it expire after a timeout (TTL).
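The flow above can be sketched in a few lines. This is a minimal illustration, not a production pattern: the two dicts stand in for a real cache (say, Redis) and a real database, and the key names are made up for the example.

```python
# Cache-aside sketch: the APPLICATION manages both stores.
cache = {}                       # fast front shelf (stands in for Redis etc.)
db = {"user:1": {"name": "Ada"}} # slow warehouse (stands in for the database)

def get(key):
    if key in cache:             # cache hit: return immediately
        return cache[key]
    value = db.get(key)          # cache miss: the app reads the DB itself...
    if value is not None:
        cache[key] = value       # ...and populates the cache for next time
    return value

def update(key, value):
    db[key] = value              # write to the DB...
    cache.pop(key, None)         # ...and invalidate the now-stale cache entry
```

Note that `update` deletes the cache entry rather than overwriting it; the next read repopulates it with fresh data, which avoids racing two writers on the cache.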
Real-world analogy:
You are a waiter. A customer asks for a dish. You check your notepad (cache). Do you remember the recipe? No? You walk to the kitchen (database), get it, serve it, and write it down on your notepad. Next time someone asks for the same dish, you already have it written down.
2. Read Through
- Application logic stays simple
- Scales reads easily; on a miss, only one query hits the DB
- Data access logic lives in the cache, which needs a plugin or loader to reach the DB
- If the cache goes down, reads break
This looks similar to Cache Aside, but there is one crucial difference: the cache itself is responsible for loading data from the database. The application does not talk to the database at all for reads.
Step by step:
1. App needs data. It asks the cache.
2. Cache hit? Return it. Done.
3. Cache miss? The cache itself goes to the database, fetches the data, stores it, and then returns it to the application.
From the application's perspective, it only ever talks to the cache. It does not even know the database exists (for reads). The cache is like a smart middleman.
How is this different from Cache Aside?
In Cache Aside, YOUR APP fetches from the DB and writes to the cache. In Read Through, THE CACHE does that itself. Your app just says "give me this data" and the cache figures it out.
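That difference is easiest to see in code. In this sketch the cache object owns a loader function that knows how to reach the database, so the application only ever calls the cache. The class and key names are illustrative, not from any particular library.

```python
# Read-through sketch: the CACHE fetches from the DB on a miss.
db = {"user:1": {"name": "Ada"}}  # stands in for the real database

class ReadThroughCache:
    def __init__(self, loader):
        self._store = {}
        self._loader = loader     # the "plugin" that knows how to query the DB

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)  # cache loads it itself
        return self._store[key]

# The app wires in the loader once, then never touches db directly.
cache = ReadThroughCache(loader=lambda key: db.get(key))
```

Compare with the cache-aside version: the miss-handling logic moved out of the application and into the cache, which is the whole point of read-through.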
Real-world analogy:
Instead of you (the waiter) going to the kitchen yourself, you have a smart assistant standing between you and the kitchen. You ask the assistant for a dish. If the assistant has it memorized, they tell you immediately. If not, THEY go to the kitchen, learn the recipe, memorize it, and tell you. You never step foot in the kitchen.
3. Write Around
- The DB is the source of truth
- Lower read latency for cached data
- Higher write latency, data is written to DB first
- Data in the cache may be stale
This one is about what happens when you write (create or update) data.
Step by step:
1. App needs to write data. It writes directly to the database, completely bypassing the cache. The cache is not updated.
2. App needs to read data. It checks the cache.
3. Cache hit? Return it.
4. Cache miss? Read from DB, populate the cache.
So writes skip the cache entirely. The cache only gets populated through reads.
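As a sketch, write-around is just cache-aside reads plus writes that ignore the cache. Again the dicts are stand-ins and the key names are made up.

```python
# Write-around sketch: writes bypass the cache; only reads populate it.
cache = {}
db = {}

def write(key, value):
    db[key] = value              # straight to the DB; cache is untouched

def read(key):
    if key in cache:
        return cache[key]
    value = db.get(key)          # miss: read the DB...
    if value is not None:
        cache[key] = value       # ...and populate the cache only now
    return value
```

So the cache never fills up with write-once data; an entry earns its shelf space only when somebody actually reads it.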
When does this make sense?
Think of a logging system. You are writing millions of log entries per second, but you rarely read them. Why fill the cache with log entries nobody will ever look at? Write them straight to the database. If someone does search for a specific log, the read will populate the cache then.
Real-world analogy:
You are organizing a warehouse. When new inventory arrives, you put it straight into the warehouse (database) without bothering to update the front shelf (cache). If someone comes asking for that item, you notice it is not on the shelf, walk to the warehouse, grab it, and NOW put it on the shelf. You only stock the shelf with things people actually ask for.
4. Write Back (also called "Write Behind")
- Lower write latency
- Lower read latency
- Cache and DB are only eventually consistent
- Data loss if the cache goes down before flushing
- Infrequently read data also ends up in the cache
This is the opposite philosophy of Write Around. Here, writes go to the cache first, and the cache writes to the database later, asynchronously, in batches.
Step by step:
1. App needs to write data. It writes to the cache only. Done. The app moves on.
2. The cache, in the background, collects these writes and flushes them to the database periodically. Maybe every few seconds, or when a batch reaches a certain size.
3. Reads always hit the cache first (which has the freshest data since writes go there first).
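A toy version of this makes the tradeoff concrete. In a real system the flush runs asynchronously on a timer or batch-size trigger; here it is a plain function called by hand so the sketch stays self-contained.

```python
# Write-back sketch: writes land in memory, the DB is updated later in a batch.
cache = {}
db = {}
dirty = {}                       # writes accepted but not yet persisted

def write(key, value):
    cache[key] = value           # fast: memory only, app moves on
    dirty[key] = value           # remember to persist this later

def flush():
    db.update(dirty)             # one batch write covers many app writes
    dirty.clear()
# If the process dies before flush(), everything in `dirty` is gone.
```

Notice that repeated writes to the same key collapse into a single DB write at flush time; that write coalescing is a big part of why write-back is fast.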
What is the downside?
Data loss risk. This is the big one. If the cache crashes BEFORE it has flushed writes to the database, those writes are GONE. They only existed in memory. This is a real and serious risk.
When does this make sense?
High-throughput systems where write speed is critical and you can tolerate some risk of data loss. Think of a gaming leaderboard. Scores update constantly, and it is okay if you lose a few seconds of updates in a rare crash.
Real-world analogy:
You are a cashier. Instead of walking to the safe (database) every time you receive money, you stuff bills into your register drawer (cache). At the end of every hour, you take everything from the drawer and put it in the safe. Super fast during the day. But if someone robs you before you make that trip to the safe, that money is gone.
5. Write Through
- Reads have lower latency
- Cache and DB are always in sync
- No risk of data loss
- Writes have higher latency, must wait for DB
- Infrequently read data also ends up in the cache
Writes go to the cache AND the database at the same time (or more precisely, the cache writes to the DB immediately, synchronously, before confirming the write is done).
Step by step:
1. App writes data to the cache.
2. The cache immediately writes that same data to the database. It waits for the DB to confirm.
3. Only then does the cache confirm to the app: "Write complete."
4. Reads hit the cache, which always has the freshest data.
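The same toy setup shows the contrast with write-back: here the DB write happens synchronously inside the write call, before the caller gets an acknowledgment. The names are illustrative only.

```python
# Write-through sketch: cache and DB are updated in the same synchronous call.
cache = {}
db = {}

def write(key, value):
    cache[key] = value
    db[key] = value              # synchronous: we wait for the DB too
    return "write complete"      # confirmed only after BOTH succeed

def read(key):
    return cache.get(key)        # cache always holds the freshest data
```

Every write now pays full database latency, which is exactly the cost the pros/cons list above warns about, in exchange for the two stores never disagreeing.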
When does this make sense?
Systems where data consistency is non-negotiable and you can afford slower writes. Banking transactions, inventory systems, anything where "the cache says X but the database says Y" would be a disaster.
Real-world analogy:
You are a secretary taking notes. Every time your boss tells you something, you write it on your notepad (cache) AND immediately file a copy in the filing cabinet (database) before telling your boss "got it." It takes longer each time, but if anyone ever checks the filing cabinet, it is guaranteed to match your notepad exactly.
Quick Comparison
| Strategy | Writes go to | Reads come from | Speed | Consistency | Data loss risk |
|---|---|---|---|---|---|
| Cache Aside | DB (app manages cache separately) | Cache, then DB on miss | Medium | Can be stale | Low |
| Read Through | DB (cache manages reads) | Cache (cache fetches on miss) | Medium | Can be stale | Low |
| Write Around | DB only (skip cache) | Cache, then DB on miss | Slow first read after a write | Can be stale | Low |
| Write Back | Cache first, DB later | Cache | Fastest | Eventually consistent | High |
| Write Through | Cache and DB simultaneously | Cache | Slowest writes | Always consistent | None |
The One Thing to Remember
There is no "best" strategy. Each one is a tradeoff between speed, consistency, and complexity. The right choice depends on your system:
Read-heavy, tolerates stale data? Cache Aside or Read Through
Write-heavy, rarely read? Write Around
Need blazing fast writes, can tolerate rare data loss? Write Back
Consistency is sacred, writes can be slow? Write Through
Most real-world systems combine multiple strategies for different types of data within the same application.