Last modified: March 23, 2026
Caching is a technique used to speed up data retrieval by placing frequently accessed or computationally heavy information closer to the application or the end user. The notes below cover the main caching patterns, write policies, eviction strategies, and failure modes, illustrated with ASCII diagrams.
Request Flow (Cache Hit vs. Cache Miss)
+-----------+                 +------------------+                 +------------------+
|           |   1. Request    |      Cache       |   3. Fetch      |    Data Store    |
|  Client   +---------------->|  (Fast Access)   +---------------->|   (Disk / DB)    |
|           |                 |  +-----------+   |    on miss      |  +------------+  |
|           |  2a. Cache hit  |  | Key: Val  |   |                 |  | Tables     |  |
|           |<----------------+  | Key: Val  |   |   4. Return     |  | Rows       |  |
+-----------+                 |  +-----------+   |<----------------+  | Docs       |  |
                              |                  |                 |  +------------+  |
                              |  5. Store result |                 |                  |
                              +------------------+                 +------------------+
Modern computing stacks use multiple caches at different layers, each addressing a specific scope and performance requirement.
The cache-aside pattern is one of the most widely used strategies, where the application code manages reads and writes to the cache explicitly.
Cache-Aside Read Flow
+-------------+   1. Get    +--------------+           +----------------+
|             +------------>|              |           |                |
| Application |             |    Cache     |           |   Data Store   |
|    Code     |    HIT?     |              |           |                |
|             |<------------+              |           |                |
|             |             +-----------+--+           +--------+-------+
|             |  Yes: return value      ^                       ^
|             |                         |                       |
|             |  No:                    |                       |
|             +-- 2. Query DB ----------|---------------------->+
|             |<-- 3. Return result ----|-----------------------+
|             +-- 4. Store in cache --->+
+-------------+
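The read flow above translates directly into code. Below is a minimal sketch of the cache-aside pattern with TTL expiry; the `CacheAside` class name is illustrative, and a plain dict stands in for the data store (any object with a `.get(key)` method would do):

```python
import time

class CacheAside:
    """Cache-aside: the application checks the cache first and, on a miss,
    queries the backing store and populates the cache itself."""

    def __init__(self, backing_store, ttl_seconds=60):
        self._store = backing_store          # any object with a .get(key) method
        self._ttl = ttl_seconds
        self._cache = {}                     # key -> (value, expiry_timestamp)

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.monotonic() < expires_at:
                return value                 # 1. cache hit
            del self._cache[key]             # expired entry: treat as a miss
        value = self._store.get(key)         # 2-3. query the data store
        self._cache[key] = (value, time.monotonic() + self._ttl)  # 4. store
        return value
```

Note that once a value is cached, changes in the backing store are invisible until the TTL expires, which is exactly the staleness trade-off discussed later under consistency.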
Different write policies determine how the cache interacts with the underlying data store during write operations.
Write-Through                     Write-Back (Write-Behind)
     App                               App
      |                                 |
      | 1. Write                        | 1. Write
      v                                 v
    Cache --- 2. Write ---> DB        Cache
      |     (synchronous)               | 2. Ack to app
      | 3. Ack                          |    (immediate)
      v                                 |
     App                                +-- 3. Async flush ---> DB
                                             (batched / delayed)
* Consistent                       * Fastest writes
* Higher write latency             * Risk of data loss if the
                                     cache node crashes

Write-Around
     App
      |
      | 1. Write (bypasses the cache entirely)
      v
      DB
      |
      | 2. Ack
      v
     App
* Avoids cache pollution
* Cache miss on next read
| Policy        | Write Latency       | Read After Write           | Consistency | Data Loss Risk          |
|---------------|---------------------|----------------------------|-------------|-------------------------|
| Write-Through | Higher (two writes) | Always a cache hit         | Strong      | Very low                |
| Write-Around  | Lower (one write)   | Cache miss until next read | Eventual    | Very low                |
| Write-Back    | Lowest (cache only) | Always a cache hit         | Eventual    | Higher if cache crashes |
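The table's trade-offs can be seen in a few lines of code. A minimal sketch contrasting the two cache-writing policies (class names are hypothetical; plain dicts stand in for the cache and the data store):

```python
class WriteThroughCache:
    """Write-through: every write goes to the cache AND the store synchronously,
    so the store is never stale but each write pays for both updates."""
    def __init__(self, store):
        self.store, self.cache = store, {}

    def put(self, key, value):
        self.cache[key] = value
        self.store[key] = value              # synchronous second write

class WriteBackCache:
    """Write-back: writes land only in the cache and are flushed to the store
    later; the app gets an immediate ack, but unflushed keys are at risk."""
    def __init__(self, store):
        self.store, self.cache = store, {}
        self.dirty = set()                   # keys not yet persisted

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)                  # acknowledged before reaching the store

    def flush(self):
        for key in self.dirty:               # batched / delayed persistence
            self.store[key] = self.cache[key]
        self.dirty.clear()
```

If the write-back cache is lost before `flush()` runs, every key still in `dirty` is gone, which is the "higher data loss risk" row in the table.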
When the cache is full, an eviction policy determines which items to discard so that new items can be stored.
Eviction Policy Comparison
FIFO (First In, First Out)              LRU (Least Recently Used)

+---+---+---+---+---+                   +---+---+---+---+---+
| A | B | C | D | E |  <- Full          | A | B | C | D | E |  <- Full
+---+---+---+---+---+                   +---+---+---+---+---+
  ^                                       ^
  | Evict A (oldest arrival)              | Evict A (longest since
  |   regardless of access                |   last access)
  +-- Insert F here                       +-- Insert F here

LFU (Least Frequently Used)             Random Replacement

+---+---+---+---+---+                   +---+---+---+---+---+
| A | B | C | D | E |  <- Full          | A | B | C | D | E |  <- Full
+---+---+---+---+---+                   +---+---+---+---+---+
      ^                                           ^
      | Evict B (fewest total                     | Evict C (chosen
      |   accesses over lifetime)                 |   at random)
      +-- Insert F here                           +-- Insert F here
| Policy | Tracking Overhead           | Best For                        | Weakness                           |
|--------|-----------------------------|---------------------------------|------------------------------------|
| FIFO   | Minimal (insertion order)   | Uniform access patterns         | Ignores access frequency           |
| LRU    | Moderate (last access time) | Temporally clustered reads      | Scan pollution from one-time reads |
| LFU    | Higher (access counters)    | Skewed popularity distributions | Stale popular items linger         |
| Random | None                        | Simple implementations          | No adaptation to workload          |
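LRU is the most commonly implemented of these policies, and Python's `collections.OrderedDict` makes the bookkeeping almost free. A minimal sketch (the `LRUCache` class is illustrative; for function memoization the standard library already provides `functools.lru_cache`):

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: OrderedDict keeps keys in access order, so the least
    recently used key is always at the front and cheap to evict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)           # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)    # evict the least recently used key
```

Both `get` and `put` are O(1); LFU would additionally need per-key counters, which is the "higher tracking overhead" row in the table.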
A cache stampede occurs when a frequently accessed cache entry expires and many concurrent requests simultaneously attempt to regenerate it, overwhelming the data store.
Cache Stampede Scenario                       Mitigation: Locking / Lease

             TTL expires
Request 1 --+    |                            Req 1 --> MISS --> Acquire lock --> Query DB
Request 2 --+    v                                                                       |
Request 3 --+  +---------+    +------------+  Req 2 --> MISS --> Lock held, wait         |
Request 4 --+->|  Cache  +--->| Data Store |  Req 3 --> MISS --> Lock held, wait         |
    ...     |  |  MISS!  |    | Overloaded |                                             v
Request N --+  +---------+    +------------+  Cache now populated <----------------------+
                                              Req 2 --> HIT (served from cache)
                                              Req 3 --> HIT (served from cache)
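The locking mitigation shown on the right can be implemented with one lock per key plus a double-check after acquiring it. A minimal in-process sketch (the `StampedeGuard` name and `regen_calls` counter are illustrative; a distributed setup would use a lock in the shared cache instead):

```python
import threading

class StampedeGuard:
    """Per-key locking: only one thread regenerates a missing entry;
    the rest block on the lock and then read the freshly stored value."""

    def __init__(self, regenerate):
        self._regenerate = regenerate        # the expensive call, e.g. a DB query
        self._cache = {}
        self._locks = {}
        self._meta = threading.Lock()        # guards the lock table itself
        self.regen_calls = 0                 # for illustration: counts store hits

    def _lock_for(self, key):
        with self._meta:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self._cache.get(key)
        if value is not None:
            return value
        with self._lock_for(key):            # serialize regeneration per key
            value = self._cache.get(key)     # double-check: another thread may
            if value is None:                # have filled it while we waited
                self.regen_calls += 1
                value = self._regenerate(key)
                self._cache[key] = value
            return value
```

However many threads miss simultaneously, the double-check guarantees the backing store is queried once per expiry, not once per request.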
Cache warming is the practice of pre-populating the cache before it begins serving live traffic, preventing a flood of cold misses on startup.
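In its simplest form, warming is just a loop over the keys expected to be hottest, run before the node takes traffic. A minimal sketch (the function name is illustrative; in practice the hot-key list often comes from recent access logs):

```python
def warm_cache(cache, store, hot_keys):
    """Pre-load the expected hot keys from the store into the cache
    so the first live requests hit instead of missing."""
    for key in hot_keys:
        cache[key] = store[key]
    return len(hot_keys)                     # number of entries pre-loaded
```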
When caches span multiple nodes, additional challenges arise around partitioning, replication, and network overhead.
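A common answer to the partitioning problem is consistent hashing: keys map to the nearest node clockwise on a hash ring, so adding or removing a node remaps only a fraction of the keys rather than reshuffling everything. A minimal sketch (node names and the virtual-node count are illustrative; MD5 is used here only as a convenient uniform hash, not for security):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hashing: each node is placed at many points ('virtual
    nodes') on a ring; a key belongs to the first node at or after its hash."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):          # virtual nodes smooth the distribution
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

With plain modulo hashing (`hash(key) % n_nodes`), changing the node count remaps almost every key; on the ring, only the keys between a removed node's points and its successors move.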
Ensuring that the cache reflects changes in the underlying data store can be one of the most difficult aspects of caching.
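A widely used discipline for this is invalidate-on-write: update the store first, then delete (rather than overwrite) the cache entry, so the next read repopulates it via the normal cache-aside path. A minimal sketch (the `update_user` function and key names are hypothetical):

```python
def update_user(db, cache, user_id, new_name):
    """Write to the store, then invalidate the cache entry.
    Deleting instead of overwriting avoids caching a value that a
    concurrent writer may already have made stale."""
    db[user_id] = new_name
    cache.pop(user_id, None)                 # next read falls through and repopulates
```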
Some architectures employ multiple cache layers, each targeting different bottlenecks or data usage patterns.
+-------------------+
|      Client       |
|  (Browser Cache)  |
+---------+---------+
          |
          v
+---------+---------+          Serves static assets
|  Reverse Proxy /  +------->  from edge locations
|        CDN        |
+---------+---------+
          |
          v
+---------+---------+          Local in-process cache
| Application Server+------->  (HashMap, Guava, etc.)
+---------+---------+
          |
          v
+---------+---------+          Redis / Memcached
| Distributed Cache +------->  shared cluster
+---------+---------+
          |
          v
+-------------------+
|    Database or    |
|  Persistent Store |
+-------------------+
Effective caching strategies rely on continuous monitoring and tuning based on real-world usage patterns.
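The single most important number to monitor is the hit rate. A minimal instrumentation sketch (the `InstrumentedCache` class is illustrative; production systems would export these counters to a metrics system instead of computing them inline):

```python
class InstrumentedCache:
    """Count hits and misses so the hit rate can be tracked over time
    and the cache size / TTL tuned against real traffic."""

    def __init__(self):
        self.data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.data:
            self.hits += 1
            return self.data[key]
        self.misses += 1
        return None

    def put(self, key, value):
        self.data[key] = value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0   # fraction of reads served from cache
```

A falling hit rate after a deploy or traffic shift is usually the first visible symptom that the cache size, TTL, or key distribution needs retuning.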