Last modified: March 23, 2026


Caching

Caching is a technique used to speed up data retrieval by placing frequently accessed or computationally expensive information closer to the application or the end user. The notes below cover the main caching patterns, write and eviction policies, and operational concerns, illustrated with ASCII diagrams.

Request Flow (Cache Hit vs. Cache Miss)

  +-----------+                  +------------------+                +------------------+
  |           |  1. Request      |                  |                |                  |
  |  Client   +----------------->+      Cache       |                |   Data Store     |
  |           |                  |   (Fast Access)  |                |   (Disk / DB)    |
  |           |  2a. Cache Hit   |                  |                |                  |
  |           |<-----------------+  +-----------+   |  3. Fetch      |  +------------+  |
  |           |                  |  | Key: Val  |   |  on Miss       |  | Tables     |  |
  +-----------+                  |  | Key: Val  |   +--------------->+  | Rows       |  |
                                 |  +-----------+   |  4. Return     |  | Docs       |  |
                                 |                  |<---------------+  +------------+  |
                                 |  5. Store result |                |                  |
                                 +------------------+                +------------------+
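
The numbered steps above can be sketched in a few lines, using a plain dict as the cache and a hypothetical `fetch_from_store()` standing in for the data store:

```python
# Minimal sketch of the request flow above, assuming a dict-based cache
# and a hypothetical fetch_from_store() standing in for the database.
cache = {}

def fetch_from_store(key):
    # Stand-in for a real database query.
    return f"value-for-{key}"

def get(key):
    if key in cache:               # 2a. cache hit: return immediately
        return cache[key]
    value = fetch_from_store(key)  # 3-4. fetch from the data store on a miss
    cache[key] = value             # 5. store the result for future requests
    return value
```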

Types of Cache

Modern computing stacks use multiple caches at different layers, each addressing a specific scope and performance requirement.

Cache-Aside (Lazy Loading) Pattern

The cache-aside pattern is one of the most widely used strategies, where the application code manages reads and writes to the cache explicitly.

Cache-Aside Read Flow

  +-------------+        +--------------+        +----------------+
  |             | 1. Get |              |        |                |
  | Application +------->+    Cache     |        |   Data Store   |
  |   Code      |        |              |        |                |
  |             |  HIT?  |              |        |                |
  |             |<-------+              |        |                |
  +------+------+        +-------^------+        +---^--------+---+
         |                       |                   |        |
         | Yes: return value     |                   |        |
         |                       |                   |        |
         | No:                   |                   |        |
         +--- 2. Query DB --------|------------------>+        |
         |<--- 3. Return result -|----------------------------+
         +--- 4. Store in cache -+
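
A minimal sketch of cache-aside reads and writes, assuming a plain dict `db` stands in for the data store and cached entries carry a TTL:

```python
import time

# Sketch of the cache-aside pattern: the application code manages the
# cache explicitly. The db dict is a hypothetical stand-in for the store.
db = {"user:1": "Alice"}
cache = {}  # key -> (value, expires_at)
TTL_SECONDS = 300

def read(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():           # hit and not expired
        return entry[0]
    value = db.get(key)                            # miss: query the data store
    cache[key] = (value, time.time() + TTL_SECONDS)
    return value

def write(key, value):
    db[key] = value                                # write to the store first
    cache.pop(key, None)                           # then invalidate the cached copy
```

On the next read after a write, the stale entry is gone, so the fresh value is loaded from the store and re-cached.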

Cache Write Policies

Different write policies determine how the cache interacts with the underlying data store during write operations.

Write-Through                  Write-Around                Write-Back (Write-Behind)

  App                            App                         App
   |                              |                           |
   | 1. Write                     | 1. Write                  | 1. Write
   v                              v                           v
  Cache -- 2. Write --> DB        +-- 2. Write --> DB        Cache -- 2. Ack to app
   |       (synchronous)          |   (bypasses cache)        |       (immediate)
   | 3. Ack                       | 3. Ack                    |
   v                              v                           +-- 3. Async flush --> DB
  App                            App                              (batched / delayed)

  * Consistent                  * Avoids cache pollution    * Fastest writes
  * Higher write latency        * Cache miss on next read   * Risk of data loss

| Policy        | Write Latency       | Read After Write           | Consistency | Data Loss Risk          |
|---------------|---------------------|----------------------------|-------------|-------------------------|
| Write-Through | Higher (two writes) | Always a cache hit         | Strong      | Very low                |
| Write-Around  | Lower (one write)   | Cache miss until next read | Eventual    | Very low                |
| Write-Back    | Lowest (cache only) | Always a cache hit         | Eventual    | Higher if cache crashes |
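
The write-back column can be sketched as follows, with plain dicts standing in for the cache and the backing store; writes are acknowledged immediately and persisted later by a batched `flush()`:

```python
# Sketch of a write-back (write-behind) cache. Writes go to the cache
# only and are marked dirty; flush() later persists them in a batch.
# store and cache are plain dicts standing in for real systems.
store = {}
cache = {}
dirty = set()

def write(key, value):
    cache[key] = value   # 1-2. write to cache and ack immediately
    dirty.add(key)

def flush():
    # 3. async / batched flush of dirty entries to the backing store
    for key in list(dirty):
        store[key] = cache[key]
    dirty.clear()
```

Until `flush()` runs, the store lags behind the cache, which is exactly the data-loss window the table describes.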

Cache Eviction Policies

When the cache is full, an eviction policy determines which items to discard so that new items can be stored.

Eviction Policy Comparison

  FIFO (First In, First Out)           LRU (Least Recently Used)
  +---+---+---+---+---+               +---+---+---+---+---+
  | A | B | C | D | E |  <- Full      | A | B | C | D | E |  <- Full
  +---+---+---+---+---+               +---+---+---+---+---+
    ^                                    ^
    |  Evict A (oldest arrival)          |  Evict A (longest since
    |  regardless of access              |  last access)
    +-- Insert F here                    +-- Insert F here

  LFU (Least Frequently Used)          Random Replacement
  +---+---+---+---+---+               +---+---+---+---+---+
  | A | B | C | D | E |  <- Full      | A | B | C | D | E |  <- Full
  +---+---+---+---+---+               +---+---+---+---+---+
        ^                                       ^
        |  Evict B (fewest total                |  Evict C (chosen
        |  accesses over lifetime)              |  at random)
        +-- Insert F here                       +-- Insert F here

| Policy | Tracking Overhead           | Best For                        | Weakness                           |
|--------|-----------------------------|---------------------------------|------------------------------------|
| FIFO   | Minimal (insertion order)   | Uniform access patterns         | Ignores access frequency           |
| LRU    | Moderate (last access time) | Temporally clustered reads      | Scan pollution from one-time reads |
| LFU    | Higher (access counters)    | Skewed popularity distributions | Stale popular items linger         |
| Random | None                        | Simple implementations          | No adaptation to workload          |
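
An LRU cache, for example, can be sketched with Python's `OrderedDict`, which keeps insertion order and lets a key be promoted to "most recently used" on access:

```python
from collections import OrderedDict

# Minimal LRU cache sketch: move_to_end() promotes a key on access,
# popitem(last=False) evicts the least recently used entry.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
```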

Cache Stampede and Thundering Herd

A cache stampede occurs when a frequently accessed cache entry expires and many concurrent requests simultaneously attempt to regenerate it, overwhelming the data store.

Cache Stampede Scenario                      Mitigation: Locking / Lease

                TTL expires
  Request 1 --+     |                        Req 1 --> MISS --> Acquire lock --> Query DB
  Request 2 --+     v                                                               |
  Request 3 --+  +---------+   +----------+  Req 2 --> MISS --> Lock held, wait     |
  Request 4 --+->|  Cache  +-->|Data Store|  Req 3 --> MISS --> Lock held, wait     |
  ...         |  |  MISS!  |   |Overloaded|                                         v
  Request N --+  +---------+   +----------+  Cache now populated <------------------+
                                             Req 2 --> HIT (served from cache)
                                             Req 3 --> HIT (served from cache)
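
The locking mitigation can be sketched like this, assuming a hypothetical `rebuild()` stands in for the expensive query; only the first thread to miss regenerates the entry, and waiters re-check the cache after acquiring the lock:

```python
import threading

# Sketch of stampede mitigation via locking: one thread rebuilds the
# entry while the rest block, then everyone is served from the cache.
# rebuild() is a hypothetical stand-in for the expensive DB query.
cache = {}
lock = threading.Lock()

def rebuild(key):
    return f"fresh-{key}"

def get(key):
    value = cache.get(key)
    if value is not None:
        return value
    with lock:                   # serialize regeneration
        value = cache.get(key)   # re-check: another thread may have won
        if value is None:
            value = rebuild(key)
            cache[key] = value
    return value
```

In a distributed setup the same idea is implemented with a shared lock or lease (e.g. an atomic set-if-absent in the cache itself) rather than an in-process mutex.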

Cache Warming

Cache warming is the practice of pre-populating the cache before it begins serving live traffic, preventing a flood of cold misses on startup.
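
A minimal warming sketch, assuming a hypothetical `load()` and a known list of hot keys:

```python
# Sketch of cache warming: before serving live traffic, preload the
# cache with the hottest keys. load() and the key list are illustrative.
cache = {}

def load(key):
    # Stand-in for fetching the value from the data store.
    return f"value-for-{key}"

def warm(top_keys):
    for key in top_keys:
        cache[key] = load(key)   # populate before accepting requests

warm(["home", "pricing", "login"])
```

The list of hot keys typically comes from access logs or the previous process's cache contents.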

Distributed Cache Considerations

When caches span multiple nodes, additional challenges arise around partitioning, replication, and network overhead.
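
One common partitioning approach is a consistent-hash ring, sketched below; with it, adding or removing a node remaps only a fraction of keys instead of reshuffling everything. Node names and the replica count here are illustrative:

```python
import hashlib
from bisect import bisect

# Sketch of key partitioning across cache nodes with a consistent-hash
# ring. Each node is placed on the ring many times (virtual nodes) to
# even out the key distribution.
def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, replicas=100):
        self.ring = sorted((_hash(f"{node}:{i}"), node)
                           for node in nodes for i in range(replicas))
        self.hashes = [h for h, _ in self.ring]

    def node_for(self, key):
        # First ring position clockwise from the key's hash owns the key.
        idx = bisect(self.hashes, _hash(key)) % len(self.ring)
        return self.ring[idx][1]
```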

Cache Invalidation and Consistency

Ensuring that the cache reflects changes in the underlying data store can be one of the most difficult aspects of caching.
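
One invalidation tactic is key versioning, sketched below: instead of deleting entries, bumping a version number makes stale ones unreachable. The `loader` callback is a hypothetical stand-in for the data store:

```python
# Sketch of invalidation by key versioning: cached values live under
# "key:vN", so incrementing N orphans the old entry without a delete.
cache = {}
versions = {}

def versioned_key(key):
    return f"{key}:v{versions.get(key, 0)}"

def read(key, loader):
    vk = versioned_key(key)
    if vk not in cache:
        cache[vk] = loader(key)
    return cache[vk]

def invalidate(key):
    versions[key] = versions.get(key, 0) + 1  # old entry is now orphaned
```

Orphaned entries are eventually reclaimed by TTL expiry or the eviction policy.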

Multi-Layer Caching

Some architectures employ multiple cache layers, each targeting different bottlenecks or data usage patterns.

  +-------------------+
  |      Client       |
  |  (Browser Cache)  |
  +--------+----------+
           |
           v
  +-------------------+          Serves static assets
  |  Reverse Proxy /  +-------->  from edge locations
  |       CDN         |
  +--------+----------+
           |
           v
  +-------------------+          Local in-process cache
  | Application Server+-------->  (HashMap, Guava, etc.)
  +--------+----------+
           |
           v
  +-------------------+          Redis / Memcached
  | Distributed Cache +-------->  shared cluster
  +--------+----------+
           |
           v
  +-------------------+
  |  Database or      |
  |  Persistent Store |
  +-------------------+
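
The layered lookup can be sketched as a waterfall, with plain dicts standing in for the in-process cache, the shared cache, and the database:

```python
# Sketch of a multi-layer lookup matching the diagram above: a small
# in-process cache (l1) in front of a shared cache (l2), then the DB.
# l2 and database are dicts standing in for Redis and the real store.
l1 = {}                                      # per-process cache
l2 = {"user:1": "Alice"}                     # shared distributed cache
database = {"user:1": "Alice", "user:2": "Bob"}

def get(key):
    if key in l1:          # L1: fastest, no network hop
        return l1[key]
    if key in l2:          # L2: one network hop, shared across servers
        l1[key] = l2[key]  # promote into L1
        return l1[key]
    value = database[key]  # last resort: the persistent store
    l2[key] = value        # backfill both cache layers
    l1[key] = value
    return value
```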

Monitoring and Metrics

Effective caching strategies rely on continuous monitoring and tuning based on real-world usage patterns.
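
The most basic metric is the hit ratio, sketched here as counters tracked alongside a toy cache:

```python
# Sketch of cache instrumentation: hit ratio = hits / (hits + misses),
# the first number to watch when tuning size, TTLs, or eviction policy.
class InstrumentedCache:
    def __init__(self):
        self.data, self.hits, self.misses = {}, 0, 0

    def get(self, key):
        if key in self.data:
            self.hits += 1
            return self.data[key]
        self.misses += 1
        return None

    def put(self, key, value):
        self.data[key] = value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In production these counters are usually exported to a metrics system rather than read in-process, but the ratio is computed the same way.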