Last modified: May 06, 2025

This article is written in: 🇺🇸

Consistent Hashing

Imagine you're organizing books in a vast library with shelves arranged in a circle. Each book’s position is chosen by the first letter of its title, looping back to the beginning after Z. When you install a new shelf or remove one, you’d prefer not to reshuffle every book—only a small, predictable subset should move. Consistent hashing gives distributed computer systems that same flexibility: data spreads evenly across servers, yet when servers come or go only a fraction of keys are remapped.

After working through this material you should be able to answer:

  1. What is consistent hashing, and how does it differ from traditional hashing?
  2. How does the hash‑ring abstraction work, and how are nodes and data mapped onto it?
  3. Why does adding or removing a node trigger only limited data movement?
  4. How do virtual nodes (VNodes) improve load‑balancing and scalability?
  5. Where is consistent hashing used in the real world, and what pitfalls can appear in practice?

The Hash Ring Concept

Consistent hashing imagines the entire hash space as a circle.
A typical implementation uses a 32-bit hash (0 → 2321); after the last value the count “wraps” back to 0, so the ends meet like the edges of a clock dial.

Hash-ring overview

Hash Ring:
                         0°   (hash 0  or 2³²-1)
                         ●
                         │
                         │
   270° (¾·2³²)     ●────┼────●  90° (¼·2³²)
                         │
                         │
                         ●
                       180° (½·2³²)

Mapping Nodes and Data onto the Ring

Assume three servers —Node A, Node B and Node C.

Placing nodes

ring_nodes

Node Hash angle Covers hash interval*
Node A (240° → 0°]
Node B 120° (0° → 120°]
Node C 240° (120° → 240°]

Interval is open at the start and closed at the end, so each key maps to exactly one node.

Placing data keys

Key Hash angle Stored on…
K1 100° Node B
K2 200° Node C
K3 330° Node A (0°) – wraps past 360°

ring_nodes_data

Adding and Removing Nodes

A major advantage of consistent hashing is that only the keys that lie in the affected arc move when the node set changes.

Adding Node D at 80°

ring_add_node_d

Removing Node B

ring_remove_node_b

Virtual Nodes (VNodes)

To smooth out load in clusters with uneven node counts or heterogeneous hardware, each physical server is hashed many times, producing virtual nodes scattered around the ring. Requests are balanced across those VNodes, so even if one physical machine is temporarily overloaded only a small slice of the hash‑space suffers.

Practical Example: Distributed Caching with Consistent Hashing

Consider a web application using a distributed cache to store session data. Let's see how consistent hashing helps in this scenario.

Without Consistent Hashing

With Consistent Hashing

Virtual Nodes (VNodes)

To enhance data distribution and fault tolerance, consistent hashing often uses virtual nodes.

What Are Virtual Nodes?

Benefits of Virtual Nodes

Implementing Consistent Hashing

Let's walk through how you might implement consistent hashing in practice.

Step 1: Hash Function Selection

Choose a hash function that distributes values uniformly, such as MD5 or SHA-1.

Step 2: Mapping Nodes to the Ring

Assign each node (or virtual node) a hash value to determine its position on the ring.

Step 3: Mapping Keys to Nodes

For each data key:

  1. Compute its hash value.
  2. Locate the first node on the ring whose hash value is greater than or equal to the key's hash.
  3. If no such node exists (the key's hash is higher than any node's hash), wrap around to the first node on the ring.

Sample Code Snippet (Pseudocode)

# Import a reliable hash function
import hashlib

# Function to compute hash value
def compute_hash(key):
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

# Nodes in the system
nodes = ['NodeA', 'NodeB', 'NodeC']

# Create the ring with virtual nodes
ring = {}
virtual_nodes = 100  # Number of virtual nodes per physical node

for node in nodes:
    for i in range(virtual_nodes):
        vnode_key = f"{node}:{i}"
        hash_val = compute_hash(vnode_key)
        ring[hash_val] = node

# Sort the hash ring
sorted_hashes = sorted(ring.keys())

# Function to find the node for a given key
def get_node(key):
    hash_val = compute_hash(key)
    for node_hash in sorted_hashes:
        if hash_val <= node_hash:
            return ring[node_hash]
    return ring[sorted_hashes[0]]  # Wrap around

# Example usage
key = 'my_data_key'
assigned_node = get_node(key)
print(f"Key '{key}' is assigned to {assigned_node}")

Real‑World Uses & Challenges

System Why uses consistent hashing Notable wrinkles
Amazon Dynamo / DynamoDB On‑line shopping carts must survive node loss during Black Friday Requires anti‑entropy sync to handle “sloppy” quorums
Apache Cassandra Ring topology underpins token ranges for partitions Hot partitions can appear if token assignment is careless
Memcached client libraries Keeps cache‑hit rate high during live scaling Need client‑side agreement on hash function
CDNs (e.g., Akamai) Predictable client → edge‑server mapping Must integrate geo‑routing and health‑checks

Implementation challenges include choosing a non‑biased hash function, deciding how many VNodes per server, and coordinating ring state changes across clients.

Advantages of Consistent Hashing

Challenges and Considerations

Best Practices

Table of Contents

    Consistent Hashing
    1. The Hash Ring Concept
      1. Hash-ring overview
    2. Mapping Nodes and Data onto the Ring
      1. Placing nodes
      2. Placing data keys
    3. Adding and Removing Nodes
      1. Adding Node D at 80°
      2. Removing Node B
    4. Virtual Nodes (VNodes)
    5. Practical Example: Distributed Caching with Consistent Hashing
      1. Without Consistent Hashing
      2. With Consistent Hashing
    6. Virtual Nodes (VNodes)
      1. What Are Virtual Nodes?
      2. Benefits of Virtual Nodes
    7. Implementing Consistent Hashing
      1. Step 1: Hash Function Selection
      2. Step 2: Mapping Nodes to the Ring
      3. Step 3: Mapping Keys to Nodes
      4. Sample Code Snippet (Pseudocode)
    8. Real‑World Uses & Challenges
    9. Advantages of Consistent Hashing
    10. Challenges and Considerations
    11. Best Practices