Consistent Hashing: Data Distribution Across Systems

Learn how consistent hashing works in caches, databases, and CDNs, including hash rings, virtual nodes, node additions, and redistribution strategies.


Consistent Hashing: Distributing Data Across Distributed Systems

Regular hashing breaks when nodes join or leave. With simple modulo hashing, adding a server remaps nearly every key. Consistent hashing solves this. When nodes join or leave, only neighboring keys remap.

I first ran into consistent hashing in Memcached clients. The docs warned against using regular hash % server_count distribution. They had a point. Our cache hit rate dropped every time we added a server, until we switched to consistent hashing.

The Problem with Regular Hashing

Regular hash distribution uses a simple formula: server = hash(key) % num_servers. This works until you add or remove a server.

# Regular hashing distribution
import hashlib

def stable_hash(key):
    # Built-in hash() is salted per process; MD5 keeps the output repeatable
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

num_servers = 4
servers = ['server-1', 'server-2', 'server-3', 'server-4']

for key in ['user:1', 'user:2', 'user:3', 'user:4']:
    server_idx = stable_hash(key) % num_servers
    print(f"{key} -> {servers[server_idx]}")

# Adding a server changes almost every mapping
num_servers = 5
servers.append('server-5')
for key in ['user:1', 'user:2', 'user:3', 'user:4']:
    server_idx = stable_hash(key) % num_servers
    print(f"{key} -> {servers[server_idx]}")  # Most keys map differently

With 4 servers, key user:1 might map to server-1. With 5 servers, it could end up on server-3 instead. Almost every key remaps to a different server. Cache miss rates spike. Database load follows.

This causes real problems in production. Rolling out a new cache server triggers a wave of cache misses. Every miss hits the database. Databases often cannot handle this sudden load.
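The scale of the churn is easy to measure. A quick standalone check (MD5 is used only to make the hash stable across runs) counts how many of 1,000 keys change servers when the pool grows from 4 to 5:

```python
import hashlib

def stable_hash(key):
    # Deterministic hash so the experiment is repeatable
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

keys = [f"user:{i}" for i in range(1000)]
before = {k: stable_hash(k) % 4 for k in keys}
after = {k: stable_hash(k) % 5 for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys remapped")  # roughly 80%
```

Roughly 80% of keys move: hash mod 4 and hash mod 5 agree on only 4 of every 20 residues, so only about 1 key in 5 keeps its server.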

How Consistent Hashing Works

Consistent hashing arranges both keys and servers on a hash ring. Keys map to the first server clockwise on the ring.

graph TD
    Ring[Hash Ring] -->|hash=50| K1[Key A]
    Ring -->|hash=150| K2[Key B]
    Ring -->|hash=250| K3[Key C]
    Ring -->|hash=75| S1[Server 1: 50-150]
    Ring -->|hash=175| S2[Server 2: 151-250]
    Ring -->|hash=275| S3[Server 3: 251-50]
    K1 --> S1
    K2 --> S2
    K3 --> S3

In this example, Key A (hash 50) maps to Server 1 (at position 75). Key B (hash 150) maps to Server 2 (at 175). Key C (hash 250) maps to Server 3 (at 275); a key hashing past 275 would wrap around to Server 1.

When you add Server 4 between Server 1 and Server 2, say at position 125, only keys in that gap remap: keys with hashes in 76-125 move from Server 2 to Server 4. Every other key keeps its server.

import hashlib

class ConsistentHash:
    def __init__(self, nodes=None):
        self.ring = {}
        self.sorted_keys = []
        if nodes:
            for node in nodes:
                self.add_node(node)

    def hash(self, key):
        # MD5 keeps ring positions stable across processes
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % (2 ** 32)

    def add_node(self, node):
        key = self.hash(node)
        self.ring[key] = node
        self.sorted_keys.append(key)
        self.sorted_keys.sort()

    def get_node(self, key):
        if not self.sorted_keys:
            return None
        hash_val = self.hash(key)
        for node_key in self.sorted_keys:
            if node_key >= hash_val:
                return self.ring[node_key]
        # Wrap around to first node
        return self.ring[self.sorted_keys[0]]

Virtual Nodes

Basic consistent hashing has a distribution problem. With few servers, keys do not distribute evenly. One server might get twice as many keys as another.

Virtual nodes solve this. Each physical server appears multiple times on the ring. This smooths out the distribution.

graph TD
    Ring[Hash Ring] -->|vn:10| S1[Server 1]
    Ring -->|vn:10| S2[Server 2]
    Ring -->|vn:10| S3[Server 3]
    S1a[Server 1: hash 25] --> Ring
    S1b[Server 1: hash 125] --> Ring
    S1c[Server 1: hash 225] --> Ring

With 3 physical servers and 100 virtual nodes each, the ring has 300 points. Each server handles roughly 1/3 of the keys. Adding a server adds 100 points and takes 1/4 of the keys from each existing server.

import hashlib

class ConsistentHash:
    def __init__(self, num_virtual=100):
        self.num_virtual = num_virtual
        self.ring = {}
        self.sorted_keys = []

    def hash(self, key):
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % (2 ** 32)

    def add_node(self, node):
        for i in range(self.num_virtual):
            key = self.hash(f"{node}:vn{i}")
            self.ring[key] = node
            self.sorted_keys.append(key)
        self.sorted_keys.sort()

    def remove_node(self, node):
        for i in range(self.num_virtual):
            key = self.hash(f"{node}:vn{i}")
            del self.ring[key]
            self.sorted_keys.remove(key)
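To sanity-check the smoothing claim, here is a throwaway standalone ring (MD5 plus bisect) that routes 10,000 keys across 3 nodes with 100 virtual nodes each and tallies the shares:

```python
import bisect
import hashlib

def h(key):
    # Stable 32-bit positions on the ring
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

nodes = ['server-1', 'server-2', 'server-3']
ring = {}
for node in nodes:
    for i in range(100):  # 100 virtual nodes per physical node
        ring[h(f"{node}:vn{i}")] = node
points = sorted(ring)

def owner(key):
    idx = bisect.bisect(points, h(key)) % len(points)  # wrap past the end
    return ring[points[idx]]

counts = {n: 0 for n in nodes}
for i in range(10_000):
    counts[owner(f"user:{i}")] += 1
print(counts)  # each node ends up near 1/3 of the keys
```

Run the same experiment with a single ring point per server and the split can come out badly lopsided; the 300 points above keep every share close to one third.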

Adding and Removing Nodes

When a node joins, it claims its portion of the ring. When a node leaves, its keys are claimed by the next clockwise neighbor.

# Before adding Server 4
# Server 1: keys 0-99
# Server 2: keys 100-199
# Server 3: keys 200-255

# After adding Server 4 (hash 125)
# Server 1: keys 0-99
# Server 4: keys 100-124
# Server 2: keys 125-199
# Server 3: keys 200-255

Only keys between the new node and its predecessor remap. In the example, keys 100-124 move from Server 2 to Server 4. Other keys stay where they are.

With virtual nodes, the redistribution is more granular. Each virtual node claims a small portion of keys. The impact of adding or removing a node is spread evenly.
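The minimal-movement property can be verified directly. This standalone sketch builds a ring with 50 virtual nodes per server, adds a fifth server, and checks which keys moved:

```python
import bisect
import hashlib

def h(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

def build_ring(nodes, vnodes=50):
    ring = {h(f"{n}:vn{i}"): n for n in nodes for i in range(vnodes)}
    return ring, sorted(ring)

def owner(ring, points, key):
    # First ring point at or after the key's hash, wrapping at the end
    return ring[points[bisect.bisect(points, h(key)) % len(points)]]

keys = [f"user:{i}" for i in range(1000)]
ring4, pts4 = build_ring(['s1', 's2', 's3', 's4'])
ring5, pts5 = build_ring(['s1', 's2', 's3', 's4', 's5'])

moved = [k for k in keys if owner(ring4, pts4, k) != owner(ring5, pts5, k)]
# Every key that moved now lives on the new server; nothing shuffled
# between the four old servers
assert all(owner(ring5, pts5, k) == 's5' for k in moved)
print(f"{len(moved) / len(keys):.0%} of keys remapped")  # about 1/5
```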

Applications

Distributed Caches

Memcached clients use consistent hashing. The client knows the cache servers and uses consistent hashing to figure out which server holds each key. Adding cache servers only affects keys that map to the new server.

Redis Cluster uses consistent hashing concepts, though with hash slots rather than a continuous ring. 16384 slots distribute across nodes. Key remapping happens at slot boundaries.

Amazon DynamoDB

DynamoDB uses consistent hashing for data distribution. Virtual nodes map to physical nodes. Data partitions redistribute when capacity changes.

Table partitions in DynamoDB distribute based on partition key hash. Consistent hashing ensures even distribution across partitions.

Content Delivery Networks

CDNs use consistent hashing to direct requests to origin servers. When an origin server fails, only requests for that server’s keys reroute. Other origins keep serving their keys without interruption.

Replication with Consistent Hashing

Consistent hashing extends to replication. A key’s primary replica sits on its assigned node. Additional replicas sit on subsequent nodes on the ring.

def get_replicas(self, key, num_replicas=3):
    """Walk clockwise from the key's position, collecting distinct nodes."""
    if not self.sorted_keys:
        return []
    replicas = []
    hash_val = self.hash(key)
    n = len(self.sorted_keys)
    # First ring point at or after the key's hash (wrap to 0 if none)
    start = next((i for i, k in enumerate(self.sorted_keys) if k >= hash_val), 0)
    for step in range(n):
        node = self.ring[self.sorted_keys[(start + step) % n]]
        # Adjacent ring points can belong to the same physical node, so dedupe
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == num_replicas:
            break
    return replicas

With replication factor 3, each key lives on 3 nodes. If one node fails, other replicas serve the data. The system continues operating without data loss.

Trade-offs

Consistent hashing has costs. The ring structure requires coordination. In distributed systems, nodes must agree on ring state.

Client-side consistent hashing puts complexity in clients. All clients must implement the same hashing logic. Configuration must stay synchronized.

Some systems use centralized routing. A coordinator tracks the ring and routes requests. This simplifies clients but introduces a dependency on the coordinator.

Consistent hashing trades simplicity for resilience. Regular hashing is simpler but breaks when things change. Consistent hashing handles change gracefully.

Comparison: Partitioning Strategies

| Dimension | Consistent Hashing | Range Partitioning | Hash Partitioning (Modulo) |
|---|---|---|---|
| Key Distribution | Even across nodes | Sequential ranges | Based on hash mod N |
| Add/Remove Nodes | Minimal remapping (neighboring keys only) | Entire ranges shift | Nearly all keys remap |
| Range Queries | Inefficient (may span multiple nodes) | Efficient (keys in range on same node) | Inefficient |
| Hot Spot Risk | Higher (adjacent keys on same node) | Lower (can split ranges) | Low (if hash spreads evenly) |
| Complexity | Medium-High | Low | Very Low |
| Metadata Overhead | Ring state per node | Partition map | None |
| Use Cases | Distributed caches, DHTs, CDNs | Time-series data, ordered scans | Key-value stores, random access |

When each strategy works best:

| Strategy | Best For | Avoid When |
|---|---|---|
| Consistent Hashing | Multi-node caching, horizontal scaling, replication | Keys need ordering, range scans |
| Range Partitioning | Time-series, lexicographic keys, ordered retrieval | High cardinality keys, random access |
| Hash Partitioning | Simple key-value, even distribution needed | Range queries required, key locality matters |

Rendezvous Hashing: An Alternative Approach

Rendezvous hashing (also called highest random weight hashing) is an alternative to consistent hashing that avoids the ring structure entirely. It works by computing a hash score for each key-server pair and picking the server with the highest score.

import hashlib

def rendezvous_hash(key, servers):
    """Pick server with highest hash score for this key."""
    best_server = None
    best_score = -1
    for server in servers:
        score = int(hashlib.sha256(f"{key}:{server}".encode()).hexdigest(), 16)
        if score > best_score:
            best_score = score
            best_server = server
    return best_server

# Example
key = "user:123"
servers = ["cache-1", "cache-2", "cache-3"]
print(rendezvous_hash(key, servers))  # Deterministic, but spreads keys evenly

The key difference from consistent hashing: rendezvous hashing needs no ring and no virtual nodes, and redistribution is provably minimal — removing a server moves only that server's keys, and adding a server moves only the keys that now score highest on it.
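That property is checkable in a few lines — rerun the scoring with one extra server and confirm every key that changed hands landed on the newcomer:

```python
import hashlib

def rendezvous_hash(key, servers):
    # Highest-random-weight: score every (key, server) pair, keep the max
    return max(servers,
               key=lambda s: int(hashlib.sha256(f"{key}:{s}".encode()).hexdigest(), 16))

keys = [f"user:{i}" for i in range(1000)]
old_pool = ["cache-1", "cache-2", "cache-3"]
new_pool = old_pool + ["cache-4"]

moved = [k for k in keys if rendezvous_hash(k, old_pool) != rendezvous_hash(k, new_pool)]
# A key's scores against the old servers are unchanged, so its winner can
# only change if the new server out-scores them all
assert all(rendezvous_hash(k, new_pool) == "cache-4" for k in moved)
print(f"{len(moved) / len(keys):.0%} of keys moved, all to cache-4")
```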

| Aspect | Consistent Hashing | Rendezvous Hashing |
|---|---|---|
| Structure | Hash ring | Pairwise hash scoring |
| Adding nodes | Remaps ~1/N of keys (neighboring ranges) | Remaps only keys that now score highest on the new nodes |
| Virtual nodes needed | Yes (for even distribution) | No |
| Computation per lookup | O(log N) binary search | O(N) — must hash all servers |
| Server-pinned requests | Easy (same ring lookup) | Easy (same scoring) |

Rendezvous hashing shines when you have a fixed set of servers and want minimal redistribution on changes, with no virtual nodes to tune. The tradeoff is the O(N) lookup cost — for large server pools, consistent hashing's O(log N) scales better.

Ring Rebalancing: Step by Step

When you add or remove nodes, the ring redistributes keys. Here is the sequence during a node addition:

sequenceDiagram
    participant R as Ring State
    participant N1 as Node 1
    participant N2 as Node 2
    participant N3 as Node 3
    participant N4 as New Node 4

    Note over R: Initial state: N1, N2, N3 on ring
    Note over R: N1 owns [0-100), N2 owns [100-200), N3 owns [200-300)
    N4->>R: Compute position: hash("node4") = 150
    Note over R: N4 inserts at position 150
    R->>N1: N4 added between N1 and N2
    N2->>R: Keys in [100, 150) migrate to N4
    N4->>R: Acknowledged
    Note over R: New state: N1 owns [0-100), N4 owns [100, 150), N2 owns [150-200), N3 owns [200-300)

For controlled rebalancing in production, add nodes gradually. Add one node, wait for the cluster to stabilize and data to redistribute, then add the next. Redis Cluster supports this workflow: redis-cli --cluster add-node brings a node in empty, and redis-cli --cluster reshard migrates hash slots to it incrementally.

If you need to move a large volume of keys quickly (for maintenance or rebalancing), use a double-linking approach: the old owner continues serving requests while the new owner pulls data in the background. Once caught up, switch the routing atomically.
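A minimal sketch of that hand-off, with hypothetical dict-backed stores standing in for the two cache nodes: reads prefer the new owner and fall back to the old one until the background copy catches up:

```python
class MigratingRouter:
    """Serve reads during a key migration: new owner first, old owner as fallback."""
    def __init__(self, old_store, new_store):
        self.old_store = old_store
        self.new_store = new_store
        self.cut_over = False  # flipped once the new owner has caught up

    def get(self, key):
        if self.cut_over:
            return self.new_store.get(key)
        # Double-linked phase: prefer the new owner, fall back to the old
        value = self.new_store.get(key)
        return value if value is not None else self.old_store.get(key)

# Hypothetical stores: plain dicts stand in for the two cache nodes
old_node = {"user:1": "alice", "user:2": "bob"}
new_node = {"user:1": "alice"}  # background copy is partway through

router = MigratingRouter(old_node, new_node)
print(router.get("user:2"))  # served by the old owner until the copy finishes
```

Once the new owner holds everything, flipping `cut_over` switches routing atomically for this client.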

MurmurHash vs MD5 vs SHA-256: Choosing a Hash Function

The hash function you use affects both distribution quality and performance. Here is how the main options compare:

import hashlib
import time

import mmh3  # MurmurHash3 (install: pip install mmh3)

def benchmark_hash(data, iterations=100000):

    # MD5
    start = time.time()
    for _ in range(iterations):
        hashlib.md5(data)
    md5_time = time.time() - start

    # SHA-256
    start = time.time()
    for _ in range(iterations):
        hashlib.sha256(data)
    sha_time = time.time() - start

    # MurmurHash3
    start = time.time()
    for _ in range(iterations):
        mmh3.hash(data)
    mmh3_time = time.time() - start

    return md5_time, sha_time, mmh3_time

Typical results on a modern laptop for 100,000 hashes of a 32-byte key:

| Hash Function | Speed (relative) | Output Bits | Collision Resistance |
|---|---|---|---|
| MurmurHash3 | ~10x faster than SHA-256 | 32 or 128 | Low (not cryptographic) |
| MD5 | ~2x faster than SHA-256 | 128 | Broken (do not use for security) |
| SHA-256 | baseline | 256 | High (cryptographic) |

Use MurmurHash3 for consistent hashing in production systems — it is fast and distributes keys evenly. Its non-cryptographic nature rarely matters for load distribution, provided untrusted clients cannot craft adversarial keys.

Use SHA-256 only if you need cryptographic guarantees (for example, preventing hash collision attacks where a client crafts keys that all hash to the same server). Most production deployments use MMH3 or xxHash for performance.

When to Use and When Not to Use Consistent Hashing

When to Use Consistent Hashing:

  • You need to distribute keys across multiple servers or shards
  • Server additions or removals should not cause massive key remapping
  • You are building a distributed cache, load balancer, or data store
  • You need horizontal scaling with minimal key redistribution
  • You want to achieve uniform key distribution across nodes

When Not to Use Consistent Hashing:

  • You have only a single server (add complexity only when needed)
  • Your system does not change frequently (static distribution is fine)
  • You need ordered key retrieval (consistent hashing does not maintain order)
  • You have strict partitioning requirements (use deterministic partitioning)
  • The operational complexity is not justified for your scale

Production Failure Scenarios

| Failure | Impact | Mitigation |
|---|---|---|
| Node failure | Keys mapped to failed node become unavailable | Implement replication, use virtual nodes for graceful degradation |
| Network partition | Ring becomes split, keys route incorrectly | Use quorum reads/writes, implement failure detection |
| Ring state inconsistency | Different clients route to different nodes | Use centralized ring state (e.g., ZooKeeper, etcd) |
| Hot spots on popular keys | Single key overwhelms one node | Implement key splitting, use replication for popular keys |
| Virtual node misconfiguration | Uneven distribution, some nodes overloaded | Monitor distribution metrics, validate virtual node count |
| Client ring cache staleness | Client uses outdated ring configuration | Implement cache invalidation, use short TTLs |
| Hash collision attacks | Malicious keys all hash to same point | Use salted hashes, limit key entropy |
| Addition of many nodes simultaneously | Temporary redistribution chaos | Add nodes gradually, use controlled rebalancing |

Observability Checklist

Metrics to Monitor:

  • Request distribution across nodes (detect hot spots)
  • Cache hit/miss ratio for distributed caches
  • Node availability and health
  • Key distribution balance (variance across nodes)
  • Request latency by node
  • Ring state version/consistency across clients
  • Failed request count due to routing errors
  • Replication lag for replicated keys

Logs to Capture:

  • Node join and leave events
  • Ring state changes
  • Routing decisions for debugging
  • Hot spot detection events
  • Failure detection and failover events
  • Client configuration changes

Alerts to Set:

  • Node becomes unavailable
  • Distribution skew exceeds threshold (e.g., > 20% from mean)
  • Cache hit ratio drops significantly
  • Request routing failures
  • Ring state inconsistency detected
  • Latency spike on specific nodes

# Example: consistent hashing distribution monitor
def monitor_distribution(ring, sample_keys):
    """Route sample keys through the ring and summarize node balance."""
    distribution = {}
    for key in sample_keys:
        node = ring.get_node(key)
        distribution[node] = distribution.get(node, 0) + 1

    mean = sum(distribution.values()) / len(distribution)
    variance = sum((v - mean) ** 2 for v in distribution.values()) / len(distribution)

    return {
        'distribution': distribution,
        'mean': mean,
        'stddev': variance ** 0.5,
        'skew_pct': (max(distribution.values()) - mean) / mean * 100
    }

Security Checklist

  • Secure ring state storage (ZooKeeper, etcd access controls)
  • Encrypt communication between nodes and clients
  • Authenticate node-to-node communication
  • Implement rate limiting to prevent hash collision attacks
  • Audit ring configuration changes
  • Use secure bootstrapping for new nodes
  • Protect client-to-node connections
  • Monitor for unauthorized ring modifications
  • Use TLS for all inter-node communication

Common Pitfalls and Anti-Patterns

  1. Too few virtual nodes: Insufficient virtual nodes cause uneven distribution. Use at least 100-200 virtual nodes per physical node.

  2. Not handling node failures gracefully: Without replication, failed nodes lose all their keys. Implement replica placement.

  3. Inconsistent ring state across clients: Different clients with different ring views route to wrong nodes. Use centralized ring state or strong synchronization.

  4. Ignoring hash function quality: Poor hash functions cause uneven distribution or predictability. Use a well-distributed function such as MurmurHash3 or xxHash, and reserve cryptographic hashes like SHA-256 for cases where clients can craft adversarial keys.

  5. Not testing redistribution: If you have never tested adding/removing nodes, you do not know the impact. Run chaos tests.

  6. Using consistent hashing when static partitioning suffices: If nodes never change, simple modulo hashing is fine. Do not add complexity unnecessarily.

  7. Forgetting about replication alongside consistent hashing: Consistent hashing handles distribution but not durability. Plan replication separately.

  8. Assuming linear scalability: Adding nodes does not proportionally increase capacity due to virtual node redistribution overhead.

Multi-Dimensional Consistent Hashing

Standard consistent hashing gives you one axis of distribution. That works fine until you need two axes. A geo-distributed database might want data-locality for both user region AND tenant ID. A single hash value cannot capture both.

Multi-dimensional consistent hashing solves this by treating each attribute as its own hash ring. Query each ring separately, then combine the results through weighted voting.

import bisect
import hashlib

class ConsistentHashRing:
    """Single-dimension consistent hash ring for use in multi-dimensional hashing."""
    def __init__(self, nodes=None, virtual_nodes=100):
        self.virtual_nodes = virtual_nodes
        self.ring = {}
        self.sorted_keys = []
        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % (2**32)

    def add_node(self, node):
        for i in range(self.virtual_nodes):
            key = self._hash(f"{node}:vn{i}")
            self.ring[key] = node
            self.sorted_keys.append(key)
        self.sorted_keys.sort()

    def get_node(self, key):
        if not self.sorted_keys:
            return None
        hash_val = self._hash(key)
        idx = bisect.bisect(self.sorted_keys, hash_val)
        if idx >= len(self.sorted_keys):
            idx = 0
        return self.ring[self.sorted_keys[idx]]


class MultiDimensionalHasher:
    """
    Multi-dimensional consistent hashing.
    Each dimension is an independent hash ring. Nodes are selected
    by combining votes from each dimension with weights.
    """
    def __init__(self, dimensions, nodes=None, virtual_nodes=100):
        """
        Initialize with `dimensions` independent hash rings.
        dimensions: list of dimension names, e.g., ['region', 'tenant']
        """
        self.dimension_names = dimensions
        self.rings = [ConsistentHashRing(nodes, virtual_nodes) for _ in dimensions]
        self.weights = [1.0] * len(dimensions)

    def get_node(self, key_attrs):
        """
        key_attrs: dict mapping dimension name to key value, e.g.,
                   {'region': 'us-east', 'tenant': 'tenant-42'}
        Returns the node with the highest weighted vote across all dimensions.
        """
        if isinstance(key_attrs, dict):
            # If passed as dict, extract values in dimension order
            attrs = [key_attrs.get(dim, '') for dim in self.dimension_names]
        else:
            # If passed as list/tuple, assume same order as dimensions
            attrs = list(key_attrs)

        votes = {}
        for i, attr in enumerate(attrs):
            node = self.rings[i].get_node(attr)
            votes[node] = votes.get(node, 0) + self.weights[i]

        return max(votes, key=votes.get)

    def get_replicas(self, key_attrs, num_replicas=3):
        """Get up to num_replicas distinct candidate nodes (at most one per dimension)."""
        if isinstance(key_attrs, dict):
            attrs = [key_attrs.get(dim, '') for dim in self.dimension_names]
        else:
            attrs = list(key_attrs)

        # Collect each dimension's winner with its weight, then keep the
        # highest-weighted distinct nodes
        candidates = []
        for i, attr in enumerate(attrs):
            candidates.append((self.rings[i].get_node(attr), self.weights[i]))
        candidates.sort(key=lambda c: c[1], reverse=True)

        replicas = []
        for node, _ in candidates:
            if node not in replicas:
                replicas.append(node)
            if len(replicas) == num_replicas:
                break
        return replicas


# Example: geo-distributed database with region and tenant dimensions
hasher = MultiDimensionalHasher(
    dimensions=['region', 'tenant'],
    nodes=['node-us-east-1', 'node-us-east-2', 'node-eu-west-1', 'node-ap-south-1'],
    virtual_nodes=50
)

# Keys are routed based on both region AND tenant
key1 = {'region': 'us-east', 'tenant': 'tenant-42'}
key2 = {'region': 'us-east', 'tenant': 'tenant-42'}  # Same tenant, same region
key3 = {'region': 'eu-west', 'tenant': 'tenant-42'}   # Different region

print(hasher.get_node(key1))  # Deterministic for this ring configuration
print(hasher.get_node(key2))  # Same node as key1 (identical attributes)
print(hasher.get_node(key3))  # May differ: the region dimension votes differently

How the voting works:

  1. For each dimension, hash the attribute value and find the winning node on that dimension’s ring
  2. Tally votes: each dimension contributes its winner with that dimension’s weight
  3. Return the node with the highest total vote

The result is data-locality along multiple axes. Tenants in the same region end up on regional nodes, but different tenants in that region still distribute across the nodes in it.

Use case: geo-distributed databases. Systems like DynamoDB Global Tables or Cassandra with multi-DC replication benefit from multi-dimensional hashing. You want reads to hit the nearest replica for latency, but you also want tenant-level isolation. By hashing tenant_id on one dimension and using proximity-based replica selection on another, you get both properties.

Trade-offs:

| Aspect | Single-Dimension CH | Multi-Dimensional CH |
|---|---|---|
| Routing logic | O(log N) per dimension | O(D * log N) where D = dimensions |
| Data locality | One axis only | Multiple axes |
| Implementation | Simpler | More complex |
| Debugging | Easier to trace | Must track multiple rings |

Handling Hash Collisions Gracefully

Hash functions map arbitrary input to fixed-size output. With a large enough key space, two different keys can produce the same hash value. This is a collision. A collision between two keys is harmless in consistent hashing — both simply map to the same node. The ambiguous case is a collision between ring positions: if two virtual nodes hash to the same point, which one owns it?

Most consistent hashing implementations use 32-bit or 64-bit hash values. With billions of keys, collisions are not theoretical — they happen in production.
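A birthday-bound estimate makes the claim concrete. The expected number of colliding pairs among K keys under an ideal b-bit hash is roughly K(K-1)/2 divided by 2^b:

```python
def expected_collisions(num_keys, hash_bits=32):
    # Birthday approximation: number of key pairs / size of hash space
    pairs = num_keys * (num_keys - 1) / 2
    return pairs / (2 ** hash_bits)

print(expected_collisions(10_000))     # well under one expected collision
print(expected_collisions(100_000))    # about one collision expected
print(expected_collisions(1_000_000))  # over a hundred expected collisions
```

So a 32-bit ring starts seeing collisions around the hundred-thousand-key mark; a 64-bit hash pushes the same threshold out by a factor of about four billion.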

How Collisions Manifest

When two keys hash to the same position on the ring, the first node clockwise from that position handles both keys. This is correct behavior, but it creates uneven distribution if many keys collide at the same point.

import hashlib

def simple_hash(key):
    """Standard hash - collisions happen."""
    return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % (2**32)

# Two distinct keys can, in principle, share a 32-bit hash
key1 = "user:1234567"
key2 = "user:2345678"
hash1 = simple_hash(key1)
hash2 = simple_hash(key2)

print(f"Key 1 hash: {hash1}")
print(f"Key 2 hash: {hash2}")
print(f"Same? {hash1 == hash2}")  # Almost certainly False for this one pair;
# across tens of thousands of keys, though, a 32-bit collision becomes likely

Consistent Collision Handling with Double Hashing

The standard approach is double hashing: use a secondary hash function when the first produces a collision.

import bisect
import hashlib
import mmh3  # MurmurHash3

class ConsistentHashWithCollisionHandling:
    def __init__(self, nodes=None, virtual_nodes=100):
        self.virtual_nodes = virtual_nodes
        self.ring = {}
        self.sorted_keys = []
        self.collisions = 0  # ring-position collisions resolved so far
        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash_primary(self, key):
        """Primary hash function - used for ring positions."""
        return mmh3.hash(str(key)) % (2**32)

    def _hash_secondary(self, key):
        """Secondary hash function - used for collision resolution."""
        # Different algorithm and salt, so collision patterns are independent
        return int(hashlib.sha256(f"{key}:secondary".encode()).hexdigest(), 16) % (2**32)

    def add_node(self, node):
        for i in range(self.virtual_nodes):
            key = self._hash_primary(f"{node}:vn{i}")
            # Ring-position collision: another virtual node already owns this
            # point. Re-hash with the independent secondary function until the
            # position is free, so ownership stays unambiguous.
            while key in self.ring:
                self.collisions += 1
                key = self._hash_secondary(f"{node}:vn{i}:{key}")
            self.ring[key] = node
            bisect.insort(self.sorted_keys, key)

    def get_node(self, key):
        if not self.sorted_keys:
            return None
        hash_val = self._hash_primary(key)
        # First ring point at or after the hash; wrap past the end of the ring
        idx = bisect.bisect_left(self.sorted_keys, hash_val)
        return self.ring[self.sorted_keys[idx % len(self.sorted_keys)]]

Trade-offs of Double Hashing

| Aspect | Single Hash | Double Hashing |
|---|---|---|
| Collision probability | Higher (birthday paradox) | Lower (two independent hashes) |
| Lookup speed | O(log N) | O(log N) |
| Implementation | Simpler | Slightly more complex |
| Memory overhead | None | None |
| Distribution quality | May degrade with many collisions | More even distribution |

Double hashing adds minimal overhead. The secondary hash is only consulted when two ring positions collide, which is rare for well-chosen hash functions. Lookups stay O(log N); the extra hashing cost is paid once, at insertion time.

Production Considerations

Monitor collision rates. Track how often the secondary hash is used. A rate above 0.1% suggests your hash function has poor distribution for your key set.

Use independent hash families. The secondary hash should use a different algorithm than the primary. If both use MD5, they may share collision patterns. Mixing MD5 with MurmurHash3 gives you independent distributions.

Consider bounded collision domains. If your keys have known structure (e.g., user:ID where ID is numeric), ensure your hash function distributes that specific pattern well. Some hash functions perform poorly on sequential or low-entropy keys.

Jump hashing is an alternative worth knowing about. Google's Jump Consistent Hashing avoids the ring structure entirely: keys map to buckets through a deterministic sequence of "jumps", so there are no ring positions to collide, and growing the bucket count moves only the keys that must move. The constraint is that buckets are numbered 0..N-1 and can only be added or removed at the top of the range, which suits storage systems with orderly shard growth better than clusters with arbitrary node churn.

def jump_hash(key, num_buckets):
    """
    Google's Jump Consistent Hashing (Lamping and Veach).
    key must be an integer; returns a bucket in [0, num_buckets).
    """
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step from the original paper
        key = (key * 2862933555777941757 + 1) % (2 ** 64)
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

Jump hashing sidesteps the birthday paradox rather than resolving it — with no ring positions, there is nothing to collide. For systems that only ever grow or shrink the bucket count at the tail, it avoids ring rebalancing complexity entirely.
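The no-unnecessary-movement guarantee is easy to verify: growing from 10 to 11 buckets, every key either keeps its bucket or moves to the new bucket 10 — never between existing buckets. (The function is repeated here, following the published algorithm, so the snippet runs standalone.)

```python
def jump_hash(key, num_buckets):
    """Jump consistent hash: maps an integer key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) % (2 ** 64)
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

moved = 0
for key in range(1000):
    before = jump_hash(key, 10)
    after = jump_hash(key, 11)
    # Keys never shuffle among existing buckets; they only move to the new one
    assert after == before or after == 10
    moved += after != before
print(f"{moved} of 1000 keys moved")  # roughly 1/11 of keys
```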

Real-World Case Studies

Amazon DynamoDB: Consistent Hashing with Partition Management

DynamoDB uses consistent hashing but with a twist: the key space is split into partitions by partition key hash, and each partition is served by multiple storage nodes for replication. When you provision throughput, DynamoDB allocates capacity across partitions — not across a raw consistent hash ring.

The practical lesson: pure consistent hashing handles distribution well but does not handle hot spots. DynamoDB’s partition key architecture means that a single partition can handle roughly 1000 WCUs and 3000 RCUs. If your access pattern sends all traffic to one partition key (say, user_id = “alice” for every request), you saturate one partition regardless of your total provisioned throughput. The fix is adding a random suffix to partition keys — this distributes load across multiple partitions.
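A sketch of that suffix trick (the names and shard count are illustrative, not part of the DynamoDB API): writes scatter one hot logical key across several physical partition keys, and reads fan out across every suffix:

```python
import random

NUM_SHARDS = 10  # tune to the per-key throughput you need

def write_key(partition_key):
    """Scatter writes for one hot logical key across NUM_SHARDS physical keys."""
    return f"{partition_key}#{random.randrange(NUM_SHARDS)}"

def read_keys(partition_key):
    """Reads must fan out: query every suffix and merge the results."""
    return [f"{partition_key}#{i}" for i in range(NUM_SHARDS)]

hot = "user:alice"
samples = {write_key(hot) for _ in range(1000)}
print(sorted(samples))      # writes spread over user:alice#0 .. user:alice#9
print(len(read_keys(hot)))  # 10
```

The trade-off is the classic one: write throughput multiplies by the shard count, while reads pay a fan-out of the same factor.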

DynamoDB Streams adds another wrinkle: it exposes item-level changes as an ordered sequence of records, so consumers can track precisely how far they have read after a write.

Cassandra: Virtual Nodes and the Memcached Precedent

Memcached's ketama library popularized ring-based consistent hashing for caches: MD5 hashing with on the order of 100-160 ring points per server, designed specifically to avoid mass remapping when servers join or leave. Cassandra applies the same ring idea to storage, assigning each node one or more tokens on a hash ring.

The known gotcha: early Cassandra assigned a single token per node, so small clusters were easy to imbalance — distribution depended entirely on how tokens were chosen, and a 3-node cluster could carry severe skew. Virtual nodes (the num_tokens setting, introduced in Cassandra 1.2 with a default of 256) fixed this by giving each node many small ranges. The operational lesson stands: validate distribution metrics on any consistent hashing cluster, especially when node or token counts are low.

Memcached: The Client-Side Hashing Problem

Memcached itself has no server-side consistent hashing — the client decides which server to use. This means every client library must implement consistent hashing independently. This caused fragmentation: some clients used ketama, others used simple modulo, others used rendezvous hashing.

The operational failure mode: rolling out a new Memcached client version that changed the hashing algorithm. All keys that hashed differently would miss, flooding the database. This has bitten multiple large deployments — a client update silently changes key placement, cache miss rates spike, and the misses translate directly into database CPU spikes.

The mitigation: hash ring versioning. When introducing a new hash algorithm, run both the old and new ring simultaneously for a migration window, and drain keys from the old ring before decommissioning it.
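A sketch of that migration window, with hypothetical old_ring/new_ring placement functions: clients consult the new placement first and keep the legacy one as a fallback until the drain completes:

```python
class VersionedRouter:
    """Route cache lookups during a hash-algorithm migration.

    Both placements stay live for the migration window; misses on the
    new placement fall back to the old one, so the hit rate degrades
    gradually instead of collapsing at cut-over.
    """
    def __init__(self, old_ring, new_ring):
        self.old_ring = old_ring  # legacy placement function
        self.new_ring = new_ring  # new placement function

    def candidate_servers(self, key):
        # Try the new placement first, then the legacy one
        servers = [self.new_ring(key), self.old_ring(key)]
        return list(dict.fromkeys(servers))  # dedupe, preserve order

# Hypothetical placements for illustration only
old_ring = lambda key: f"cache-{hash(key) % 4}"          # legacy modulo
new_ring = lambda key: f"cache-{(hash(key) * 31) % 5}"   # new algorithm

router = VersionedRouter(old_ring, new_ring)
print(router.candidate_servers("user:42"))  # new placement first, old as fallback
```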

Interview Questions

Q: You have a distributed cache with 10 nodes using consistent hashing and 100 virtual nodes per physical node. A node fails. How many keys, approximately, need to be remapped?

A: With 10 physical nodes and 100 virtual nodes each, there are 1000 points on the ring. Each physical node owns roughly 1/10 of the ring. When one node fails, its 100 virtual nodes are removed, and keys that previously mapped to those points are redistributed to the next clockwise nodes. In total, approximately 1/10 of keys (10%) would need to remap — specifically, the keys that were mapped to the 100 virtual node points of the failed node.

Q: What is the difference between consistent hashing and rendezvous hashing? When would you choose rendezvous over consistent hashing?

A: Rendezvous hashing (highest random weight) computes a score for each key-server pair and picks the server with the highest score. It avoids the ring structure entirely: removing a server only remaps the keys that server owned, and adding a server only pulls in the keys the new server now wins, with no neighbor effects. Consistent hashing's advantage is O(log N) lookup versus O(N) for rendezvous, since rendezvous must score every server per lookup. Choose rendezvous when the server set is small or changes rarely, you want minimal redistribution without any ring state to synchronize across clients, and you can afford the O(N) per-lookup cost.
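A minimal rendezvous sketch, assuming md5 as the scoring hash (the choice of hash is arbitrary here), makes the minimal-disruption property concrete:

```python
# Rendezvous (highest-random-weight) hashing: score every (node, key)
# pair and pick the winner. O(N) per lookup, no ring state.
import hashlib

def rendezvous(key, nodes):
    return max(nodes, key=lambda n: int(
        hashlib.md5(f"{n}:{key}".encode()).hexdigest(), 16))

nodes = ["a", "b", "c"]
print(rendezvous("user:123", nodes))

# Minimal disruption: removing "c" only moves the keys "c" owned,
# because every other key's winning score is still the winner.
for i in range(1000):
    k = f"user:{i}"
    if rendezvous(k, nodes) != "c":
        assert rendezvous(k, ["a", "b"]) == rendezvous(k, nodes)
```

The property follows directly from the construction: scores depend only on the (node, key) pair, so shrinking the candidate set can never change the winner unless the winner itself was removed.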

Q: Virtual nodes solve a specific problem in consistent hashing. Explain the problem and the trade-off introduced.

A: Virtual nodes solve the uneven distribution problem when physical node counts are small. Without virtual nodes, a 3-node ring has only 3 hash points, and one node can easily own far more than a third of the ring. With 100 virtual nodes per physical node, the law of large numbers smooths the distribution. The trade-off is metadata overhead: instead of storing N ring entries you store N×V, so ring construction and membership changes cost more, though lookup remains logarithmic (O(log(N×V)) with a sorted ring). A side benefit of the finer granularity: when a node fails, its load spreads across many successor nodes instead of landing entirely on a single neighbor.
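The smoothing effect is easy to measure. The skew metric below is an assumed illustration (max relative deviation of any node's key share from the ideal 1/N), not a standard benchmark:

```python
# Compare distribution skew on a 3-node ring with and without vnodes.
import hashlib
import bisect

def max_skew(nodes, vnodes, samples=10000):
    """Max relative deviation of any node's key share from 1/N."""
    def h(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)
    points = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
    counts = {n: 0 for n in nodes}
    for i in range(samples):
        idx = bisect.bisect(points, (h(f"key:{i}"), ""))
        counts[points[idx % len(points)][1]] += 1
    ideal = samples / len(nodes)
    return max(abs(c - ideal) / ideal for c in counts.values())

nodes = ["n1", "n2", "n3"]
print(f"1 vnode/node:    {max_skew(nodes, 1):.0%} max skew")
print(f"200 vnodes/node: {max_skew(nodes, 200):.0%} max skew")
```

With a single point per node, one node typically owns well over a third of the ring; with 200 points per node, shares cluster tightly around 1/3.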

Q: Your distributed cache hit rate dropped from 95% to 60% after a routine deployment that added two new cache nodes. What happened and how would you diagnose it?

A: The most likely cause is cold cache on the new nodes: they joined the hash ring and immediately started receiving requests for their ranges before any data was warmed. Note the arithmetic, though: going from 10 to 12 nodes remaps roughly 2/12 ≈ 17% of keys, which explains a transient miss spike of at most ~17 percentage points that decays as the new nodes fill, not a sustained 35-point drop. Possible additional causes: the new nodes have different configuration causing connection errors, the client hashing algorithm changed with the deployment, or the rebalance inadvertently invalidated a large portion of existing cached data. Diagnosis: compare cache key distribution before and after, check client logs for connection errors, verify that all clients see the same ring state, and confirm whether pre-warming ran after the deployment.

Q: A system uses consistent hashing for data distribution across 5 nodes with replication factor 3. One node is down for 30 minutes. Describe what happens to reads and writes during this window.

A: For writes with replication factor 3, the coordinator sends each write to the three replicas responsible for that key's range. If one replica sits on the failed node, the write still succeeds as long as the configured consistency level is met by the surviving replicas: ONE and QUORUM (2 of 3) remain satisfiable, ALL does not. For reads at consistency level ONE, contacting a single stale replica can return stale data; at QUORUM, a read touches 2 of the 3 replicas and, paired with QUORUM writes, still returns fresh data. The key risk is a second failure in the window: lose another replica for the same range and QUORUM operations on those keys become unavailable. After the node recovers, hinted handoff or anti-entropy repair replays what it missed, so up to 30 minutes of writes must be caught up.
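Dynamo-style systems place those replicas by walking clockwise from the key's position and collecting the first RF distinct physical nodes, skipping virtual nodes that belong to an owner already chosen. A minimal sketch under assumed parameters (md5 ring, 100 virtual nodes per node):

```python
# Replica placement: first RF distinct physical nodes clockwise.
import hashlib
import bisect

def replicas(points, key, rf=3):
    """points: sorted (hash, node) ring; returns rf distinct owners."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    idx = bisect.bisect(points, (h, ""))
    owners = []
    for step in range(len(points)):
        node = points[(idx + step) % len(points)][1]
        if node not in owners:  # skip repeat vnodes of the same owner
            owners.append(node)
        if len(owners) == rf:
            break
    return owners

nodes = [f"n{i}" for i in range(5)]
points = sorted(
    (int(hashlib.md5(f"{n}#{v}".encode()).hexdigest(), 16), n)
    for n in nodes for v in range(100)
)
print(replicas(points, "user:42", rf=3))  # three distinct physical nodes
```

A coordinator would then apply the consistency-level check on top of this list, e.g. a QUORUM write needs acknowledgments from at least rf // 2 + 1 of the returned owners.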

Quick Recap

Key Bullets:

  • Consistent hashing minimizes key redistribution when nodes join or leave
  • Virtual nodes smooth out uneven distribution across the ring
  • Keys map to the first node clockwise on the ring
  • Replication placement extends consistent hashing for durability
  • Ring state consistency across clients is critical for correct routing
  • Monitor distribution balance to detect hot spots and configuration errors

Copy/Paste Checklist:

# Minimal consistent hashing implementation with monitoring
import hashlib
import bisect

class ConsistentHash:
    def __init__(self, nodes=None, virtual_nodes=100):
        self.virtual_nodes = virtual_nodes
        self.ring = {}
        self.sorted_keys = []

        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash(self, key):
        # Map any key onto a 32-bit position on the ring.
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % (2**32)

    def add_node(self, node):
        # Place virtual_nodes points on the ring for this physical node.
        for i in range(self.virtual_nodes):
            key = self._hash(f"{node}:vn{i}")
            self.ring[key] = node
            bisect.insort(self.sorted_keys, key)

    def remove_node(self, node):
        for i in range(self.virtual_nodes):
            key = self._hash(f"{node}:vn{i}")
            self.ring.pop(key)
            self.sorted_keys.remove(key)  # O(n) per removal; fine at this scale

    def get_node(self, key):
        if not self.sorted_keys:
            return None
        hash_val = self._hash(key)
        # First ring point clockwise from the key's position...
        idx = bisect.bisect(self.sorted_keys, hash_val)
        if idx >= len(self.sorted_keys):
            idx = 0  # ...wrapping around past the top of the ring
        return self.ring[self.sorted_keys[idx]]

# Usage
ch = ConsistentHash(['node1', 'node2', 'node3'], virtual_nodes=200)
print(ch.get_node('user:123'))  # Returns which node handles this key

Further Reading:

  • Distributed Systems Roadmap - Consistent hashing is a foundational concept for distributed systems, enabling data partitioning, leader election, and replication across node clusters without requiring global coordination

Conclusion

Consistent hashing solves the distribution problem in large-scale systems. When nodes join or leave, only neighboring keys remap. Virtual nodes smooth out uneven distribution.

You will find this pattern in distributed caches, databases, and CDNs. Memcached clients, DynamoDB, and Cassandra all use variations of consistent hashing.

For related reading, see Horizontal Sharding for database sharding strategies, and Database Scaling for scaling approaches.


Related Posts

Cache Stampede Prevention: Protecting Your Cache

Learn how single-flight, request coalescing, and probabilistic early expiration prevent cache stampedes that can overwhelm your database.

#cache #cache-stampede #performance

Microservices vs Monolith: Choosing the Right Architecture

Understand the fundamental differences between monolithic and microservices architectures, their trade-offs, and how to decide which approach fits your project.

#microservices #monolith #architecture

Asynchronous Replication: Speed and Availability at Scale

Learn how asynchronous replication works in distributed databases, including eventual consistency implications, lag monitoring, and practical use cases where speed outweighs strict consistency.

#distributed-systems #replication #eventual-consistency