Skip Lists: Layered Linked Lists for Fast Search

Understand skip lists as probabilistic alternatives to balanced trees, providing O(log n) search with simple implementation and lock-free variants.

published: April 7, 2026 reading time: 16 min read author: GeekWorkBench

Skip Lists: Layered Linked Lists for Fast Search

Skip lists are a probabilistic data structure that gives you O(log n) search, insertion, and deletion—comparable to balanced trees—but with dramatically simpler implementation. The idea is brilliant: layer multiple linked lists on top of each other, where higher layers act as “express lanes” that skip over many elements in the lower layers. Searching starts from the top layer and descends until finding the target or reaching the bottom.

The magic is in the probabilistic layer assignment. Each element has a 50% chance of appearing in level 1, 25% chance of appearing in level 2, 12.5% in level 3, and so on. This gives you roughly log₂(n) layers on average, with O(log n) expected search time. Because there’s no rebalancing logic (unlike AVL or Red-Black trees), concurrent implementations are far simpler—useful for high-throughput databases like Redis, which uses skip lists for sorted set operations.

Introduction

Skip lists are a probabilistic alternative to balanced trees that delivers O(log n) search, insertion, and deletion—comparable to AVL or Red-Black trees—but with simpler implementation. The idea is to layer multiple linked lists on top of each other, where higher layers act as express lanes that skip over many elements in the lower layers. Searching starts from the top layer and descends until finding the target or reaching the bottom.

The layer distribution comes from a geometric random process. Each element has a 50% chance of appearing in level 1, 25% in level 2, 12.5% in level 3, and so on. This guarantees roughly log₂(n) layers on average. Since there’s no rebalancing logic—no rotations to perform when inserts disrupt the balance—concurrent implementations are far simpler than their tree counterparts. This makes skip lists valuable in high-throughput systems like Redis, which uses them for sorted set operations.

This guide covers the search, insertion, and deletion algorithms with full implementations, explains the probabilistic analysis behind O(log n) expected performance, and discusses the practical advantages over balanced trees. You’ll learn when skip lists make more sense than trees (concurrent access, simple implementation, memory locality) and when their probabilistic guarantees might not be sufficient (systems requiring deterministic worst-case bounds).

When to Use

Skip lists excel when:

Need sorted iteration — Unlike hash tables, maintains order
Concurrent access — Lock-free variants exist; simpler than tree write locks
Range queries — Efficiently iterate over sorted ranges
Simple implementation — Half the code of balanced trees with equivalent performance
Insertions interleaved with traversals — No rebalancing disruption

When Not to Use

When deterministic O(log n) is required (probabilistic structures have non-zero chance of degrade)
Memory is extremely constrained (skip lists use 2× pointers on average per node)
When you need more than simple ordering (trees can be augmented with subtree metadata)

Architecture: Multi-Layer Express Lanes

graph LR
    subgraph L3["Level 3 (fastest)"]
        A3["1"] --> D3["10"]
    end
    subgraph L2["Level 2"]
        A2["1"] --> B2["3"] --> D2["10"]
    end
    subgraph L1["Level 1 (base)"]
        A1["1"] --> B1["3"] --> C1["5"] --> D1["10"]
    end
    subgraph L0["Bottom (all elements)"]
        A0["1"] --> B0["3"] --> C0["5"] --> D0["7"] --> E0["10"] --> F0["12"]
    end

Search from top level, move horizontally until overshoot, then descend. Higher levels skip more elements.

Implementation

Basic Skip List

import random

class SkipListNode:
    def __init__(self, value, max_level=16):
        self.value = value
        self.forward = [None] * max_level


class SkipList:
    """
    Skip list with O(log n) search, insert, delete.
    Probability p = 0.5 for level promotion.
    """

    def __init__(self, max_level=16, p=0.5):
        self.max_level = max_level
        self.p = p
        self.header = SkipListNode(float('-inf'), max_level)
        self.level = 0  # Current max level

    def _random_level(self):
        """Generate random level with geometric distribution."""
        level = 0
        while random.random() < self.p and level < self.max_level - 1:
            level += 1
        return level

    def search(self, target):
        """
        Find element or return None.
        Time: O(log n) expected
        """
        current = self.header
        update = [None] * self.max_level

        # Start from top level, descend
        for i in range(self.level, -1, -1):
            while current.forward[i] and current.forward[i].value < target:
                current = current.forward[i]
            update[i] = current

        current = current.forward[0]

        if current and current.value == target:
            return current.value
        return None

    def insert(self, value):
        """
        Insert element into skip list.
        Time: O(log n) expected
        """
        update = [None] * self.max_level
        current = self.header

        for i in range(self.level, -1, -1):
            while current.forward[i] and current.forward[i].value < value:
                current = current.forward[i]
            update[i] = current

        current = current.forward[0]

        if current is None or current.value != value:
            new_level = self._random_level()

            if new_level > self.level:
                for i in range(self.level + 1, new_level + 1):
                    update[i] = self.header
                self.level = new_level

            new_node = SkipListNode(value, self.max_level)

            for i in range(new_level + 1):
                new_node.forward[i] = update[i].forward[i]
                update[i].forward[i] = new_node

    def delete(self, value):
        """
        Delete element from skip list.
        Time: O(log n) expected
        """
        update = [None] * self.max_level
        current = self.header

        for i in range(self.level, -1, -1):
            while current.forward[i] and current.forward[i].value < value:
                current = current.forward[i]
            update[i] = current

        current = current.forward[0]

        if current and current.value == value:
            for i in range(self.level + 1):
                if update[i].forward[i] == current:
                    update[i].forward[i] = current.forward[i]

            while self.level > 0 and self.header.forward[self.level] is None:
                self.level -= 1

            return True
        return False

    def display(self):
        """Visualize skip list levels."""
        for i in range(self.level, -1, -1):
            line = f"Level {i}: "
            node = self.header.forward[i]
            while node:
                line += f"[{node.value}] -> "
                node = node.forward[i]
            print(line + "None")

Production Failure Scenarios

Unbalanced structure from bad randomness: Every skip list implementation eventually hits a streak of bad random levels. I’ve seen production systems where a faulty RNG seed caused search times to spike 10× above normal. Use a proper seed source (time-based or CSPRNG at startup), not whatever default your language gives you.
Memory overhead from high max_level: Each node stores max_level forward pointers. Set max_level too high relative to your actual element count and you’re burning memory for no benefit. A good rule: max_level = log₂(expected_max_elements) + 3 gives you breathing room without waste.
Worst-case O(n) performance under adversarial input: The “expected” in O(log n) expected is doing a lot of work. An attacker who knows your random level generation can craft insertions that produce pathological structures. If you’re in an adversarial context (network-facing), use a CSPRNG and add level jitter.

Common Pitfalls / Anti-Patterns

Confusing level indexing — Level 0 is the base list with all elements
Off-by-one in random level — Should be < max_level - 1 in while condition
Not updating level after delete — Must check if top level became empty
Using fixed random seed in production — May cause predictable patterns under attack

Observability Checklist

Track skip list operations to catch probabilistic degradation and correctness issues.

Core Metrics

Search operation count and latency per call
Insert/delete operation count
Average level vs expected log₂(n) — if this drifts above 2×, something’s wrong with level generation
Level distribution across all nodes — watch for too many nodes at level 0
Current level count vs max_level — hitting max_level constantly means you need to raise it

Health Signals

Search latency climbing: your list might be unbalanced or your random source might be broken
Average level above 2× log₂(n): random level generation is biased or under attack
Level count hitting max_level frequently: either traffic is higher than expected or an attacker is probing
Delete operations not reducing level: this is a bug in your level maintenance logic
Search returning wrong elements: memory corruption or a concurrent write race

Alerting Thresholds

Search latency > 2× log₂(n) consistently: probabilistic degradation happening
Level distribution skew > 3σ from expected: random generator issue or deliberate attack
Level count == max_level for extended periods: capacity planning needed
Any null pointer in forward chain during traversal: memory corruption — page it immediately

Distributed Tracing

Trace search with level visited count and comparisons per level
Include level distribution in span metadata — you’ll want this when debugging slow searches
Track insert/delete with new level assigned
Correlate slow searches with specific level distributions

Trade-off Analysis

Aspect	Skip Lists	Balanced Trees (AVL/Red-Black)	B-Trees
Search Time	O(log n) expected	O(log n) guaranteed	O(log n) guaranteed
Insert/Delete	O(log n) expected	O(log n) + rebalancing	O(log n) + page splits
Implementation	~200 lines	~500 lines	~400 lines
Concurrent Access	Lock-free CAS possible	Complex tree rotations	Page-level locking
Memory Overhead	2× pointers/node avg	2-3 pointers/node	Pointer + page metadata
Worst-case	O(n) with bad RNG	O(log n) always	O(log n) always
Cache Friendliness	Poor (pointer chasing)	Moderate	Excellent (sequential page access)
Range Queries	O(log n + k)	O(log n + k)	O(log n + k)

When to prefer each:

Skip lists: In-memory, simple concurrent access, rapid iteration
Balanced trees: Guaranteed performance, simpler disk layout
B-trees: Disk-based storage, page-oriented access patterns

Security Notes

Random seed manipulation: If an attacker controls or predicts your random seed, they can force your skip list into worst-case behavior. I once traced a latency spike to an adversarial client who had reverse-engineered our level-0-only attack pattern. The fix was switching to a CSPRNG seeded at process start — don’t use language defaults in security-sensitive paths.

Level 0 targeted attacks: Someone watching your list (through timing side channels or memory access patterns) can infer your level distribution and craft insertions that bury specific elements at level 0, inflating search time for targets they care about. Add padding to level generation so observations don’t reveal structure.

Concurrent modification races: In concurrent environments, a search can cross a level while another thread is updating pointers at that level, seeing partially constructed nodes or stale forward pointers. Naive locking helps but doesn’t fully solve it — use lock-free CAS operations and node versioning so readers can detect and retry when they see an in-progress update.

Quick Recap Checklist

Level 0 contains all elements (like base linked list)
Higher levels skip elements in lower levels
Random level follows geometric distribution (p^k probability)
Search descends from top level, moves horizontally, repeats
O(log n) is expected (probabilistic), not guaranteed

Interview Questions

1. Why is skip list O(log n) expected time?

By design, approximately n/2 elements exist at level 0, n/4 at level 1, n/8 at level 2, and so on—halving at each level. This creates log₂(n) levels on average. Searching at each level traverses elements at that level until overshooting the target, then descends. The random level generation using geometric distribution (each level has probability p of reaching next level) maintains this structure with high probability.

2. Why does Redis use skip lists instead of balanced trees?

Redis needs sorted sets with range operations. Skip lists provide O(log n) operations with simpler concurrent implementation than balanced trees. Tree rotations during rebalancing require complex locking. Skip list insertions only modify neighboring pointers—parts can be locked independently without restructuring the entire structure. This makes lock-free skip list variants practical for high-throughput scenarios.

3. How does skip list compare to B-trees for disk-based storage?

Skip lists map well to disk with locality of reference—elements at higher levels can be stored in sequential blocks like B-trees. The difference is in traversal: skip lists scan horizontally within a level, while B-trees traverse vertically through fixed-size node pages. For in-memory use, skip lists have simpler code; for disk-based systems, B-trees are often preferred due to predictable access patterns and better caching behavior.

4. What happens during a search operation in a skip list?

Expected answer points:

Start at header node at highest level
Move horizontally while next node's value < target
When next node overshoots, descend one level
Repeat until reaching level 0, then check forward pointer
Return target if found, else null

5. Why is random level generation using geometric distribution important?

Expected answer points:

Ensures ~50% nodes at level 0, ~25% at level 1, etc.
Creates log₂(n) levels on average naturally
No manual rebalancing needed unlike trees
Probability p=0.5 balances height vs memory

6. What is the worst-case scenario for skip list performance?

Expected answer points:

O(n) when all nodes happen to be at level 0
Caused by bad RNG or adversarial insertion
Probability is astronomically low: (1/2)^n
Mitigated by CSPRNG and level jitter in production

7. How do lock-free skip lists work?

Expected answer points:

Use CAS (Compare-And-Swap) for pointer updates
Mark nodes as logically deleted before physical removal
Readers traverse through logically deleted nodes
Node versioning detects concurrent modifications
More scalable than coarse-grained tree locks

8. What metrics should you monitor in production skip list implementations?

Expected answer points:

Average level vs expected log₂(n)
Search latency distribution
Level distribution skew
Current level count vs max_level
Memory usage per node

9. How would you implement a skip list from scratch?

Expected answer points:

Define node structure with value and forward pointer array
Implement random_level() using geometric distribution (while random() < p)
Search finds predecessor nodes at each level, stores in update array
Insert creates new node, updates forward pointers at each level
Delete unlinks node and updates pointers, then adjusts level

10. What is the memory overhead of a skip list compared to arrays and linked lists?

Expected answer points:

Each node stores multiple forward pointers (max_level on average 2-4)
Average ~2× pointers per node vs 1-2 for simple linked list
Worst case: O(n log n) total pointers, but expected O(n log n) overall
More compact than balanced trees (no parent/size metadata needed)
Trade-off: memory for speed (log n search vs O(n) scan)

11. Can skip lists be used for write-heavy workloads?

Expected answer points:

Yes, O(log n) insertion and deletion are efficient
No rebalancing overhead unlike balanced trees
Lock-free variants scale well under concurrent writes
Random level generation adds slight CPU overhead
Better than B-trees for workloads with many modifications

12. How does the choice of probability p affect skip list performance?

Expected answer points:

p=0.5 is standard (50% chance of reaching next level)
Higher p (e.g., 0.75) creates taller lists, faster search but more memory
Lower p (e.g., 0.25) creates flatter lists, slower search but less memory
p controls the geometric distribution of levels
Most implementations use p=0.5 as the sweet spot

13. What is the relationship between max_level and performance?

Expected answer points:

max_level limits the height of the skip list structure
Higher max_level allows more levels, potentially faster search
But each node allocates max_level pointers regardless of actual level
Set max_level = log₂(expected_max_elements) + 3 for safety
Hitting max_level constantly means you need to increase it

14. How would you handle range queries in a skip list?

Expected answer points:

Search for the start element using normal skip list search
Once found, traverse level 0 forward collecting elements
Stop when reaching end of range or end of list
Skip lists are efficient for range queries: O(log n + k)
Better than hash maps for ordered range operations

15. What happens during concurrent insertions in a skip list?

Expected answer points:

Multiple threads may try to update the same level's forward pointers
Without synchronization, pointers can become corrupted
Lock-free approaches use CAS to atomically update pointers
Logical deletion marks nodes before physical removal
Proper ordering prevents readers from seeing inconsistent state

16. How does skip list compare to hash maps for key-value storage?

Expected answer points:

Hash maps: O(1) average lookup, no ordering guarantees
Skip lists: O(log n) lookup, but maintains sorted order
Skip lists support range queries, hash maps do not
Hash maps have better cache locality
Skip lists better for applications needing ordered iteration

17. What are the implications of using skip lists in database indexes?

Expected answer points:

Skip lists provide O(log n) point queries and range scans
Redis uses skip lists for sorted sets (ZSET)
Simpler to implement than B-trees for in-memory indexes
Can be made durable with write-ahead logging
Concurrency control is simpler than tree structures

18. How would you optimize a skip list for cache locality?

Expected answer points:

Allocate nodes in cache-friendly patterns (batches or slabs)
Keep level 0 nodes close together in memory
Higher level pointers span larger ranges naturally
Avoid scattered allocations that cause cache misses
Consider using a pool allocator for nodes

19. What is the role of the header node in a skip list?

Expected answer points:

Header node is a sentinel that marks the start of each level
Contains forward pointers at all levels
Never stores actual data (uses -∞ or null value)
Provides consistent starting point for search
All levels initially point to header (no elements yet)

20. How does skip list handle duplicate values?

Expected answer points:

Standard skip lists can store duplicates at level 0
Search returns the first occurrence found
Level-based skipping means only one copy appears in higher levels
Delete removes first match found
Alternative: store (value, unique_id) tuples for total ordering
Redis ZSET allows scored duplicates (different scores = different entries)

Skip Lists: Layered Linked Lists for Fast Search

Introduction

When to Use

When Not to Use

Architecture: Multi-Layer Express Lanes

Implementation

Basic Skip List

Production Failure Scenarios

Common Pitfalls / Anti-Patterns

Observability Checklist

Core Metrics

Health Signals

Alerting Thresholds

Distributed Tracing

Trade-off Analysis

Security Notes

Quick Recap Checklist

Interview Questions

Further Reading

Conclusion

Category

Tags

Related Posts

AVL Trees: Self-Balancing Binary Search Trees

Bellman-Ford Algorithm: Shortest Paths with Negative Weights

Dijkstra's Algorithm: Finding Shortest Paths in Weighted Graphs