Heaps: Min-Heaps, Max-Heaps, and Priority Queues
Master heap data structure, heap operations, priority queue implementation, and heap sort with binary and Fibonacci heap variants.
Heaps: Min-Heaps, Max-Heaps, and Priority Queues
Introduction
A heap is a complete binary tree where every parent node satisfies the heap property: either every parent is less than or equal to its children (min-heap) or every parent is greater than or equal to its children (max-heap). This constraint gives O(1) access to the minimum/maximum element while maintaining O(log n) insertion and deletion.
Heaps are the backbone of priority queues, Dijkstra’s algorithm, heap sort, and many other algorithms. The tree is stored as an array, using index arithmetic to navigate parent-child relationships.
Heap Implementation
class MinHeap:
"""
Min-heap implementation using array representation.
Parent at index i → children at 2i+1 and 2i+2
Child at index i → parent at (i-1) // 2
"""
def __init__(self):
self.heap = []
def _parent(self, i):
return (i - 1) // 2
def _left_child(self, i):
return 2 * i + 1
def _right_child(self, i):
return 2 * i + 2
def _swap(self, i, j):
self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
def _sift_up(self, i):
"""Move element up to maintain heap property."""
while i > 0 and self.heap[self._parent(i)] > self.heap[i]:
self._swap(i, self._parent(i))
i = self._parent(i)
def _sift_down(self, i):
"""Move element down to maintain heap property."""
n = len(self.heap)
while self._left_child(i) < n:
smallest = self._left_child(i)
if (self._right_child(i) < n and
self.heap[self._right_child(i)] < self.heap[smallest]):
smallest = self._right_child(i)
if self.heap[i] <= self.heap[smallest]:
break
self._swap(i, smallest)
i = smallest
def push(self, val):
"""O(log n) insert."""
self.heap.append(val)
self._sift_up(len(self.heap) - 1)
def pop(self):
"""O(log n) extract minimum."""
if not self.heap:
raise IndexError("Heap is empty")
min_val = self.heap[0]
last = self.heap.pop()
if self.heap:
self.heap[0] = last
self._sift_down(0)
return min_val
def peek(self):
"""O(1) view minimum."""
if not self.heap:
raise IndexError("Heap is empty")
return self.heap[0]
def __len__(self):
return len(self.heap)
## Heap Structure and Array Mapping
```mermaid
graph TD
subgraph Tree["Heap as Tree"]
direction TB
R["[0] 10"] --> L["[1] 20"]
R --> M["[2] 30"]
L --> LL["[3] 40"]
L --> LM["[4] 50"]
M --> ML["[5] 60"]
M --> MR["[6] 70"]
end
subgraph Array["Array: [10, 20, 30, 40, 50, 60, 70]"]
direction LR
I0["[0]"] --> I1["[1]"] --> I2["[2]"] --> I3["[3]"] --> I4["[4]"] --> I5["[5]"] --> I6["[6]"]
end
Parent at index i has children at 2i+1 and 2i+2. Child at index i has parent at (i-1) // 2. This array representation makes heaps cache-friendly compared to pointer-based trees.
Min-Heap vs Max-Heap Operations
graph LR
subgraph MinHeap["Min-Heap (parent ≤ children)"]
direction TB
MRoot["[0] 10 (min)"]
MRoot --> MLeft["[1] 20"]
MRoot --> MRight["[2] 30"]
MLeft --> MLL["[3] 40"]
MLeft --> MLM["[4] 50"]
MOp1["push(5)"]
MOp2["sifts up to root"]
MOp3["pop() → 5 (minimum)"]
end
subgraph MaxHeap["Max-Heap (parent ≥ children)"]
direction TB
XRoot["[0] 70 (max)"]
XRoot --> XLeft["[1] 50"]
XRoot --> XRight["[2] 40"]
XLeft --> XLL["[3] 20"]
XLeft --> XLM["[4] 10"]
XOp1["push(80)"]
XOp2["sifts up to root"]
XOp3["pop() → 80 (maximum)"]
end
class MaxHeap:
"""Max-heap: parent is always greater than or equal to children."""
def __init__(self):
self.heap = []
def _parent(self, i):
return (i - 1) // 2
def _left_child(self, i):
return 2 * i + 1
def _right_child(self, i):
return 2 * i + 2
def _swap(self, i, j):
self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
def _sift_up(self, i):
while i > 0 and self.heap[self._parent(i)] < self.heap[i]:
self._swap(i, self._parent(i))
i = self._parent(i)
def _sift_down(self, i):
n = len(self.heap)
while self._left_child(i) < n:
largest = self._left_child(i)
if (self._right_child(i) < n and
self.heap[self._right_child(i)] > self.heap[largest]):
largest = self._right_child(i)
if self.heap[i] >= self.heap[largest]:
break
self._swap(i, largest)
i = largest
def push(self, val):
self.heap.append(val)
self._sift_up(len(self.heap) - 1)
def pop(self):
if not self.heap:
raise IndexError("Heap is empty")
max_val = self.heap[0]
last = self.heap.pop()
if self.heap:
self.heap[0] = last
self._sift_down(0)
return max_val
def peek(self):
if not self.heap:
raise IndexError("Heap is empty")
return self.heap[0]
Priority Queue Using Heap
import heapq
from dataclasses import dataclass
from typing import Any
@dataclass
class PriorityItem:
priority: int
value: Any
class PriorityQueue:
"""
Priority queue using min-heap.
Lower priority number = higher priority (popped first).
"""
def __init__(self):
self._heap = []
self._counter = 0 # Tiebreaker for same priority
def enqueue(self, item, priority=0):
"""O(log n) insert."""
heapq.heappush(self._heap, (priority, self._counter, item))
self._counter += 1
def dequeue(self):
"""O(log n) extract highest priority."""
if not self._heap:
raise IndexError("Priority queue is empty")
return heapq.heappop(self._heap)[2]
def peek(self):
"""O(1) view highest priority."""
if not self._heap:
raise IndexError("Priority queue is empty")
return self._heap[0][2]
def __len__(self):
return len(self._heap)
# For max priority (larger number = higher priority), negate priority:
# enqueue(item, priority=-priority)
# dequeue returns item with highest original priority
Heap Sort
def heap_sort(arr):
"""
Heap sort: O(n log n) worst case, O(1) space (in-place).
Not stable but efficient.
"""
n = len(arr)
# Build max heap (rearrange array)
def sift_down(start, end):
while True:
parent = start
left = 2 * parent + 1
if left > end:
break
# Find larger child
right = left + 1
if right <= end and arr[right] > arr[left]:
left = right
# If child > parent, swap and continue
if arr[left] > arr[parent]:
arr[left], arr[parent] = arr[parent], arr[left]
parent = left
else:
break
# Build max heap: sift down from last non-leaf node to root
for start in range(n // 2 - 1, -1, -1):
sift_down(start, n - 1)
# Extract elements from heap: swap root with end, restore heap
for end in range(n - 1, 0, -1):
arr[0], arr[end] = arr[end], arr[0] # Move max to end
sift_down(0, end - 1) # Restore heap on remaining
return arr
Common Heap Problems
# Kth largest element in stream
class KthLargest:
def __init__(self, k, nums):
self.k = k
self.heap = []
for num in nums:
self.add(num)
def add(self, val):
heapq.heappush(self.heap, val)
if len(self.heap) > self.k:
heapq.heappop(self.heap)
return self.heap[0]
# Merge k sorted arrays
import heapq
def merge_sorted_arrays(sorted_arrays):
result = []
heap = []
# Initialize with first element from each array
for i, arr in enumerate(sorted_arrays):
if arr:
heapq.heappush(heap, (arr[0], i, 0))
while heap:
val, arr_idx, elem_idx = heapq.heappop(heap)
result.append(val)
# Add next from same array
if elem_idx + 1 < len(sorted_arrays[arr_idx]):
next_val = sorted_arrays[arr_idx][elem_idx + 1]
heapq.heappush(heap, (next_val, arr_idx, elem_idx + 1))
return result
# Median finder using two heaps
class MedianFinder:
"""
O(log n) add, O(1) find median.
max_heap stores smaller half, min_heap stores larger half.
"""
def __init__(self):
self.small = [] # Max heap (inverted)
self.large = [] # Min heap
def add_num(self, num):
# Max heap: negate for max-heap behavior
heapq.heappush(self.small, -num)
# Balance: move largest from small to large
heapq.heappush(self.large, -heapq.heappop(self.small))
# Ensure small has >= large elements
if len(self.large) > len(self.small):
heapq.heappush(self.small, -heapq.heappop(self.large))
def find_median(self):
if len(self.small) > len(self.large):
return -self.small[0]
return (-self.small[0] + self.large[0]) / 2.0
When to Use Heaps
| Use Case | Heap Type | Why |
|---|---|---|
| Priority queue | Min or Max | O(1) access to extreme |
| Dijkstra’s algorithm | Min | Always process shortest distance first |
| Top-k elements | Min of size k | Only keep k smallest/largest |
| Median finder | Two heaps | Balance to get middle elements |
| Merge sorted streams | Min | Always get globally minimum |
Heap vs Binary Search Tree
| Aspect | Heap | BST |
|---|---|---|
| Insert | O(log n) | O(log n) |
| Delete min/max | O(log n) | O(log n) |
| Search arbitrary | O(n) | O(log n) |
| Find min/max | O(1) | O(log n)* |
| Ordered traversal | No | Yes |
| Balanced required | No** | Yes |
*BST if balanced **Height can degrade to O(n) if inserted in sorted order
Production Failure Scenarios and Mitigations
Scenario 1: Priority Inversion in Task Scheduling
Priority inversion is one of those bugs that looks impossible on paper. A low-priority task holds a lock, a high-priority task needs that lock, and an medium-priority task runs in between—which means the high-priority task sits blocked while the medium task wastes CPU cycles. Classic.
NASA’s Opportunity rover ran into this during wheel control. The system would get stuck, requiring restarts. The fix was priority inheritance: the low-priority task temporarily bumps up to the blocked task’s priority so it finishes and releases the resource. Most real-time schedulers (POSIX, FreeRTOS) support this, but you have to enable it explicitly.
Python’s heapq won’t save you here—it doesn’t handle priority inheritance at all. For production work with dependencies, look at libraries like TaskTiger that track task relationships.
Scenario 2: Heap Corruption Under Concurrency
Two threads pushing to a shared heap without locks. The heap array ends up with duplicates, missing elements, or a violated parent-child relationship. The worst part is the corruption is often silent—you don’t get an exception, you just get wrong answers later.
I once debugged a payment service that crashed randomly under load. Turns out the priority queue was shared across threads without synchronization. Once we added a mutex wrapper, the crashes stopped.
import threading
import heapq
class ThreadSafeHeap:
def __init__(self):
self._heap = []
self._lock = threading.Lock()
def push(self, item):
with self._lock:
heapq.heappush(self._heap, item)
def pop(self):
with self._lock:
return heapq.heappop(self._heap)
Python’s queue.PriorityQueue handles locking internally. Rust’s binary_heap needs an explicit mutex if you’re sharing across threads.
Scenario 3: Unbounded Queue Growth
Producers outpacing consumers is a classic scaling problem. The queue keeps growing, memory fills up, and eventually the system starts swapping or dies.
A distributed indexing service I heard about had this happen during a traffic spike. The priority queue ballooned to 40GB before things went sideways. The fix: bounded queues with backpressure. Either reject new items, block producers, or drop low-priority items when at capacity.
def enqueue_bounded(self, item, priority, max_size=10000):
with self._lock:
if len(self._heap) >= max_size:
# If the new item can't beat the worst item, reject it
if priority < self._heap[0]:
return False
heapq.heappop(self._heap) # Make room
heapq.heappush(self._heap, (priority, self._counter, item))
self._counter += 1
return True
Scenario 4: Fibonacci Heap Lazy Deletion Problem
Fibonacci heaps look fantastic in algorithm textbooks. O(1) amortized insert! What could go wrong?
The “lazy deletion” part. Deleted nodes stick around marked as “removed” but not physically removed from the structure. Under heavy delete-min workload, you end up with a heap full of dead weight. Performance stops being O(1) amortized and starts looking more like O(n).
NetworkX, the Python graph library, actually falls back to binary heap for certain operations precisely because of this. Graph algorithms tend to delete a lot.
The fix: track deleted-node count and trigger compaction when the ratio gets too high. Or just use binary heap and save yourself the debugging headaches.
Trade-off Analysis
| Heap Type | Push | Pop | Peek | Memory | Best for |
|---|---|---|---|---|---|
| Binary Heap | O(log n) | O(log n) | O(1) | O(n) | Most everyday priority queues |
| Fibonacci Heap | O(1) amort. | O(log n)* | O(1) | O(n) | Graph algos with decrease-key |
| Binomial Heap | O(log n) | O(log n) | O(1) | O(n) | When you need mergeable queues |
| d-ary Heap | O(log_d n) | O(d log_d n) | O(1) | O(n) | Cache-sensitive workloads |
| Pairing Heap | O(1) | O(log n)** | O(1) | O(n) | Practical when Fibonacci feels slow |
| Skew Heap | O(1) amort. | O(log n)* | O(1) | O(n) | Simple code, decent bounds |
*Amortized bound **Pairing heaps aren’t formally bounded but perform well in practice
For most use cases, binary heap is the answer. Fibonacci heaps have real constant-factor overhead that eats the theoretical advantage. d-ary heaps (d=4 or d=8) are worth knowing about when cache locality matters.
Observability Checklist
- Track heap size and alert on unexpected growth
- Monitor push/pop latency—spikes point to contention or heap degradation
- Watch memory usage, especially with millions of items queued
- Log out-of-memory rejections on bounded queues
- Track priority inversion wait times if your scheduler supports it
- Measure cache miss rate—high L3 misses suggest poor heap locality
- Monitor lock contention on thread-safe heaps
- Alert when queue depth approaches configured limits
- Log dropped low-priority items—means producers are overwhelming consumers
Security Notes
Heap overflows in C/C++: The array underlying a heap can overflow if you’re not careful, corrupting adjacent memory. Use safe string functions and consider a memory-safe language if starting fresh.
Priority manipulation in multi-tenant systems: A bad actor might flood high-priority items to starve everyone else. Rate-limit enqueues per user and cap priority values.
Timing attacks: Heap operation timing varies with tree structure, which can leak information about data. If you’re in a security-sensitive context, use constant-time comparisons.
DoS via heap growth: An attacker who can trigger unbounded push can exhaust memory. Always enforce size limits on external input.
Observability data: Heap dumps and profiler output sometimes contain sensitive queued data. Sanitize before exporting.
Quick Recap Checklist
- Heap is complete binary tree with heap property
- Min-heap: parent ≤ children, O(1) peek at minimum
- Max-heap: parent ≥ children, O(1) peek at maximum
- Stored as array: parent at (i-1)//2, children at 2i+1 and 2i+2
- Push/pop: O(log n) via sift up/down
- Heap sort: O(n log n), in-place, not stable
Interview Questions
Complete binary tree has all levels filled except possibly the last, which fills left to right. This guarantees the tree's height is ⌊log₂(n)⌋—approximately log n. It also enables compact array storage with no gaps. There's no need for pointers; parent-child relationships are computed via index arithmetic. This makes heap very cache-friendly compared to pointer-based trees.
Min-heap maintains parent ≤ children, so the minimum is always at the root (O(1) peek). Max-heap maintains parent ≥ children, so the maximum is at the root. The implementations are identical—just flip the comparison operator. Heaps don't support arbitrary access by value; they're optimized for extreme extraction, not search.
Use two heaps: a max-heap for the smaller half and a min-heap for the larger half. Invariant: max-heap has all elements ≤ min-heap's elements, and sizes differ by at most 1. When adding: push to max-heap, move max of max-heap to min-heap, rebalance if needed. Median is either the top of max-heap (odd total) or average of both tops (even total). Both operations are O(log n).
O(n log n) worst case, best case, and average case. Building the heap is O(n), then we perform n extract-min/max operations, each O(log n). Unlike quicksort, heap sort doesn't have a bad O(n²) worst case. However, it's not stable (equal elements may change relative order) and has worse cache performance than quicksort. In practice, quicksort's smaller constants often make it faster, despite the theoretical worst case.
Building a heap from an unsorted array using Floyd's method (bottom-up sift-down from the last non-leaf node to the root) runs in O(n) time. Intuitively: there are n/2 nodes at the bottom level that need 0 swaps, n/4 at the next level that need at most 1 swap, and so on. The sum of these operations converges to O(n). This is significantly better than the O(n log n) cost of n repeated insertions.
Dijkstra's algorithm uses a min-heap (priority queue) to always process the unvisited vertex with the smallest known distance. Initially, the source has distance 0 and all other vertices have distance infinity. The algorithm repeatedly extracts the minimum-distance vertex, relaxes its outgoing edges (updating neighbors' distances), and pushes updated distances into the heap. With a binary heap, the overall complexity is O((V + E) log V). Lazy deletion (ignoring stale entries) is a common optimization to avoid implementing decrease-key.
Heaps power a wide range of systems and algorithms:
- Priority queues in operating system task schedulers and networking (packet queuing)
- Dijkstra's algorithm and Prim's MST algorithm in graph theory
- Heap sort — guaranteed O(n log n) in-place sort
- Top-k / streaming statistics — maintaining k largest elements with a min-heap of size k
- Median maintenance — two-heap technique for streaming median computation
- Event-driven simulation — scheduling future events by timestamp
Decrease-key reduces the priority of an existing element and restores the heap property via sift-up. To implement it efficiently, maintain a map from values to their current indices in the heap array. When a key decreases, look up the index, update the value, and sift up. This operation is O(log n). Decrease-key is critical in Dijkstra's algorithm and Prim's algorithm, where it avoids pushing duplicate entries. In practice, many implementations skip decrease-key and use lazy deletion (pushing new entries and ignoring stale ones when popped), accepting the space overhead for simpler code.
A d-ary heap is a generalization of the binary heap where each node has d children instead of 2. The tree height is reduced to logd n, making push operations faster (fewer levels to sift up). However, pop operations become slower because sift-down must compare all d children at each level (O(d logd n)). d-ary heaps are most useful for cache-sensitive workloads where fewer pointer jumps improve locality. The sweet spot is typically d = 4 or d = 8, offering a good balance between push and pop performance.
When two items share the same priority, the tie-breaking behavior depends on the implementation. The simplest and most robust approach is to include a monotonically increasing counter as a secondary key:
- Store tuples of
(priority, counter, item)in the heap - Items with the same priority are ordered by insertion order (FIFO)
- This also prevents
heapqfrom comparing items directly when priorities are equal
Without a tiebreaker, Python's heapq falls back to comparing the item tuples,
which fails if items are not mutually comparable (e.g., dictionaries or custom objects).
Binary heap: O(log n) for both push and pop, O(n) build time. Simple implementation, excellent cache locality, constant factors are small.
Fibonacci heap: O(1) amortized insert, O(log n) amortized delete-min, and O(1) decrease-key. The theoretical advantages are real but often don't materialize in practice due to:
- High constant factors from bookkeeping (degree tracking, marking)
- Poor cache locality from pointer-chasing across many heap fragments
- Lazy deletion accumulation unless explicitly managed
Use binary heap for most production code. Fibonacci heaps matter most when decrease-key is the bottleneck (e.g., Prim's or Dijkstra's with many edge relaxations).
The approach depends on your heap type and constraints:
- Binary heap: Merging requires moving all elements from one heap into the other, O(n) time. Simply concatenate the arrays and run Floyd's heapify (bottom-up sift-down) in O(n).
- Binomial heap: Named for its structure—merge is O(log n) by combining trees of equal degree. Supports efficient union of mergeable priority queues.
- Fibonacci heap: O(1) merge—just concatenate root lists. This is why it's preferred for algorithms like Prim's that repeatedly merge heaps.
If you frequently merge priority queues, consider binomial or Fibonacci heaps. For occasional merges, binary heap with O(n) merge is often acceptable.
Heap sort is not stable because the heap operations can change the relative order of equal elements. When we swap the root with the last element and sift down, equal elements may end up in different positions relative to each other.
Whether stability matters depends on the use case:
- Sorting primitive values: stability is irrelevant
- Sorting records by multiple keys: stability preserves the first key's ordering when second key is equal
- When stability is needed: use merge sort or Tim sort
Heap sort's lack of stability is rarely the deciding factor. Its O(n log n) worst case and in-place O(1) space are usually the more important characteristics.
OS schedulers use priority queues (often heaps) to manage runnable processes:
- Priority scheduling: Heap always yields the highest-priority runnable process
- CFS (Completely Fair Scheduler) in Linux: Uses a red-black tree (not heap) for fair-time accounting, but heaps appear in real-time scheduling classes
- Real-time systems: Priority inheritance protocols prevent priority inversion when tasks share resources
The heap property ensures O(log n) insertion and O(1) peek for the highest-priority task. Under heavy load, a priority queue backed by a heap can become a bottleneck—some systems use specialized data structures or batching strategies.
Python's heapq only supports min-heap operations directly. Two common approaches
to implement a max-heap:
- Negate all values: Store
-valuein the heap. When you pop, negate again to get the original max value. This is the most common approach. - Custom comparator: Store tuples
(-priority, counter, item)and let heapq handle the ordering
For a priority queue with custom objects, use enqueue(item, priority=-priority)
to treat larger priority values as higher priority. When dequeuing, the item returned is
the one with the most negative priority (i.e., highest original priority).
Yes, there are two main approaches:
- Min-heap of all elements: Build a min-heap from all n elements (O(n)) and pop k times. Time: O(n + k log n). Good when k is small or you need multiple queries.
- Max-heap of size k: Maintain a max-heap of the k smallest elements seen so far. For each new element, if it's smaller than the heap's max, replace the max. At the end, the heap's max is the k-th smallest. Time: O(n log k). Good when k is much smaller than n.
The max-heap of size k approach is optimal for the "top-k smallest" problem when k << n. It's how you find "Kth largest element in a stream" efficiently.
All primary heap operations have O(log n) worst-case complexity:
- Push: Element starts at the bottom and may sift up all the way to the root. Tree height is ⌊log₂(n)⌋, so at most log n swaps.
- Pop: Root is replaced by the last element and must sift down. At each level, we compare with both children and swap with the smaller/larger. Again, at most log n levels.
- Peek: O(1)—the root is always at index 0.
The O(log n) bound is tight because the heap is a complete binary tree with height ⌊log₂(n)⌋, and both sift-up and sift-down traverse the full height in the worst case.
Pairing heap is a simplified Fibonacci heap with:
- Simpler implementation—no marking or degree tracking
- Amortized O(1) insert and O(log n) delete-min (no proven worst-case bounds)
- Excellent practical performance—often outperforms Fibonacci heaps despite worse theoretical bounds
Fibonacci heap has proven O(1) amortized insert and O(log n) amortized delete-min, but the constants are high. In practice, pairing heaps are preferred when you need mergeable heaps with good performance.
For most applications, binary heap's O(log n) worst case with small constants beats both Fibonacci and pairing heaps in practice.
Priority inheritance is a protocol to solve priority inversion:
- A low-priority task holds a lock needed by a high-priority task
- A medium-priority task preempts the low-priority task (which can't run to release the lock)
- The high-priority task is blocked indefinitely while the medium task wastes CPU
The fix: When a high-priority task waits on a lock held by a low-priority task, the low-priority task temporarily inherits the blocked task's priority. It runs at higher priority, finishes faster, releases the lock, and the high-priority task proceeds.
Invented for real-time systems (Mars Pathfinder incident is a famous case). Most real-time OS kernels (POSIX, FreeRTOS, VxWorks) support it. Database systems use similar concepts for lock manager scheduling.
A priority-based task scheduler typically uses:
- Min-heap of tasks: Each task has (priority, scheduled_time, task_id). The heap's root is always the highest-priority, earliest task.
- Polling loop: While running, peek at the heap. If the next task's time has arrived, pop and execute. Otherwise, sleep until that time.
- Cancellation support: Maintain a set of cancelled task IDs. Before executing, check if the task was cancelled.
- Backpressure: Enforce a max heap size to prevent memory exhaustion from a flooded scheduler.
Thread safety: Wrap heap operations with a mutex or use queue.PriorityQueue
which handles locking internally. For high-throughput schedulers, consider lock-free
priority queues (used in kernel schedulers).
Further Reading
Binary Heaps and Priority Queues
- Introduction to Algorithms (CLRS), Chapter 6 — The definitive reference on binary heaps, heap sort, and priority queues. Covers correctness proofs and complexity analysis in depth.
- Algorithms (Sedgewick & Wayne), Chapter 2.4 — Practical coverage of priority queues with Java implementations, including indexed heaps used in Dijkstra’s algorithm.
Advanced Heap Variants
- Fibonacci Heaps — Fredman & Tarjan’s original paper (1987) on Fibonacci heaps with O(1) amortized decrease-key. Essential reading for understanding amortized analysis and advanced data structures.
- Binomial Heaps — CLRS Chapter 19 covers binomial heaps, a mergeable heap variant supporting O(log n) union operations.
- Pairing Heaps — Fredman et al. (1986). Simpler than Fibonacci heaps with excellent practical performance. Used in some implementations of Prim’s and Dijkstra’s algorithms.
- d-ary Heaps — Johnson (1975). Generalization where each node has d children. Higher d yields shorter trees but more expensive sift-down. Sweet spot is often d=4 or d=8 for cache-conscious workloads.
Applications Deep Dives
- Dijkstra’s Algorithm with Priority Queue — The classic shortest-path algorithm uses a min-heap to always process the closest unvisited node. With a binary heap, complexity is O((V+E) log V). With a Fibonacci heap (supporting O(1) decrease-key), it improves to O(V log V + E), though high constants often make binary heap preferable in practice.
- Heap Sort vs. Quick Sort — Both are O(n log n) on average, but heap sort is not stable and has worse cache behavior. Quick sort’s divide-and-conquer pattern leverages cache better for most inputs. Heap sort’s key advantage is its guaranteed O(n log n) worst case with O(1) extra space.
- Task Scheduling with Priority Queues — Operating systems use priority queues extensively for process scheduling. Real-time systems add priority inheritance to avoid inversion. See
SCHED_FIFOandSCHED_RRin the Linux kernel scheduler for real-world examples.
Conclusion
Heaps give you O(1) access to the min or max with O(log n) inserts and deletes. That’s why they’re the natural choice for priority queues, Dijkstra’s, and heap sort. The key insight is that a heap is just an array pretending to be a tree—parent-child navigation is pure index math, no pointers needed. For Python priority queues, heapq with a tiebreaker counter handles same-priority ordering cleanly. If you’re building systems that need top-k tracking, medians, or sorted stream merging, heaps are usually the right call.
Category
Related Posts
Arrays vs Linked Lists: Understanding the Trade-offs
Compare arrays and linked lists in terms of access time, insertion/deletion efficiency, memory usage, and cache performance.
Arrays: 1D, 2D, and Multi-dimensional Mastery
Master array operations, traversal, search, common patterns like two-pointer and sliding window, and when to use multi-dimensional arrays.
AVL Trees: Self-Balancing Binary Search Trees
Master AVL tree rotations, balance factors, and rebalancing logic. Learn when to use AVL vs Red-Black trees for your use case.