HashMap in Java

Understand HashMap: key-value storage with O(1) average lookups, hash collisions, rehashing, and performance characteristics.

published: May 23, 2026 reading time: 14 min read author: GeekWorkBench

HashMap in Java

HashMap is Java’s primary key-value mapping implementation. It stores entries in an internal array of buckets, using each key’s hashCode() to determine the bucket and equals() to distinguish between keys in the same bucket. This combination provides O(1) average-case put and get operations.

Introduction

HashMap is the go-to data structure when you need to store and retrieve values by a key — fast. Under the hood, it turns each key into a bucket index using hashCode(), then uses equals() to tell apart the different keys that land in the same bucket. Done correctly, this gives you O(1) average-case insertion and lookup, regardless of how many entries you store. Done wrong — a broken hashCode() or a mutable key — and you get O(n) performance that degrades silently as the map grows.

The critical trap with HashMap is treating it as a simple dictionary when it is actually a carefully tuned data structure with real constraints. Keys must not change their hashCode() after insertion (or the entry becomes permanently unfindable). The map is not thread-safe — concurrent modification from multiple threads can corrupt internal state or throw ConcurrentModificationException. And because the backing array resizes when the load factor threshold is crossed, a burst of insertions can trigger an O(n) rehashing operation at the worst possible moment.

This post covers HashMap internals — bucket arrays, collision handling via chaining and treeification, the hash() spreading function — along with the failure scenarios that cause real production incidents: hash flooding attacks, unbounded growth, and the subtle difference between get() returning null because the key is absent versus because the value was null.

When to Use HashMap

Use HashMap when:

You need to store and retrieve values by a unique key
O(1) lookup performance is critical
Keys are not ordered (if ordering matters, use TreeMap or LinkedHashMap)
You need a mutable map; for immutable snapshots, consider Map.of() or Collections.unmodifiableMap()

Do not use HashMap when:

You need sorted or insertion-order iteration (use TreeMap or LinkedHashMap)
You need thread-safe operations (use ConcurrentHashMap)
Keys may be null in a concurrent context (single null key is allowed but not thread-safe)
You need primitive-key maps without boxing (use Trove or other specialized libraries)

Internal Structure

HashMap uses an array of Node<K,V> buckets. Each Node stores the key, value, hash, and a pointer to the next node in the bucket (for collision handling via chaining):

transient Node<K,V>[] table;
transient int size;

// When a bucket has > 8 entries, it converts to a TreeNode (Red-Black tree)
static final int TREEIFY_THRESHOLD = 8;

Mermaid Diagram: HashMap Structure

graph TD
    A["HashMap<K,V>"] --> B["table: Node[16]"]
    B --> C["bucket[0] → Node<K,V> → null"]
    B --> D["bucket[1] → Node → Node → null"]
    B --> E["bucket[2] null"]
    B --> F["bucket[3] → TreeNode (8+ entries)"]
    D --> D1["hash: 17<br/>key<br/>value<br/>next: Node"]
    D1 --> D2["hash: 17<br/>key<br/>value<br/>next: null"]
    F --> F1["TreeNode<br/>Red-Black tree"]

Hash Collisions

When two different keys produce the same hash code (or their hashes map to the same bucket index), a collision occurs. HashMap handles this via chaining — multiple nodes in the same bucket form a linked list (or tree if the chain gets long).

// Simplified collision handling
if (node.next == null) {
    node.next = new Node(hash, key, value, null);
} else {
    // Treeify if chain exceeds TREEIFY_THRESHOLD
}

Failure Scenarios

Scenario	Cause	Result
`NullPointerException`	`null` key and key-based operations	Runtime crash
`ConcurrentModificationException`	Modifying map while iterating	Runtime crash
Performance degradation	Many hash collisions (poor `hashCode`)	O(n) instead of O(1) per operation
`OutOfMemoryError`	Unbounded growth with collisions	HashMap never shrinks

Trade-Off Table

Aspect	HashMap	TreeMap	LinkedHashMap
Ordering	None	Sorted by key (Red-Black)	Insertion or access order
Get/Put complexity	O(1) avg, O(n) worst	O(log n)	O(1) avg
`null` key allowed	Yes (one)	No	Yes (one)
Thread safety	None	None	None
Memory overhead	Lower	Higher (tree structure)	Higher (linked list)

Code Snippets

Basic Operations

Map<String, Integer> ages = new HashMap<>();
ages.put("Alice", 30);
ages.put("Bob", 25);
System.out.println(ages.get("Alice"));    // 30
System.out.println(ages.getOrDefault("Eve", 0));  // 0
ages.remove("Bob");
System.out.println(ages.containsKey("Bob")); // false
System.out.println(ages.size());            // 1

Custom Key with Proper HashCode

class Employee {
    private final int id;
    private final String name;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Employee that = (Employee) o;
        return id == that.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);  // Use only immutable field(s)
    }
}

Iterating

for (Map.Entry<String, Integer> entry : ages.entrySet()) {
    System.out.println(entry.getKey() + " -> " + entry.getValue());
}

// KeySet iteration
for (String key : ages.keySet()) { }

// Value iteration
for (Integer val : ages.values()) { }

Observability Checklist

Monitor hashCode() distribution — poor hash functions cause bucket clustering
Track collision counts via Map.Entry traversal on keys with similar hashes
Log size() over time to detect unbounded map growth
Profile for ConcurrentModificationException in high-write scenarios
Monitor TREEIFY_THRESHOLD conversions — tree nodes are slower than linked-list nodes

Security Notes

HashMap is not thread-safe; use ConcurrentHashMap for concurrent access
When using user-supplied keys, a malicious hashCode() implementation could trigger DoS attacks (hash flooding)
Avoid serializing maps containing sensitive data; use encrypted serialization or secure alternatives
Map.of() and Collections.unmodifiableMap() create immutable views that prevent accidental mutation

Common Pitfalls / Anti-Patterns

Mutable keys: If a key’s hashCode() changes after insertion, the entry becomes unretrievable — the map cannot find it
Poor hashCode(): Returning a constant (e.g., return 42) from hashCode() degrades all operations to O(n)
Confusing put() return value: put() returns the previous value for the key, or null if none existed — this is indistinguishable from an actual null value; use putIfAbsent() when distinction matters
null key: Only one null key is allowed; subsequent put(null, value) overwrites the first
Ignoring rehashing: Resizing doubles the bucket count and rehashes every entry — O(n) operation; avoid by estimating capacity upfront

Quick Recap

HashMap provides O(1) average-case put/get by hashing keys to bucket indices
Collisions are handled via chaining (linked list or tree for long chains)
Keys must have stable hashCode() and correct equals() implementations
null key and null values are supported (single null key)
Thread-unsafe; use ConcurrentHashMap or Collections.synchronizedMap() for thread safety

Interview Questions

::: info The following .qa-card components contain typical interview questions you may encounter. Reviewing these will help reinforce key concepts. :::

1. What is a hash collision and how does HashMap handle it?

Model Answer: "A hash collision occurs when two different keys produce the same hash code or map to the same bucket index. `HashMap` handles this via **chaining** — multiple entries in the same bucket form a linked list. When the chain length exceeds 8, it converts to a **Red-Black tree** for O(log n) worst-case lookups instead of O(n)."

2. What determines the bucket index for a key?

Model Answer: "The bucket index is computed as `hash(key) & (capacity - 1)`. The key's `hashCode()` is passed through a mixing function (`hash()` in the source) to distribute bits, then bitwise AND with `capacity - 1` limits the result to a valid bucket index. This is equivalent to `hash % capacity` but faster."

3. What happens when HashMap needs to resize?

Model Answer: "When the number of entries exceeds `capacity * loadFactor` (default: 16 \* 0.75 = 12), `HashMap` **doubles** the bucket array and **rehashes** every entry into the new array. This is an O(n) operation. You can avoid this cost by setting initial capacity upfront: `new HashMap<>(initialCapacity)`."

4. Why should custom key objects override both `hashCode()` and `equals()`?

Model Answer: "`HashMap` uses `hashCode()` to find the bucket and `equals()` to identify the correct entry within that bucket. If you override only one, the contract breaks: two objects that are `equals()` must have the same `hashCode()`, but objects with the same `hashCode()` do not need to be `equals()`. Overriding both ensures consistent behavior. If only `equals()` is overridden, the object may be stored in the wrong bucket; if only `hashCode()` is overridden, it may not be found even when it exists."

5. What is the difference between `HashMap` and `ConcurrentHashMap`?

Model Answer: "`ConcurrentHashMap` is **thread-safe** and supports concurrent reads and writes without locking the entire map. It divides the map into segments (or in Java 8+, uses fine-grained bucket locking). `HashMap` is unsynchronized — concurrent modification from multiple threads can cause corruption, infinite loops, or `ConcurrentModificationException`."

6. What is the purpose of the `hash()` method in HashMap?

Model Answer: "The `hash()` method (also called "spread" or "shuffle") recomputes the hash code to reduce correlation between bits. It applies `hashCode() ^ (hashCode() >>> 16)` — XORing the upper 16 bits with the lower 16 bits. This ensures that bit patterns from poor hash functions (like sequential integers) are better distributed across bucket indices."

7. When does a HashMap bucket convert from linked list to a tree?

Model Answer: "When a single bucket accumulates **8 or more entries** (`TREEIFY_THRESHOLD = 8`), it converts from a linked list to a **Red-Black tree** for O(log n) worst-case lookups instead of O(n). It converts back to a list when the bucket shrinks below **6** (`UNTREEIFY_THRESHOLD = 6`). The intermediate values (6 and 7) prevent oscillation between structures."

8. What is the difference between `get()` and `getOrDefault()`?

Model Answer: "`get(key)` returns the value if the key exists, or `null` if the key is absent or maps to `null`. `getOrDefault(key, defaultValue)` returns the value if present, or the provided `defaultValue` if absent. Use `getOrDefault()` when `null` is a valid stored value and you need to distinguish "not found" from "value is null"."

9. How does HashMap handle hash flooding attacks?

Model Answer: "Hash flooding exploits poor hash functions by feeding inputs that all hash to the same bucket, degrading performance to O(n). Java 8+ mitigates this with treeification (converting long chains to Red-Black trees). Additionally, `Map.of()`, `Set.of()`, and `Collections` utility methods use randomized hashing to resist intentional attacks."

10. Why is it recommended to use immutable types as HashMap keys?

Model Answer: "If a key's fields used in `hashCode()` or `equals()` change after insertion, the key's hash code changes, making it effectively **unretrievable** — it stays in its original bucket but the map cannot find it. Using immutable types (String, Integer, etc.) or treating custom keys as immutable ensures hash code stability throughout the key's lifetime in the map."

11. What is the difference between `putIfAbsent()` and `put()`?

Model Answer: "`put()` always overwrites the existing value and returns the previous value (or null if none). `putIfAbsent()` only inserts if the key is **not already present** — it returns the existing value if found (including when the existing value is `null`), and returns `null` only when the key was actually added."

12. What is the difference between `HashMap` and `Hashtable`?

Model Answer: "`Hashtable` is the legacy synchronized implementation (Java 1.0). It does not allow `null` keys or values, and all methods are synchronized. `HashMap` is newer, unsynchronized, and allows one `null` key and multiple `null` values. For concurrent access, use `ConcurrentHashMap` instead of `Hashtable`."

13. How does HashMap determine when to resize its bucket array?

Model Answer: "When the number of entries exceeds `capacity * loadFactor` (default threshold: 16 * 0.75 = 12), `HashMap` doubles the bucket array and rehashes every entry. This is an O(n) operation done infrequently but can cause latency spikes under heavy write loads."

14. What is the purpose of the `hash()` method in HashMap?

15. What is the relationship between HashMap's initial capacity and load factor?

Model Answer: "Initial capacity sets the starting bucket array size (default 16). Load factor (default 0.75) determines when to resize. For known size requirements, set initial capacity to avoid intermediate resizes: `new HashMap<>(expectedSize / 0.75 + 1)`. This prevents O(n) rehashing during growth."

16. Can two different keys have the same hash code in HashMap?

Model Answer: "Yes, different keys can have the same hash code (collision). HashMap uses both `hashCode()` to find the bucket and `equals()` to find the correct entry within the bucket. Two keys with the same hash but different `equals()` results are stored separately in the same bucket as distinct entries."

17. How does HashMap's `computeIfAbsent()` method work?

Model Answer: "`computeIfAbsent(key, mappingFunction)` returns the existing value if present. If absent, it computes a new value using the mapping function, stores it, and returns it. This is atomic and useful for lazy initialization and cache patterns. The mapping function is invoked at most once per key."

18. What is a HashMap collision and how does it affect performance?

Model Answer: "A collision occurs when two different keys hash to the same bucket. HashMap handles this via chaining (linked list or tree). With many collisions in one bucket, lookups degrade from O(1) to O(n) for linked lists or O(log n) for trees. Java 8+ converts chains of 8+ nodes to a Red-Black tree for better worst-case performance."

19. Why does HashMap use separate chaining and tree structures for collision handling?

Model Answer: "Linked lists are efficient for few collisions but degrade to O(n) with many. When a bucket has 8+ entries (TREEIFY_THRESHOLD), Java converts to a Red-Black tree for O(log n) worst-case. This hybrid approach balances memory overhead (trees need more nodes) against lookup performance under heavy collision scenarios."

20. What is the difference between `clear()` and `remove()` in HashMap?

Model Answer: "`clear()` removes all entries (sets size to 0) but keeps the bucket array. `remove(key)` removes a single entry by key and decrements size. After `clear()`, the capacity remains the same; after many `remove()` calls, capacity is unchanged (HashMap never shrinks automatically)."

Conclusion

HashMap is the go-to key-value store in Java, delivering O(1) average-case get and put through hash-based bucketing. The critical contract is that keys provide stable hashCode() and correct equals() implementations — violating either degrades performance unpredictably.

HashMap works best when keys are simple, immutable types (String, Integer, etc.). When using custom objects as keys, treat hashCode() and equals() as a permanent commitment tied to the object’s identity. HashMap is unordered; if insertion order or sorted-key iteration matters, look at TreeMap or LinkedHashMap.

HashSet is the set equivalent of HashMap — it uses the same internal mechanics with a dummy value. For unique-element collections with O(1) lookups, HashSet is the direct counterpart to explore after HashMap.

HashMap in Java

Introduction

When to Use HashMap

Internal Structure

Mermaid Diagram: HashMap Structure

Hash Collisions

Failure Scenarios

Trade-Off Table

Code Snippets

Basic Operations

Custom Key with Proper HashCode

Iterating

Observability Checklist

Security Notes

Common Pitfalls / Anti-Patterns

Quick Recap

Interview Questions

Further Reading

Conclusion

Category

Tags

Related Posts

Abstract Classes in Java

Arithmetic Operators in Java

Array Basics in Java