Heap Dump Analysis: Finding Memory Leaks with MAT, VisualVM

Complete guide to JVM heap dump analysis using Eclipse MAT, VisualVM, and YourKit for identifying memory leaks and optimization opportunities.

published: reading time: 22 min read author: GeekWorkBench

Heap Dump Analysis: Detecting Memory Leaks with MAT, VisualVM, and YourKit

Heap dumps capture a snapshot of everything in your JVM memory at a specific moment. They are the definitive source for understanding memory issues—memory leaks, excessive allocation, classloader problems, and unexpected object retention. Without heap dump analysis, you are guessing about memory problems. With it, you have concrete evidence.

This guide covers three essential tools for heap dump analysis. Eclipse Memory Analyzer Tool (MAT) is free, powerful, and the first tool to reach for. VisualVM is free and handles quick checks well; it was bundled with Oracle JDK 6 through 8 and has been a separate download since JDK 9. YourKit is commercial but offers superior usability and advanced features.

Introduction

Heap dumps capture a frozen moment of everything in your JVM’s memory—all live objects, their references, classloaders, and the full application state at the moment of capture. They are the definitive source of truth when diagnosing memory issues that allocation profilers cannot explain, when OutOfMemoryError occurs without a clear cause, or when memory usage grows over time and you need to find what is holding references that should have been released. Without heap dump analysis, you are guessing about memory problems. With it, you have concrete evidence.

Three tools dominate heap dump analysis: Eclipse Memory Analyzer Tool (MAT), VisualVM, and YourKit. MAT is free, powerful, and the first tool to reach for when investigating production incidents. It uses dominator trees and leak suspect reports to identify which objects retain the most memory and why they cannot be collected. VisualVM is a free standalone download (it was bundled with Oracle JDK 6 through 8 as jvisualvm) and handles quick checks well. YourKit is commercial but excels at allocation site analysis, showing you not just what objects exist but where they were created. Each tool answers different questions about the same memory state.

This guide covers how to capture heap dumps reliably (including when the JVM is under extreme memory pressure), how to use each tool effectively, and how to trace retention chains all the way to GC roots. The failure scenarios illustrate how static caches, classloader leaks, and connection pool exhaustion appear in dumps, and what patterns to look for when the obvious suspect turns out to be a symptom rather than the root cause.

When to Use Heap Dump Analysis / When Not to Use

Heap dumps are indispensable when you need to understand why memory is not being released, when OutOfMemoryError occurs, when memory usage grows over time without explanation, and when you need to find what is holding references to objects that should be garbage collected.

Avoid heap dumps when the JVM is at extreme memory pressure—taking a dump can cause an OutOfMemoryError or hung process. For performance issues that are not memory-related, heap dumps waste time. And if you need real-time allocation monitoring, use async-profiler allocation profiling instead.

Capturing heap dumps in production requires careful planning. The dump operation is stop-the-world: application threads pause while the heap is written, and the pause grows with heap size. Enabling -XX:+HeapDumpOnOutOfMemoryError adds no overhead during normal operation; the cost of the dump is paid only when the error occurs.

Heap Dump Architecture

```mermaid
graph TD
    subgraph "JVM Heap"
        subgraph "Object Instances"
            O1[MyService<br/>1 instance<br/>256MB retained]
            O2[Cache<br/>1 instance<br/>128MB retained]
            O3[Connection Pool<br/>1 instance<br/>64MB retained]
        end
        subgraph "Reference Chains"
            O1 --> O2
            O2 --> O3
        end
    end

    subgraph "Heap Dump Tools"
        MAT[Eclipse MAT<br/>Dominator Tree<br/>Leak Suspects]
        VV[VisualVM<br/>Heap Summary<br/>OQL Queries]
        YK[YourKit<br/>Allocation Sites<br/>Retained Sets]
    end

    O1 -->|"Analyze"| MAT
    O2 -->|"Analyze"| VV
    O3 -->|"Analyze"| YK
```

Each tool approaches heap dump analysis differently. MAT focuses on finding memory waste through dominator trees and leak suspect reports. VisualVM provides a quick overview and supports object query language for custom analysis. YourKit emphasizes navigation—you can trace from any object to its GC roots with a click.

## Capturing Heap Dumps

### Using jmap

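```bash
# Capture a heap dump (jmap ships with the JDK, not the JRE)
jmap -dump:format=b,file=heapdump.hprof <pid>

# Dump only live objects (triggers a full GC first)
jmap -dump:live,format=b,file=heapdump-live.hprof <pid>

# Force a dump when the process does not respond (JDK 8 and earlier;
# the live suboption is not supported in this mode).
# On JDK 9+ the -F option is gone; jhsdb jmap --binaryheap --pid <pid> is the replacement.
jmap -F -dump:format=b,file=heapdump.hprof <pid>
```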

### Automatic Dump on OutOfMemoryError

```bash
# JVM flags for automatic heap dump
java -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/var/log/heapdump.hprof \
     -Xmx2g -jar your-application.jar
```

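HotSpot has no built-in rollover or size limit for heap dumps; flags such as -XX:NumberOfHeapDumpFiles or -XX:HeapDumpSize do not exist. Point HeapDumpPath at an existing directory so that each dump gets its own java_pid<pid>.hprof name, and prune old files with an external job. Dump size follows the size of the heap, so size the volume accordingly.

```bash
# Write dumps into a dedicated directory; HotSpot names each file java_pid<pid>.hprof
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/var/log/heapdumps/

# Hypothetical cleanup job (GNU tools): keep only the 10 newest dumps
ls -1t /var/log/heapdumps/*.hprof | tail -n +11 | xargs -r rm --
```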

### Using Diagnostic Commands

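```bash
# Via jcmd (JDK 7+); only live objects are dumped by default.
# A relative filename is resolved against the target JVM's working directory.
jcmd <pid> GC.heap_dump heapdump.hprof

# Include unreachable objects as well
jcmd <pid> GC.heap_dump -all heapdump-all.hprof
```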
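A dump can also be triggered from inside the application, which is handy for an admin endpoint or a test. The following is a minimal sketch, assuming a HotSpot-based JVM where `com.sun.management.HotSpotDiagnosticMXBean` is available; the class name `ProgrammaticHeapDump` and the output path are illustrative. Recent JDKs expect the file name to end in `.hprof`.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ProgrammaticHeapDump {

    /**
     * Writes a heap dump of the current JVM to the given path.
     * live=true dumps only reachable objects, like jmap's "live" suboption.
     */
    static Path dump(Path target, boolean live) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file, so remove stale output first.
        Files.deleteIfExists(target);
        bean.dumpHeap(target.toAbsolutePath().toString(), live);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path out = dump(Paths.get("heapdump-live.hprof"), true);
        System.out.println("wrote " + out + " (" + Files.size(out) + " bytes)");
    }
}
```

The same stop-the-world pause applies as with jmap or jcmd, so an endpoint like this should sit behind the same access controls as the command-line tools.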

Eclipse Memory Analyzer (MAT)

MAT is the most powerful free tool for heap dump analysis. The Dominator Tree view shows which objects retain the most memory and identifies why they cannot be collected.

Key Views

Leak Suspect Report automatically analyzes the dump and identifies likely memory problems. Start here for any new heap dump. The report explains which objects contribute most to memory retention and why they cannot be collected.

Dominator Tree shows the object retention hierarchy. An object A dominates object B if every path from a GC root to B passes through A. This tells you exactly which “owner” keeps other objects alive.
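The dominator definition can be made concrete on a toy object graph. The sketch below is not how MAT computes its tree (MAT uses a far more efficient algorithm); it applies the definition directly by removing one object and checking what becomes unreachable. The object names and sizes in `REFS` and `SHALLOW` are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RetainedSizeSketch {

    // Hypothetical heap: root -> cache, service; cache -> e1, e2; service -> e2 (shared)
    static final Map<String, List<String>> REFS = Map.of(
        "root", List.of("cache", "service"),
        "cache", List.of("e1", "e2"),
        "service", List.of("e2"));

    static final Map<String, Long> SHALLOW = Map.of(
        "root", 16L, "cache", 48L, "service", 32L, "e1", 100L, "e2", 200L);

    /** Objects reachable from the roots, optionally pretending one object was removed. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots, String removed) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String r : roots) {
            if (!r.equals(removed)) stack.push(r);
        }
        while (!stack.isEmpty()) {
            String cur = stack.pop();
            if (!seen.add(cur)) continue;
            for (String next : refs.getOrDefault(cur, List.of())) {
                if (!next.equals(removed)) stack.push(next);
            }
        }
        return seen;
    }

    /** Retained size: shallow size of everything that becomes unreachable without the object. */
    static long retainedSize(Map<String, List<String>> refs, Map<String, Long> shallow,
                             List<String> roots, String object) {
        Set<String> before = reachable(refs, roots, null);
        Set<String> after = reachable(refs, roots, object);
        long total = 0;
        for (String o : before) {
            if (!after.contains(o)) total += shallow.get(o);
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> roots = List.of("root");
        // cache dominates e1 but not e2, because e2 is also reachable through service
        System.out.println("cache retains " + retainedSize(REFS, SHALLOW, roots, "cache"));     // 148
        System.out.println("service retains " + retainedSize(REFS, SHALLOW, roots, "service")); // 32
    }
}
```

Note how the shared entry `e2` counts toward neither `cache` nor `service`: it is dominated only by `root`, which is why shared objects tend to surface higher up in the dominator tree than intuition suggests.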

Histogram shows the instance count and shallow heap per class. Retained heap per class is only an estimate until you ask MAT to calculate it; once calculated, sort by retained heap to find the biggest consumers.

MAT Analysis Workflow

```text
# Open heap dump in MAT
# File -> Open Heap Dump -> select heapdump.hprof

# Run leak suspect report
# Right-click on heap dump in Heap Dump History -> "Leak Suspect Report"
```

In the leak suspect report, look for sections marked in red—they indicate problems. The “Shortest Paths to GC Roots” show why objects cannot be collected. The “Accumulated Objects” section shows which classes accumulate excessively.

VisualVM

VisualVM was bundled with Oracle JDK 6 through 8 as jvisualvm and has been a separate free download since JDK 9. It is less powerful than MAT but handles basic heap analysis quickly.

Key Features

Heapwalker provides the main heap analysis interface. The “Summary” tab shows overall heap statistics. The “Classes” tab shows object counts and sizes by class. The “Instances” tab shows actual objects and their fields. Right-click any instance to see “Paths to GC Roots.”

OQL Console supports Object Query Language for custom searches. Useful for finding specific patterns, like objects with particular field values or collections of unusual size.

Example OQL Queries

```sql
-- Find all ArrayList instances larger than 1000 elements
select a from java.util.ArrayList a where a.size > 1000

-- Find String instances containing specific text
select s from java.lang.String s where s.toString().indexOf("suspicious") >= 0

-- Find objects whose retained size exceeds 10MB
-- (rsizeof is retained size, sizeof is shallow size; this scan is slow on large heaps)
select x from instanceof java.lang.Object x
where rsizeof(x) > 10485760
```

YourKit

YourKit is a commercial profiler with the best usability for heap analysis. Its strength is navigation—you can click through object relationships intuitively.

Key Capabilities

Allocation recording captures where objects were created, not just that they exist. When allocation recording was enabled before the snapshot was taken, the allocation site is shown alongside each object; a dump produced by jmap or jcmd carries no allocation stack traces. This dramatically accelerates leak diagnosis because you know exactly which code path created leaking objects.

Retained set view shows exactly what a specific object keeps alive: the full set of objects that would be collected if that object were collected. This is the same set that forms the object's subtree in MAT's dominator tree.

Compare heap dumps takes two dumps and shows what changed. Objects that grew in count indicate potential leaks. This is invaluable for confirming whether memory issues are resolved.

Production Failure Scenarios

Scenario 1: Static Cache Growing Unboundedly

An application cache stores data forever instead of evicting old entries. The heap dump shows ConcurrentHashMap or similar collection retaining millions of entries. The dominator tree shows the cache itself as the top retainer.

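The solution is an eviction policy: a LinkedHashMap in access order with removeEldestEntry overridden for a bounded LRU cache, or a dedicated cache library like Caffeine that handles size-based and time-based eviction automatically. WeakHashMap only drops an entry once its key is no longer strongly referenced elsewhere, so it suits caches keyed by object identity rather than general size-bounded caching.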
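As an illustration of the LinkedHashMap approach, here is a minimal sketch of a bounded LRU cache. The class name `LruCache` and the capacity parameter are illustrative, and the map is not thread-safe, so a real service would wrap it or use a cache library instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder=true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    /** Called by put(); returning true evicts the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b`, because `b` is the least recently used entry at that point. In a heap dump, a cache built this way shows a retained size that plateaus instead of growing with uptime.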

Scenario 2: Classloader Memory Leak

After redeploying a web application, the old version’s classes remain in memory because something still references the old classloader. The heap dump shows classes from the old application still loaded.

This typically happens when static collections hold references to objects that should have been cleaned up. ThreadLocal values, static caches, and JMX MBeans are common culprits.
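The ThreadLocal case is worth spelling out, because pooled threads outlive the web application that used them. If a value set on a pooled thread is an instance of a class loaded by the application's classloader, that thread keeps the classloader reachable after undeploy. A minimal sketch of the usual defence, clearing the value in a `finally` block, follows; `RequestContext` and its methods are hypothetical names.

```java
final class RequestContext {
    // Per-thread scratch buffer; on a pooled thread this value survives the request unless removed
    private static final ThreadLocal<byte[]> BUFFER = new ThreadLocal<>();

    /** Handles one unit of work and always clears the thread-local afterwards. */
    static int handle(int size) {
        BUFFER.set(new byte[size]);
        try {
            return BUFFER.get().length; // stand-in for real request processing
        } finally {
            BUFFER.remove(); // without this, the pooled thread keeps the value reachable
        }
    }

    /** True if the current thread still holds a value from an earlier request. */
    static boolean hasLeftover() {
        return BUFFER.get() != null;
    }
}
```

In a dump, a missing `remove()` shows up as a path from a Thread object through its thread-local map to the leaked value.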

Scenario 3: Connection Pool Leak

Database connection pools that do not return connections properly accumulate connections over time. The heap dump shows physical database connections held by connection pool proxies that the application never closed.

The “Paths to GC Roots” trace reveals where the leaked connections are held—typically in some business logic object that acquired the connection but failed to return it.
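The typical code-level cause is a connection that is closed on the happy path only. The sketch below uses a tiny hypothetical pool (`TinyPool` and `Lease` are made-up stand-ins for a real DataSource and Connection) to contrast the leaky pattern with try-with-resources.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class TinyPool {
    private final AtomicInteger inUse = new AtomicInteger();

    /** A borrowed handle; close() returns it to the pool. */
    final class Lease implements AutoCloseable {
        private boolean closed;

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                inUse.decrementAndGet();
            }
        }
    }

    Lease borrow() {
        inUse.incrementAndGet();
        return new Lease();
    }

    int inUse() {
        return inUse.get();
    }

    /** Leaky: if work throws, close() is skipped and the lease is never returned. */
    static void leaky(TinyPool pool, Runnable work) {
        Lease lease = pool.borrow();
        work.run();
        lease.close();
    }

    /** Safe: try-with-resources calls close() even when work throws. */
    static void safe(TinyPool pool, Runnable work) {
        try (Lease lease = pool.borrow()) {
            work.run();
        }
    }
}
```

Running both methods with a task that throws leaves one lease outstanding from `leaky` and none from `safe`. With a real pool, the outstanding leases are the connection objects that pile up in the dump.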

Trade-off Table

| Tool | Cost | Strength | Best For |
| --- | --- | --- | --- |
| Eclipse MAT | Free | Deep analysis, leak suspects | Production incident analysis |
| VisualVM | Free (separate download since JDK 9) | Quick checks, OQL | Development debugging |
| YourKit | Commercial | Usability, allocation sites | Active development profiling |

Implementation Snippets

Analyzing Heap Dump Programmatically with MAT API

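```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

import org.eclipse.mat.snapshot.ISnapshot;
import org.eclipse.mat.snapshot.SnapshotFactory;
import org.eclipse.mat.util.VoidProgressListener;

// Note: MAT's parsers are wired up through Eclipse extension points, so this
// code is expected to run inside MAT's OSGi runtime (for example in a MAT
// plug-in), not as a plain java program with the jars on the classpath.
public class HeapDumpAnalyzer {

    public static void main(String[] args) throws Exception {
        // Open (and, on first use, index) the heap dump
        ISnapshot snapshot = SnapshotFactory.openSnapshot(
            new File("heapdump.hprof"),
            new VoidProgressListener());
        try {
            // Top level of the dominator tree: -1 stands for the virtual root
            int[] topLevel = snapshot.getImmediateDominatedIds(-1);

            // Pair each object id with its retained size, largest first
            long[][] bySize = new long[topLevel.length][];
            for (int i = 0; i < topLevel.length; i++) {
                bySize[i] = new long[] {
                    topLevel[i], snapshot.getRetainedHeapSize(topLevel[i]) };
            }
            Arrays.sort(bySize,
                Comparator.comparingLong((long[] row) -> row[1]).reversed());

            // Print the 20 largest retainers
            for (int i = 0; i < Math.min(20, bySize.length); i++) {
                int objectId = (int) bySize[i][0];
                System.out.printf("Retained: %s, Shallow: %s, Class: %s%n",
                    formatSize(bySize[i][1]),
                    formatSize(snapshot.getHeapSize(objectId)),
                    snapshot.getClassOf(objectId).getName());
            }
        } finally {
            // Release the snapshot and its index files
            SnapshotFactory.dispose(snapshot);
        }
    }

    private static String formatSize(long bytes) {
        if (bytes < 1024) return bytes + "B";
        if (bytes < 1024 * 1024) return String.format("%.1fKB", bytes / 1024.0);
        if (bytes < 1024 * 1024 * 1024) {
            return String.format("%.1fMB", bytes / (1024.0 * 1024));
        }
        return String.format("%.1fGB", bytes / (1024.0 * 1024 * 1024));
    }
}
```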

Quick Analysis with jhat

```bash
# jhat was removed in JDK 9; this only works with JDK 8 or earlier
jhat -J-Xmx2g heapdump.hprof

# Then open http://localhost:7000 in a browser
# Use the OQL console for queries:
# select x from com.mycompany.MyClass x where x.field != null
```

Observability Checklist

Before concluding heap analysis is complete, verify these points:

  • The heap dump captured the problematic state. Dumps taken too early may not show the leak.
  • The dump included live objects only. Dumps with all objects obscure actual retention.
  • You traced retention paths all the way to GC roots, not just to intermediate holders.
  • The root cause is in application code, not a framework or library that you cannot change.

Also confirm that multiple dumps show consistent patterns, that the objects you identified as leaks correlate with the time when memory usage grew, and that your fix actually addresses the root cause rather than symptoms.

Security and Compliance Notes

Heap dumps contain everything in JVM memory at capture time. This includes user data, session information, passwords in memory, and application internals. Treat heap dumps as sensitive data.

For regulated environments, heap dumps may fall under data handling requirements. Identify whether dumps contain PII, financial data, or other regulated information before analysis. Analyze dumps in secure environments. Do not transfer dumps over untrusted networks.

Consider automating heap dumps with strict access controls rather than allowing ad-hoc captures. Log all heap dump operations for audit purposes.
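One concrete access control on POSIX systems is restricting each dump file to its owner as soon as it is written. A minimal sketch, assuming a POSIX file system (the call throws `UnsupportedOperationException` elsewhere); the class name `DumpPermissions` is illustrative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

final class DumpPermissions {

    /** Restricts a dump file to owner read/write (rw-------). POSIX file systems only. */
    static void ownerOnly(Path dump) throws IOException {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rw-------");
        Files.setPosixFilePermissions(dump, perms);
    }
}
```

File permissions are only one layer; they do not replace encryption at rest or controlled transfer of the dump.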

Common Pitfalls / Anti-Patterns

The biggest mistake is analyzing dumps from the wrong time. A dump taken during normal operation looks nothing like one taken during OutOfMemoryError. Always capture dumps at the moment of failure or when memory pressure is highest.

Another issue is confusing retained size with shallow size. Shallow size is the object itself. Retained size includes all objects that would be collected if the object were collected. Always look at retained size when diagnosing memory issues.

Not following references all the way to GC roots leads to fixing symptoms. An object might be held by another object, which is held by a third—keep tracing until you find the GC root.

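Finally, do not forget that some leaks live outside the Java heap. Class metadata sits in the permanent generation before Java 8 and in Metaspace from Java 8 onward, and interned strings lived in the permanent generation before Java 7. These require different analysis approaches than a dump of the Java heap alone.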

Quick Recap Checklist

  • Capture heap dumps with -XX:+HeapDumpOnOutOfMemoryError in production
  • Use MAT for deep analysis and leak suspect reports
  • Use VisualVM for quick checks and OQL queries
  • Use YourKit for allocation site analysis and intuitive navigation
  • Always trace retention paths to GC roots
  • Distinguish retained size from shallow size
  • Protect heap dumps—they contain sensitive application data

Interview Questions

1. What is the difference between shallow heap and retained heap?

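Shallow heap is the size of the object itself: the memory consumed by its header and fields. A String object has a shallow heap of roughly 24 bytes on a 64-bit JVM with compressed references. Its backing array (char[] before Java 9, byte[] since) is a separate object and counts toward the String's retained heap, not its shallow heap.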

Retained heap is the total memory that would be freed if the object were garbage collected. It includes the shallow heap of the object plus the retained heaps of all objects it keeps alive, directly or indirectly. If a Map holds 10,000 entries, removing the Map frees those 10,000 entries too—the Map's retained heap includes them.

When diagnosing memory issues, always look at retained heap. A small object holding references to millions of large objects has a small shallow heap but enormous retained heap.

2. How do you find a memory leak using heap dump analysis?

Start with a heap dump taken when memory pressure is highest. Open it in MAT and run the Leak Suspect Report. The report identifies objects that contribute most to memory retention and explains why they cannot be collected.

If the report does not pinpoint the issue, use the Dominator Tree. Sort by retained size to find the largest retained objects. For each large retainer, trace the shortest path to GC roots. This reveals what keeps the object alive.

Compare multiple dumps over time if possible. Objects that grow in count or retained size across dumps are likely leaks. The allocation site for growing objects tells you which code creates them.
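The comparison step can be sketched in a few lines. The example assumes you have already exported per-class instance counts from two dumps (for instance from a class histogram) into maps; the class name `HistogramDiff` and the sample class names in the usage note are hypothetical.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class HistogramDiff {

    /** Classes whose instance count grew between two snapshots, largest growth first. */
    static Map<String, Long> growth(Map<String, Long> before, Map<String, Long> after) {
        return after.entrySet().stream()
            // growth per class; classes missing from the first snapshot start at zero
            .map(e -> Map.entry(e.getKey(), e.getValue() - before.getOrDefault(e.getKey(), 0L)))
            .filter(e -> e.getValue() > 0)
            .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                                      (a, b) -> a, LinkedHashMap::new));
    }
}
```

Given counts of `{Order=10, Session=5}` before and `{Order=10, Session=50, Token=3}` after, the result lists `Session` with a growth of 45 first and `Token` with 3 second, while the stable `Order` class drops out. Growth alone is not proof of a leak, which is why the next step is still tracing the growing class to its GC roots.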

3. What are GC roots and why are they important in heap analysis?

GC roots are objects that the garbage collector assumes are live without examining references to them. They include thread stack variables, static variables from loaded classes, JNI references, and objects in the JVM internal handles.

Because GC roots are always considered live, anything they reference is also live, transitively. When tracing why an object cannot be collected, you trace to a GC root. The object cannot be freed because some GC root still reaches it.

Common GC root types: local variables in thread stacks, static fields of classes, JNI global references, objects in the string table, class objects from loaded classes, and JVM internal references.

4. What is the difference between jmap and jcmd for capturing heap dumps?

Both can capture heap dumps, but jcmd is preferred in modern JVMs. jcmd GC.heap_dump uses the diagnostic command framework, is the tool the JDK documentation recommends over jmap, and dumps only live objects unless -all is given.

jmap -dump goes through the same attach mechanism in its default mode. With the -F force option (JDK 8 and earlier) it switches to the Serviceability Agent, which is much slower, does not support the live suboption, and reads the heap of a process that may be in an inconsistent state.

For production use, the -XX:+HeapDumpOnOutOfMemoryError flag is best because it captures the dump at the exact moment of failure without requiring manual intervention.

5. How does MAT's Leak Suspect report work?

The Leak Suspect report uses several algorithms to identify problematic memory retention. It calculates the retained size of all objects, then identifies groups of objects that together contribute significantly to memory usage.

For each suspected leak, the report shows the "shortest path to GC roots"—the minimum chain of references from a GC root to the suspect objects. This tells you exactly what keeps those objects alive. It also estimates how much memory would be freed if the leak were fixed.

The report is generated automatically from the dominator tree. It flags single objects, or groups of objects of the same class or classloader, whose retained size is a large fraction of the heap, and then looks for the accumulation point where that memory is concentrated.

6. What is the difference between heap dump analysis and allocation profiling for leak detection?

Heap dump analysis captures a point-in-time snapshot of all live objects. It shows what objects exist and what holds them, but does not show when they were created. Useful for finding what is retaining memory at the moment of the dump.

Allocation profiling records where objects are created over time. It shows which code paths allocate the most objects, which is different from showing what retains them. Allocation profiling is better for identifying leak sources (where objects are created), while heap dumps are better for identifying leak sinks (what holds them).

7. What is a dominator tree and how do you use it for memory analysis?

In the dominator tree, object A dominates object B if every path from any GC root to B passes through A. This means if A is collected, B would also become unreachable. The dominator tree shows ownership—each object's immediate dominator is its "parent" in the retention chain.

The top level of the dominator tree holds the objects that no other object dominates; their only dominator is the virtual root. Sorted by retained size, these are the "owner" objects that cause everything beneath them to be retained.

8. What are the limitations of heap dumps for memory analysis?

Heap dumps capture only live objects at a single moment in time. They do not show historical growth patterns—for that you need multiple dumps over time. Heap dumps also do not capture object allocation sites by default; you need YourKit or JMC for that.

Taking a heap dump itself can cause a long stop-the-world pause, especially with large heaps. This can worsen production incidents if triggered manually. Also, a heap dump contains no information about objects that were garbage collected before the dump was taken.

9. How do you analyze metaspace/classloader memory issues from heap dumps?

Metaspace is separate from the Java heap, but ClassLoader instances and the java.lang.Class objects of the classes they loaded appear in heap dumps. Look for ClassLoader instances in the heap dump. For classloader leaks, trace from the classloader to all classes it loaded, then find what keeps the classloader alive.

The leak pattern after web application redeployments is the old classloader still referenced by something (usually a ThreadLocal or a static cache). MAT's "Shortest Paths to GC Roots" from the classloader reveals what is keeping it alive.

10. What is OQL and when would you use it in VisualVM?

OQL (Object Query Language) is a SQL-like language for querying heap dumps in VisualVM. You can use it to find objects with specific characteristics, such as all Strings containing certain text, collections over a certain size, or objects with particular field values.

Example: select s from java.lang.String s where s.toString().indexOf("password") >= 0 finds strings potentially containing passwords. OQL is powerful for ad-hoc analysis when the standard views do not surface the issue.

11. How do you identify a String interning memory issue from heap dumps?

String.intern() pools strings in the JVM string table. A leak pattern occurs when code interns strings that should not be interned, causing the string table (which is a GC root) to retain unlimited strings. In the heap dump, look for String instances with unexpectedly high retained set counts.

Large char[] arrays that back Strings with repeated content can indicate interning issues. Finding the source of incorrectly interned strings involves tracing from the string pool to what keeps those strings alive.

12. What is the difference between live dump and full dump?

A live dump (jmap -dump:live, or jcmd GC.heap_dump, which dumps live objects by default) triggers a full GC before capturing, removing all objects that would be collected. This gives a "clean" snapshot of only live objects.

A full dump captures all objects including those ready for GC. Full dumps are larger and show more objects, but they include garbage that can obscure analysis. Always prefer live dumps for leak analysis; use full dumps only when you need to see object churn patterns.

13. How does YourKit's "Compare Heap Dumps" feature work?

YourKit's comparison feature takes two heap dumps from different points in time and shows what changed between them. It groups changes into "new objects," "removed objects," and "changed objects" categories. New objects that grow in count indicate potential leaks.

The comparison shows allocation sites for new objects, helping you trace leaks back to their source code. This is invaluable for confirming whether a memory issue is resolved after code changes.

14. What causes native memory leaks that do not show in Java heap dumps?

Native memory leaks come from direct ByteBuffers (using native memory outside the heap), JNI code, memory-mapped files, and JVM internal structures like the code cache. These do not appear in Java heap dumps.

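Diagnose by starting the JVM with -XX:NativeMemoryTracking=detail and running jcmd <pid> VM.native_memory summary. For direct ByteBuffer leaks, look for direct buffers that are still strongly referenced: their native memory is released only after the owning buffer object becomes unreachable and its cleaner runs.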
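Direct buffer usage can also be read from inside the JVM through the standard `BufferPoolMXBean`, which is a cheap way to confirm that growth is happening off-heap before reaching for native memory tracking. A minimal sketch follows; the class name `DirectBufferUsage` is illustrative, and the pool name "direct" is the one HotSpot reports.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class DirectBufferUsage {

    /** Bytes currently held by the "direct" buffer pool, or -1 if the pool is not reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // This allocation lives outside the Java heap and is invisible in heap histograms
        ByteBuffer buffer = ByteBuffer.allocateDirect(8 * 1024 * 1024);
        System.out.println("direct buffer bytes in use: " + directBytesUsed());
        System.out.println("capacity of this buffer: " + buffer.capacity());
    }
}
```

In a heap dump the same buffer appears only as a small `java.nio.DirectByteBuffer` object, so a steadily rising value here alongside a flat heap is a strong hint that the leak is native.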

15. How do you handle heap dumps from large heaps (100GB+) in production?

For very large heaps, consider your tooling carefully. MAT can handle 100GB+ dumps but needs significant heap size to open them (110-120% of the dump size). Running MAT with -Xmx120g may be necessary.

Alternative approaches: use jmap -histo <pid> or jcmd <pid> GC.class_histogram for quick class-level analysis (no dump file needed), use async-profiler's live-object allocation mode (the --live option in recent releases), which is more memory-efficient, or on JDK 15+ compress the dump while writing it with jcmd <pid> GC.heap_dump -gz=1 <file>. Incremental dumps using kernel mechanisms or specialized dump tools can also help.
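The class histogram is also reachable from inside the JVM through HotSpot's DiagnosticCommand MBean, which exposes the same command that jcmd runs. The sketch below assumes a HotSpot-based JDK with the `jdk.management` module present; the class name `ClassHistogram` is illustrative, and the MBean name and operation name are HotSpot-specific rather than part of the Java SE specification.

```java
import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

final class ClassHistogram {

    /** Returns the text that jcmd <pid> GC.class_histogram would print for this JVM. */
    static String histogram() throws Exception {
        ObjectName name = new ObjectName("com.sun.management:type=DiagnosticCommand");
        return (String) ManagementFactory.getPlatformMBeanServer().invoke(
            name,
            "gcClassHistogram",                        // HotSpot's name for GC.class_histogram
            new Object[] { null },                     // no extra command arguments
            new String[] { String[].class.getName() });
    }

    public static void main(String[] args) throws Exception {
        // Print only the header and the first few classes
        histogram().lines().limit(10).forEach(System.out::println);
    }
}
```

Like jmap -histo:live, this walks the heap at a safepoint, so it still pauses the application, but it writes no file and finishes far sooner than a full dump of a very large heap.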

16. What is the difference between shallow heap and retained heap for collections?

For collections, shallow heap is the size of the collection object itself; the backing array or node structure consists of separate objects with their own shallow sizes. Retained heap includes that internal structure and all elements the collection exclusively holds, transitively. A HashMap with 10,000 entries has a shallow heap of a few dozen bytes but can have an enormous retained heap.

When analyzing collection leaks, always look at retained heap to understand the true memory impact of the collection staying in memory.

17. How do you trace a memory leak to its source using allocation sites?

Allocation sites show where objects were created. In YourKit or JMC, right-click a leaking object and select "Show Allocation Stack Trace." This reveals the exact code path that created the leaking objects. Combined with retention analysis, you get both where they were created and what holds them.

Common patterns: objects created in one layer but retained in another, caches without eviction, or builders that accumulate results without releasing references.

18. What is the "Shallow Heap" vs "Retained Heap" distinction for understanding memory pressure?

Shallow heap tells you the base cost of an object if it were the only thing in memory. Retained heap tells you the actual memory impact of keeping that object alive. A small object holding a reference to a large cache has small shallow heap but large retained heap.

When deciding what to fix, prioritize high retained heap objects because fixing them frees the most memory.

19. How do you identify classloader leaks in application server environments?

Classloader leaks appear as classes from old application versions still being loaded after redeployment. In MAT, find ClassLoader instances and trace what keeps them alive—usually ThreadLocal values, static caches, or JMX MBeans registered by the old application.

In application servers, ensure components implement their lifecycle methods (destroy, stop) so that references are cleaned up on undeploy. MAT's Class Loader Explorer and Duplicate Classes queries help confirm whether old classloaders are still present after redeployment.

20. What are the most common sources of memory leaks in Java applications?

The most common sources: unbounded caches (ConcurrentHashMap used as cache without eviction), static collections holding references, ThreadLocal values not removed, unclosed resources (streams, connections), finalizers creating references, and classloader leaks in application servers. Memory leak patterns also include listener registrations not removed, callback accumulation in event systems, and static factory instances holding references.

Systematic approach: take two heap dumps at interval, compare objects that grew, trace their retention paths to GC roots.

Conclusion

Heap dump analysis is the definitive method for diagnosing memory leaks and understanding object retention. Use -XX:+HeapDumpOnOutOfMemoryError in production to capture dumps automatically at failure. Eclipse MAT provides the most powerful free analysis with leak suspect reports and dominator tree views. VisualVM handles quick checks, and YourKit excels at allocation site analysis.

The key to effective heap analysis is capturing dumps at the right moment (during memory pressure) and tracing retention paths all the way to GC roots. Always distinguish retained size from shallow size, and protect heap dumps as sensitive data containing your application’s full memory state.
