CMS and G1 Collectors: Low-Latency Garbage Collection

How CMS and G1 garbage collectors reduce pause times through concurrent marking, region-based heap layout, and incremental compaction.

published: reading time: 21 min read author: GeekWorkBench

Serial and Parallel collectors are honest about what they are: stop-the-world collectors that freeze your application while they work. CMS and G1 take a different approach - they try to do some of their work concurrently, while your application is still running. The result is shorter pauses, but at the cost of complexity and CPU overhead.

This post covers how CMS and G1 work, why G1 superseded CMS, and what trade-offs you make when you choose either.

Introduction

CMS (Concurrent Mark Sweep) and G1 (Garbage First) are the JVM’s low-latency garbage collectors, designed to keep pause times short by performing much of their GC work concurrently with application threads rather than stopping the world. Serial and Parallel collectors freeze your application for the entire GC duration; CMS and G1 shorten pauses by overlapping marking (and, in CMS, sweeping) with your application’s execution. G1 became the default collector in Java 9, taking over from Parallel GC, and CMS was deprecated in the same release because of its structural weaknesses (fragmentation from the lack of compaction, and a fallback to a single-threaded full GC when it cannot keep up). Understanding both is still essential for tuning older JVMs and reasoning about the design decisions that led to G1.

The key technical difference from older collectors is that both CMS and G1 use concurrent marking phases where application threads and GC threads run simultaneously, and G1 additionally divides the heap into equal-sized regions to enable incremental compaction and more predictable pause times. CMS uses free-list allocation which fragments memory and eventually requires a stop-the-world compaction; G1 uses a region-based layout that allows it to collect only the regions with the most garbage (“garbage first”) and compact incrementally. This post covers the internal mechanics of both, their failure modes, and the concrete scenarios where you would choose one over the other.

When to Use This Knowledge

Use when:

  • Your application cannot tolerate multi-second GC pauses
  • You have latency SLAs measured in hundreds of milliseconds
  • Running server workloads where responsiveness matters (web servers, APIs, trading systems)
  • You have heap sizes over 4GB where full stop-the-world GC becomes painful

Do not use when:

  • Throughput is your only metric (batch processing, ETL)
  • Your heap is small (under 1GB) and Serial/Parallel work fine
  • You can already meet latency SLAs with Parallel GC

When NOT to Use This Knowledge

If your production environment runs ZGC or Shenandoah, most CMS and G1 tuning advice does not apply. These collectors work differently and require different approaches.

Java 21 LTS or later with strict latency SLAs? Start with ZGC. It removes most pause time tuning through automatic concurrent compaction. Time spent mastering G1 pause time targets is better invested elsewhere when ZGC handles this automatically.

CMS is gone, removed in Java 14. Learning CMS collector flags and failure modes is historical knowledge now unless you are maintaining legacy Java 8 systems. For new development, use modern collectors. For continued learning on modern JVM garbage collection, explore the Advanced Java & JVM Internals roadmap.

The Concurrent Approach

The core idea behind both CMS and G1 is simple: stop-the-world pauses are the enemy of latency. Instead of doing all GC work while the world is stopped, these collectors do as much work as possible concurrently - while application threads are running.

graph TB
    subgraph Serial["Serial/Parallel GC"]
        direction TB
        A1["Stop The World"] --> A2["GC Work"] --> A3["Stop The World"] --> A4["GC Work"]
    end
    subgraph Concurrent["CMS / G1"]
        direction TB
        B1["Initial Mark\n(STW - short)"] --> B2["Concurrent Mark"] --> B3["Remark\n(STW - short)"] --> B4["Concurrent Sweep"] --> B5["Concurrent Reset"]
    end

The trade-off: doing GC work concurrently means using CPU cycles that could have gone to your application. You are buying shorter pauses with CPU time.

CMS Collector

CMS (Concurrent Mark Sweep) was the first low-latency collector in the JVM. It targets old generation GC, trying to collect it without long stop-the-world pauses.

How CMS Works

CMS phases:

  1. Initial Mark (Stop-The-World, short): Mark the objects directly reachable from GC roots. Fast - usually under 100ms.

  2. Concurrent Mark: Traverse the object graph while application runs. Takes time but does not stop the world.

  3. Remark (Stop-The-World, short): Finalize marking by re-scanning objects whose references changed during the concurrent phase, so that no live object is missed.

  4. Concurrent Sweep: Reclaim unmarked objects while application runs.

  5. Concurrent Reset: Prepare CMS data structures for next cycle.

graph TB
    A["Initial Mark\n(STW)"] --> B["Concurrent Mark\n(app running)"]
    B --> C["Remark\n(STW)"]
    C --> D["Concurrent Sweep\n(app running)"]
    D --> E["Concurrent Reset"]
    E --> A
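The cycle above can be illustrated with a toy model. This is a minimal sketch, not HotSpot's implementation: the `ToyCmsCycle` class, the integer object ids, and the `dirty` set (standing in for the card table) are all hypothetical names chosen for illustration.

```java
import java.util.*;

/** Toy model of the CMS cycle. All names here are illustrative, not HotSpot code. */
final class ToyCmsCycle {
    final Map<Integer, List<Integer>> refs = new HashMap<>();   // object id -> outgoing refs
    final Set<Integer> roots = new HashSet<>();
    final Set<Integer> marked = new HashSet<>();
    final Set<Integer> dirty = new HashSet<>();                 // written to during the cycle
    private final Deque<Integer> pending = new ArrayDeque<>();  // marked but not yet scanned

    void alloc(int id) { refs.put(id, new ArrayList<>()); }

    /** Application store: add a reference and remember the source object as dirty. */
    void write(int from, int to) { refs.get(from).add(to); dirty.add(from); }

    private void drain() {
        while (!pending.isEmpty()) {
            for (int child : refs.get(pending.pop())) {
                if (marked.add(child)) pending.push(child);
            }
        }
    }

    /** Phase 1 (STW): mark only the roots themselves. */
    void initialMark() {
        marked.clear();
        dirty.clear();
        for (int r : roots) { marked.add(r); pending.push(r); }
    }

    /** Phase 2: trace the graph; in a real JVM the application keeps running meanwhile. */
    void concurrentMark() { drain(); }

    /** Phase 3 (STW): rescan marked objects that were written to since marking began. */
    void remark() {
        for (int obj : dirty) if (marked.contains(obj)) pending.push(obj);
        drain();
    }

    /** Phase 4: reclaim everything left unmarked; returns the reclaimed ids. */
    Set<Integer> sweep() {
        Set<Integer> dead = new TreeSet<>(refs.keySet());
        dead.removeAll(marked);
        refs.keySet().removeAll(dead);
        return dead;
    }

    public static void main(String[] args) {
        ToyCmsCycle heap = new ToyCmsCycle();
        for (int id : new int[] {1, 2, 4}) heap.alloc(id);
        heap.roots.add(1);
        heap.write(1, 2);        // 1 -> 2 is live, 4 is garbage
        heap.initialMark();
        heap.concurrentMark();
        heap.alloc(3);           // application allocates while marking is in progress
        heap.write(1, 3);        // ...and stores it into an already-scanned object
        heap.remark();           // rescans object 1, so 3 gets marked
        // prints: reclaimed=[4] live=[1, 2, 3]
        System.out.println("reclaimed=" + heap.sweep() + " live=" + new TreeSet<>(heap.refs.keySet()));
    }
}
```

Skipping `remark()` in this model would sweep object 3 even though it is reachable, which is why the second short pause exists.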

Floating Garbage

CMS concurrent marking is imprecise. An object that has already been marked live may become unreachable later in the same cycle, when the application drops the last reference to it. Such objects are not reclaimed by that cycle - they are called “floating garbage.” This garbage sits in the heap until the next CMS cycle.

This is a fundamental limitation of concurrent marking: you cannot stop the application while doing the full mark, so some objects slip through.
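A small illustration of the effect, as a sketch under simplifying assumptions: the `FloatingGarbageDemo` class and the string object names are hypothetical, and "marking" here is just a reachability walk over a map.

```java
import java.util.*;

/** Toy illustration of floating garbage; not real collector code. */
final class FloatingGarbageDemo {
    /** Marks everything reachable from root over the given reference map. */
    static Set<String> mark(String root, Map<String, List<String>> refs) {
        Set<String> marked = new TreeSet<>();
        Deque<String> stack = new ArrayDeque<>(List.of(root));
        while (!stack.isEmpty()) {
            String obj = stack.pop();
            if (marked.add(obj)) stack.addAll(refs.getOrDefault(obj, List.of()));
        }
        return marked;
    }

    /** Objects a cycle keeps alive although they are no longer reachable. */
    static Set<String> floating(Set<String> markedDuringCycle, Set<String> reachableNow) {
        Set<String> result = new TreeSet<>(markedDuringCycle);
        result.removeAll(reachableNow);
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", new ArrayList<>(List.of("cache")));
        refs.put("cache", new ArrayList<>(List.of("entry")));

        Set<String> marked = mark("root", refs);  // cycle N marks root, cache, entry
        refs.get("root").clear();                 // application drops the cache mid-cycle

        // The sweep of cycle N trusts the mark bits, so cache and entry survive it.
        System.out.println("floating after cycle N: " + floating(marked, mark("root", refs)));
        // Cycle N+1 marks from scratch and no longer finds them.
        System.out.println("marked in cycle N+1:    " + mark("root", refs));
    }
}
```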

Key Characteristics

| Aspect | Behavior |
| --- | --- |
| GC threads | Multiple + concurrent |
| Pause time | Short pauses for Initial Mark and Remark |
| Throughput | Lower than Parallel - concurrent phases use CPU |
| Old gen collection | Mostly concurrent |
| Young gen collection | Serial or Parallel Copying |
| Heap fragmentation | Moderate - no compaction by default |
| Status | Deprecated in Java 9, removed in Java 14 |

CMS Failure Modes

CMS can fail in two main ways:

  1. Concurrent Mode Failure: CMS cannot finish collecting before old generation fills up. Triggers a full stop-the-world GC as fallback.

  2. Promotion Failure: Object in young generation cannot fit in old generation during minor GC, triggering full GC.

JVM Flags for CMS

-XX:+UseConcMarkSweepGC
-XX:ConcGCThreads=N          # Concurrent CMS threads (default: (ParallelGCThreads+3)/4)
-XX:CMSInitiatingOccupancyFraction=70  # Start CMS when old gen 70% full
-XX:+UseCMSInitiatingOccupancyOnly   # Use only the threshold above, not the JVM's own heuristics

G1 Collector

G1 (Garbage First) became the default collector in Java 9 and is the designated successor to CMS. It takes a fundamentally different approach: instead of treating the heap as one contiguous space, it divides the heap into many equal-sized regions.

graph TB
    subgraph G1Heap["G1 Heap - Divided into Regions"]
        direction TB
        R1["Region 1\n(Eden)"]
        R2["Region 2\n(Survivor)"]
        R3["Region 3\n(Old)"]
        R4["Region 4\n(Old)"]
        R5["Region 5\n(Eden)"]
        R6["Region 6\n(Humongous)"]
    end

How G1 Works

G1 divides the heap into regions of equal size (typically 1MB to 32MB depending on heap size). Each region can be Eden, Survivor, or Old generation - the designation is fluid and changes between GC cycles.

G1 tracks live object density per region. During a GC, it collects regions with the most garbage first - hence “Garbage First.”

graph TB
    subgraph G1Cycles["G1 GC Cycles"]
        A["Young GC\n(Copying - STW)"] --> B["Concurrent Mark\n(regions)"]
        B --> C["Mixed GC\n(Evacuate old regions - STW)"]
        C --> A
    end

G1 GC Phases

  1. Young GC (Stop-The-World): Copies live objects from young regions to survivor regions or promotes them to old regions. Uses multiple threads.

  2. Concurrent Marking: Like CMS, G1 performs concurrent marking to identify live objects across all regions. Runs in background threads.

  3. Remark (Stop-The-World): Like CMS, finalizes marking by processing references that changed during the concurrent phase.

  4. Cleanup: Frees completely empty regions and updates remembered sets. Runs after Remark and before any mixed GC.

  5. Mixed GC (Stop-The-World): After marking, G1 collects a mix of young and old regions, usually over several pauses. G1 selects old regions with the lowest live ratio first.

Why G1 Replaced CMS

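| Aspect | CMS | G1 |
| --- | --- | --- |
| Heap division | Contiguous | Regions |
| Compaction | None by default (fragmentation) | Incremental (per-region) |
| Full GC | Falls back to a single-threaded stop-the-world full GC | Has its own full GC (parallel since Java 10) as a last resort |
| JDK support | Deprecated in Java 9, removed in Java 14 | Default since Java 9 |
| Large heaps | Struggles beyond roughly 4GB | Scales well to large heaps |
| Predictability | Poor (no pause target) | Good (pause time target configurable) |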

G1’s region-based design lets it compact incrementally - it does not need to compact the entire old generation at once. This makes pause times more predictable and manageable.

Pause Time Target

G1 lets you set a target for maximum GC pause time:

-XX:MaxGCPauseMillis=200    # Target 200ms max pause

G1 then tunes its collection to try to meet this target. Note: this is a target, not a guarantee. Under memory pressure, G1 may not be able to meet it.

Key Characteristics

| Aspect | Behavior |
| --- | --- |
| GC threads | Multiple + concurrent |
| Pause time | Configurable target via MaxGCPauseMillis |
| Throughput | Lower than Parallel GC (concurrent phases use CPU) |
| Heap layout | Region-based (1MB to 32MB regions) |
| Compaction | Incremental per-region |
| Young gen | Dynamic - size adjusts between Min and Max |
| Large objects | Humongous regions handle objects > 50% of region size |

JVM Flags for G1

-XX:+UseG1GC
-XX:MaxGCPauseMillis=200    # Target max pause time
-XX:G1HeapRegionSize=N      # Region size (1MB, 2MB, 4MB, 8MB, 16MB, 32MB)
-XX:InitiatingHeapOccupancyPercent=45  # Start concurrent marking when heap occupancy reaches 45%
-XX:G1ReservePercent=10     # Keep 10% of the heap free to reduce the risk of evacuation failure

Production Failure Scenarios

1. G1 Evacuation Failure

Symptom: GC logs report an evacuation failure or to-space exhausted, and the application freezes.

Cause: G1 ran out of free regions to copy surviving objects into. Happens when the to-space (target regions) fills up during a young or mixed GC.

Solution:

# Increase reserve space
-XX:G1ReservePercent=15

# Increase heap if possible
-Xms4g -Xmx4g

# Relax the pause target if it is too aggressive
-XX:MaxGCPauseMillis=500

2. CMS Concurrent Mode Failure

Symptom: CMS cannot complete before old generation fills. Full GC kicks in and freezes application.

Cause: Old generation filling faster than CMS can reclaim it. Usually means you need more heap, lower CMSInitiatingOccupancyFraction, or a different collector.

Solution:

# Start CMS earlier
-XX:CMSInitiatingOccupancyFraction=60

# Or just switch to G1
-XX:+UseG1GC
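When choosing how much earlier to start CMS, a back-of-envelope headroom estimate can help. The sketch below assumes a deliberately simple model: the free old-gen space at the trigger point must absorb everything promoted while one concurrent cycle runs. The `CmsHeadroom` class and the sample numbers are hypothetical, and the model ignores fragmentation and space reclaimed mid-cycle.

```java
/** Back-of-envelope headroom check; the model and inputs are illustrative assumptions. */
final class CmsHeadroom {
    /**
     * Highest initiating occupancy (percent) that still leaves enough free old gen
     * to absorb promotions for the duration of one concurrent cycle.
     */
    static int maxInitiatingOccupancyPercent(long oldGenBytes, long promotedBytesPerSec, double cycleSeconds) {
        double needed = promotedBytesPerSec * cycleSeconds;  // bytes promoted while CMS runs
        double fraction = 1.0 - needed / oldGenBytes;        // occupancy at which that still fits
        return (int) Math.max(0, Math.floor(fraction * 100));
    }

    public static void main(String[] args) {
        long oldGen = 4L << 30;   // 4 GiB old generation (assumed)
        long rate = 100L << 20;   // 100 MiB/s promotion rate (assumed)
        double cycle = 8.0;       // one concurrent cycle takes 8 s (assumed)
        // 800 MiB of headroom needed out of 4096 MiB, so this prints 80
        System.out.println(maxInitiatingOccupancyPercent(oldGen, rate, cycle));
    }
}
```

In practice you would measure the promotion rate and cycle duration from GC logs and then leave a safety margin below the computed value.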

3. Humongous Object Issues (G1)

Symptom: Frequent long pauses with many humongous allocations.

Cause: Objects larger than half a G1 region (e.g., > 16MB with 32MB regions) are treated as “humongous” and collected differently. They can cause fragmentation in the humongous region space.

Solution:

# Increase region size if heap is large
-XX:G1HeapRegionSize=32m

# Avoid allocating very large buffers in hot paths
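To see why a larger region size helps, it is useful to work through the arithmetic. The following sketch applies the "larger than half a region" rule stated above; the `HumongousCheck` class and its method names are hypothetical helpers, and the sizes are illustrative (real object sizes include headers and alignment).

```java
/** Arithmetic for the humongous threshold; helper names are illustrative only. */
final class HumongousCheck {
    /** An allocation is treated as humongous when it is larger than half a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of contiguous regions the object occupies (rounded up). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    /** Space left unused at the tail of the last region. */
    static long wastedBytes(long objectBytes, long regionBytes) {
        return regionsNeeded(objectBytes, regionBytes) * regionBytes - objectBytes;
    }

    public static void main(String[] args) {
        long mib = 1L << 20;
        long buffer = 9 * mib;  // a 9 MiB buffer (assumed example)
        for (long region : new long[] {16 * mib, 32 * mib}) {
            System.out.printf("region=%dMiB humongous=%b regions=%d wasted=%dMiB%n",
                    region / mib, isHumongous(buffer, region),
                    regionsNeeded(buffer, region), wastedBytes(buffer, region) / mib);
        }
    }
}
```

With 16MB regions the 9 MiB buffer is humongous and wastes the rest of its region; with 32MB regions it falls under the threshold and is allocated like any other object.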

Trade-off Table

| Configuration | When to use | Trade-off |
| --- | --- | --- |
| -XX:+UseConcMarkSweepGC | Legacy systems, Java 8 with low-latency needs | Deprecated, no compaction, can fail badly |
| -XX:+UseG1GC | Default for most low-latency workloads | More CPU overhead than Parallel GC |
| -XX:MaxGCPauseMillis=100 | Strict latency SLA | May reduce throughput if too aggressive |
| -XX:G1HeapRegionSize=16m | Large heaps (>16GB) | Larger regions = fewer regions to manage but coarser collection; smaller regions = more regions to manage |
| -XX:CMSInitiatingOccupancyFraction=70 | Tune CMS trigger point | Too low = premature collections, too high = failure |

Implementation Snippets

Checking Which Collector Is Running

import java.lang.management.*;
import java.util.*;

public class CollectorCheck {
    public static void main(String[] args) {
        List<GarbageCollectorMXBean> gcs = ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : gcs) {
            System.out.println("Collector: " + gc.getName());
            System.out.println("  Cycles: " + gc.getCollectionCount());
            System.out.println("  Time: " + gc.getCollectionTime() + "ms");
        }
    }
}

Enabling G1 and Setting Pause Target

java -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=150 \
     -XX:G1HeapRegionSize=16m \
     -XX:InitiatingHeapOccupancyPercent=45 \
     -Xms4g -Xmx4g \
     -Xlog:gc*:file=g1.log \
     -jar myapp.jar

Reading G1 GC Logs

[GC pause (G1 Evacuation Pause) 256M->245M(4G) 148.734ms]
[Ext Root Scanning: 23.4ms]
[Update RS: 12.1ms]
[Scan RS: 8.3ms]
[Object Copy: 89.2ms]

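G1 logs break down pause time by phase, making it easier to identify bottlenecks. The excerpt above follows the Java 8 -XX:+PrintGCDetails layout; unified logging (-Xlog, Java 9 and later) reports the phases in a different format.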
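If you want to pull pause times out of such a log programmatically, a small parser is enough. This is a sketch that assumes exactly the summary-line layout shown above; the `G1PauseParser` class is hypothetical, and the pattern would need adjusting for other JDK versions or logging flags.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts the pause duration from a summary line in the layout shown above. */
final class G1PauseParser {
    // Matches e.g. "[GC pause (G1 Evacuation Pause) 256M->245M(4G) 148.734ms]"
    private static final Pattern PAUSE =
            Pattern.compile("\\[GC pause \\(([^)]*)\\).* ([0-9.]+)ms\\]");

    /** Returns the pause in milliseconds, or -1 if the line is not a pause summary. */
    static double pauseMillis(String line) {
        Matcher m = PAUSE.matcher(line);
        return m.find() ? Double.parseDouble(m.group(2)) : -1;
    }

    /** True when the line is a pause summary whose duration exceeds the target. */
    static boolean exceedsTarget(String line, double targetMillis) {
        return pauseMillis(line) > targetMillis;
    }

    public static void main(String[] args) {
        String line = "[GC pause (G1 Evacuation Pause) 256M->245M(4G) 148.734ms]";
        System.out.println(pauseMillis(line));            // prints 148.734
        System.out.println(exceedsTarget(line, 100.0));   // prints true
        System.out.println(pauseMillis("[Object Copy: 89.2ms]")); // prints -1.0 (not a pause summary)
    }
}
```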

Observability Checklist

  • Enable GC logging: -Xlog:gc*:file=g1.log or -Xlog:gc*:file=cms.log
  • Monitor pause times from GC logs - look for pauses exceeding your target
  • Track jstat -gc <pid> for OC (old gen capacity) vs OU (old gen used)
  • Watch for evacuation failures in G1 logs: to-space exhausted
  • Monitor CMS concurrent mode failure events
  • For G1: check G1HeapRegionSize is appropriate for your heap size
  • For a phase-level timing breakdown, use -XX:+PrintGCDetails on Java 8 or -Xlog:gc* on Java 9 and later
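For in-process monitoring, the standard `GarbageCollectorMXBean` counters can be sampled periodically and compared. The sketch below is one possible approach, not a full monitoring solution: the `GcSampler` class is hypothetical, and the reported time is whatever each collector bean chooses to account for, so treat the average as an approximation of pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Samples cumulative GC counters and derives an average time per collection. */
final class GcSampler {
    /** Returns {total collection count, total collection time in ms} over all collector beans. */
    static long[] sample() {
        long count = 0, timeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount());  // -1 means "undefined"
            timeMs += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, timeMs};
    }

    /** Average collection time between two samples, or 0 if no collection happened. */
    static double averageMillis(long[] before, long[] after) {
        long collections = after[0] - before[0];
        return collections == 0 ? 0.0 : (double) (after[1] - before[1]) / collections;
    }

    public static void main(String[] args) {
        long[] before = sample();
        System.gc();  // request a collection so the counters are likely to move
        long[] after = sample();
        System.out.printf("collections=%d avg=%.1f ms%n",
                after[0] - before[0], averageMillis(before, after));
    }
}
```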

Security Notes

  1. Concurrent phases use more CPU, which can reveal allocation patterns through timing side channels
  2. GC logs expose memory usage patterns and object lifetime characteristics
  3. JMX access to GC beans should be restricted - can reveal collector internals and tuning
  4. Large heaps mean larger heap dumps on OOM - more data to protect

Common Pitfalls / Anti-Patterns

| Pitfall | What happens | Fix |
| --- | --- | --- |
| Setting pause target too low | G1 uses too much CPU trying to meet target | Increase MaxGCPauseMillis or add more heap |
| Using CMS on Java 14+ | Removed from the JDK (deprecated since Java 9) | Migrate to G1 |
| Ignoring G1 region size | Suboptimal performance on large heaps | Set G1HeapRegionSize explicitly |
| Too small G1ReservePercent | Evacuation failures | Increase to 15-20 |
| Setting old gen threshold too high (CMS) | Concurrent mode failure | Lower CMSInitiatingOccupancyFraction |

Quick Recap Checklist

  • CMS = concurrent mark-sweep, deprecated in Java 9 and removed in Java 14, no compaction
  • G1 = region-based, incremental compaction, default since Java 9
  • Both do work concurrently to reduce pause times
  • Concurrent phases use CPU that could go to your application
  • G1 pause target set via -XX:MaxGCPauseMillis=N
  • G1 divides heap into 1MB-32MB regions
  • CMS fallback: concurrent mode failure triggers full stop-the-world
  • For Java 11+, use G1 - CMS is deprecated there and gone from Java 14 onward

Interview Questions

1. What is the fundamental difference between CMS and G1?

CMS treats the old generation as one contiguous space and performs mostly-concurrent marking and sweeping without compaction. G1 divides the entire heap into regions and performs incremental compaction - it compacts old generation region by region rather than all at once. This makes G1's pauses more predictable and prevents the fragmentation issues that plagued CMS.

2. What causes concurrent mode failure in CMS?

CMS cannot finish its concurrent marking and sweeping before the old generation fills up. When this happens, the JVM falls back to a full stop-the-world GC (a single-threaded mark-compact of the old generation). This is usually caused by the occupancy threshold being set too high, too many allocations requiring promotion, or the concurrent phases running too slowly relative to allocation rate. Fix by lowering -XX:CMSInitiatingOccupancyFraction or switching to G1.

3. How does G1 achieve its pause time target?

G1 tracks the amount of work needed to collect each region and estimates how long it will take. It builds a collection set of regions to collect, ordered by garbage density (garbage-first). It stops adding regions to the set once it estimates the collection will exceed the pause target. This means under memory pressure, G1 may not be able to collect enough regions and may not meet the target - it is a soft goal, not a guarantee.
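The selection loop described in this answer can be sketched in a few lines. This is a simplified illustration, not G1's actual heuristic: the `Region` summary and its precomputed `predictedMs` cost are hypothetical, whereas real G1 derives predictions from measured statistics and applies additional limits.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of pause-budgeted collection set selection; the cost model is an assumption. */
final class CollectionSetSketch {
    /** Hypothetical per-region summary: reclaimable bytes and predicted evacuation cost. */
    static final class Region {
        final int id;
        final long garbageBytes;
        final double predictedMs;

        Region(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }
    }

    /** Picks regions, most garbage first, until the predicted pause would exceed the target. */
    static List<Integer> choose(List<Region> candidates, double pauseTargetMs) {
        List<Region> byGarbage = new ArrayList<>(candidates);
        byGarbage.sort((a, b) -> Long.compare(b.garbageBytes, a.garbageBytes));
        List<Integer> chosen = new ArrayList<>();
        double predicted = 0;
        for (Region r : byGarbage) {
            if (predicted + r.predictedMs > pauseTargetMs) break;  // budget exhausted
            predicted += r.predictedMs;
            chosen.add(r.id);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region(4, 2L << 20, 120), new Region(1, 30L << 20, 40),
                new Region(3, 10L << 20, 90), new Region(2, 28L << 20, 60));
        System.out.println(choose(regions, 200));  // prints [1, 2, 3]
        System.out.println(choose(regions, 100));  // prints [1, 2]
    }
}
```

Lowering the target from 200ms to 100ms drops region 3 from the set, which is the "collects less per pause" behavior the answer describes.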

4. What are humongous regions in G1?

Objects larger than half a G1 region go into humongous regions. With 32MB regions, objects up to 16MB are allocated normally; anything bigger is placed in one or more contiguous humongous regions of its own. Humongous objects are reclaimed mainly through the concurrent marking cycle and cleanup (recent JDKs can also reclaim some dead ones at young collections), and they are not evacuated during normal young or mixed GC. The catch is that large humongous objects can fragment the humongous region space - a known rough edge of G1.

5. Why was CMS deprecated and replaced by G1?

CMS had a fragmentation problem - no compaction meant free memory ended up scattered, eventually causing allocation failures even when total free memory looked fine. It fell apart on large heaps (4GB+). And when memory pressure hit, its fallback to full stop-the-world GC produced multi-second pauses that negated the whole point of using it. G1 solves all three: incremental compaction, scales to large heaps, and a more predictable fallback path.

6. What is the collection set in G1?

The collection set (CSet) is the set of regions G1 has chosen to collect during a GC pause. G1 selects regions with the highest garbage density (the most reclaimable space relative to live data) first - hence "Garbage First." Young GC includes young regions; mixed GC includes both young and old regions. The CSet is built by estimating work per region and stopping when the pause target would be exceeded.

7. How does G1's incremental compaction differ from full compaction?

Full compaction (like in Parallel GC) slides all live objects in old generation together in one stop-the-world operation that can take seconds on large heaps. G1 compacts incrementally - it evacuates a limited set of regions in each mixed GC. Each evacuation is short, and G1 spreads compaction work across multiple GC cycles. The trade-off is that G1 may take longer to fully compact, but pauses stay close to the target and are more predictable.

8. What is the remembered set overhead in G1?

Each G1 region maintains a remembered set tracking references from other regions into that region. This adds memory overhead (often quoted at roughly 10% of heap, though it varies with workload and JDK version) but enables G1 to collect regions independently without scanning the entire heap. Without remembered sets, G1 would need to scan the whole heap for cross-region references, losing its incremental nature. The overhead is a worthwhile trade-off for predictable pause times.
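The idea can be shown with a toy data structure. This is a sketch at region granularity only: the `ToyRememberedSets` class is hypothetical, and real G1 records finer-grained locations (cards) inside the source regions rather than whole regions.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy remembered sets keyed by region; real G1 tracks cards, not whole regions. */
final class ToyRememberedSets {
    // target region -> source regions holding references into it
    private final Map<Integer, Set<Integer>> incoming = new HashMap<>();

    /** Called on a reference store; only cross-region references are recorded. */
    void recordStore(int fromRegion, int toRegion) {
        if (fromRegion != toRegion) {
            incoming.computeIfAbsent(toRegion, k -> new TreeSet<>()).add(fromRegion);
        }
    }

    /** Regions outside the collection set that must be scanned when evacuating it. */
    Set<Integer> regionsToScan(Collection<Integer> collectionSet) {
        Set<Integer> result = new TreeSet<>();
        for (int region : collectionSet) {
            result.addAll(incoming.getOrDefault(region, Set.of()));
        }
        result.removeAll(collectionSet);  // regions inside the set are traversed anyway
        return result;
    }

    public static void main(String[] args) {
        ToyRememberedSets rs = new ToyRememberedSets();
        rs.recordStore(0, 3);
        rs.recordStore(5, 3);
        rs.recordStore(3, 3);  // same-region store, not recorded
        rs.recordStore(7, 4);
        System.out.println(rs.regionsToScan(List.of(3)));     // prints [0, 5]
        System.out.println(rs.regionsToScan(List.of(3, 5)));  // prints [0]
    }
}
```

Evacuating region 3 only requires looking at regions 0 and 5 instead of the whole heap, which is what makes per-region collection affordable.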

9. What is the trade-off between MaxGCPauseMillis and throughput in G1?

Setting MaxGCPauseMillis aggressively (e.g., 50ms) makes G1 collect smaller batches more frequently. This uses more CPU for GC work but keeps pauses short. Your application gets less CPU time, reducing throughput. The target is a soft goal - if G1 cannot do the required work within it, pauses simply exceed the target. Relaxing the target (200-500ms) allows G1 to collect more per cycle, using less CPU and giving more to your application.

10. What is floating garbage in CMS and G1?

Floating garbage is objects that become unreachable during concurrent marking but are not reclaimed until the next cycle. Because marking runs while the application runs, an object that was already marked live stays marked even if its last reference is dropped moments later. Both collectors produce it: CMS keeps whatever it marked before the reference disappeared, and G1's snapshot-at-the-beginning marking treats everything that was live when marking started as live for that cycle. In both cases the garbage is reclaimed by a later cycle, so the cost is extra heap headroom rather than a leak.

11. Why does G1 divide the heap into fixed-size regions?

Fixed-size regions (1MB to 32MB) enable G1 to collect arbitrary subsets of the heap without contiguity constraints. Old generation does not need to be one block - it is a collection of old regions. Young generation can grow and shrink dynamically by adding or removing regions. This flexibility lets G1 balance young/old ratio dynamically and do incremental compaction without moving the entire old generation at once.

12. What is the relationship between G1HeapRegionSize and heap size?

G1HeapRegionSize must be a power of 2 between 1MB and 32MB. The JVM selects region size based on heap size: smaller heaps get 1MB regions, larger heaps get 2MB, 4MB, 8MB, 16MB, or 32MB. With very large heaps (16GB+), use 16MB or 32MB regions to avoid ending up with many thousands of small regions, which increases remembered set overhead and management cost. With small heaps, 1MB regions provide finer-grained collection sets.

13. What is evacuation failure in G1?

Evacuation failure (to-space exhausted or to-space overflow) occurs when G1 tries to copy surviving objects during young or mixed GC but runs out of free regions. The objects that cannot be copied remain in place (not collected), and G1 may have to fall back to a full GC. Fix by increasing -XX:G1ReservePercent (more reserve regions), increasing total heap, starting marking earlier with a lower -XX:InitiatingHeapOccupancyPercent, or relaxing a pause target that is too aggressive for the workload.

14. How does InitiatingHeapOccupancyPercent work in G1?

IHOP controls when G1 starts the concurrent marking cycle. When old generation occupancy exceeds this percentage of the total heap, G1 initiates a concurrent mark in the background. The default is 45%, and since Java 9 G1 adjusts the effective threshold adaptively unless -XX:-G1UseAdaptiveIHOP is set. Lower values start marking earlier (more CPU spent on marking, more headroom before the old generation fills). Higher values delay marking (less CPU overhead, risk that old gen fills before marking completes, triggering a full GC). Tune based on allocation rate - workloads that fill old gen quickly need lower IHOP.

15. Why does CMS require more heap than G1 under memory pressure?

CMS does not compact, so free space in the old generation ends up scattered across a free list. As fragmentation increases, CMS needs more total free memory to find a chunk big enough for each promotion, even when the sum of the free fragments looks sufficient. G1 compacts incrementally as it evacuates regions, so freed space comes back as whole empty regions. Under the same memory pressure, CMS therefore tends to hit promotion failure or concurrent mode failure earlier than G1 hits evacuation failure.

16. What is the young generation collection strategy in G1 and how does it differ from previous collectors?

G1 performs young GC (Eden + Survivor regions -> new Survivor/Old regions) as a stop-the-world operation using multiple threads. Unlike Parallel GC, which lays out the young generation as fixed contiguous spaces, G1 dynamically adjusts the number of young regions based on the pause time target. Every young GC evacuates all young regions, copying live objects to survivor regions or promoting them to old regions. The pause time target influences how many Eden regions G1 allows before the next collection, which tends to keep young pauses closer to the target than a young generation sized independently of pause goals.

17. How does G1's concurrent marking differ from CMS's implementation?

G1's concurrent marking uses a snapshot-at-the-beginning (SATB) algorithm: everything that was live when marking started is treated as live for the cycle, even if it becomes unreachable during marking. A pre-write barrier records overwritten references so the marker never loses an object from the snapshot. CMS uses incremental update: a post-write barrier dirties cards for modified objects, and those objects are re-scanned, with the final re-scan happening in the Remark pause. SATB typically keeps the Remark pause shorter but retains more floating garbage. G1 also integrates marking with young and mixed GC cycles, while CMS runs marking as a separate phase before sweeping.
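The SATB idea can be demonstrated with a toy pre-write barrier. This is an illustrative sketch only: the `ToySatb` class, its single-field `Obj` type, and the explicit `store` method are hypothetical stand-ins for barrier code the JIT compiler emits in a real JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy SATB pre-write barrier; field and class names are illustrative only. */
final class ToySatb {
    static final class Obj {
        final String name;
        Obj field;  // a single reference field keeps the example small

        Obj(String name) { this.name = name; }
    }

    final Set<Obj> marked = new HashSet<>();
    final Deque<Obj> satbBuffer = new ArrayDeque<>();
    boolean markingActive;

    /** Reference store with a pre-write barrier: remember the value being overwritten. */
    void store(Obj holder, Obj newValue) {
        if (markingActive && holder.field != null) satbBuffer.push(holder.field);
        holder.field = newValue;
    }

    /** Marks start and everything reachable through the field chain. */
    void markFrom(Obj start) {
        for (Obj o = start; o != null && marked.add(o); o = o.field) { }
    }

    /** Remark: everything logged by the barrier is treated as live. */
    void drainSatb() {
        while (!satbBuffer.isEmpty()) markFrom(satbBuffer.pop());
    }

    public static void main(String[] args) {
        ToySatb gc = new ToySatb();
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.field = c;            // snapshot at marking start: a -> c, b -> null
        gc.markingActive = true;
        gc.markFrom(b);         // marker has finished with b
        gc.store(b, c);         // application moves c behind the marker's back...
        gc.store(a, null);      // ...and the barrier logs the overwritten reference to c
        gc.markFrom(a);         // marker reaches a, which no longer points to c
        System.out.println("c marked before draining: " + gc.marked.contains(c)); // false
        gc.drainSatb();
        System.out.println("c marked after draining:  " + gc.marked.contains(c)); // true
    }
}
```

If the application had only cleared `a.field` without storing `c` into `b`, the barrier would still have logged `c` and kept it alive for this cycle, which is exactly the extra floating garbage SATB accepts.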

18. What is the mixed GC phase in G1 and when does it occur?

Mixed GC occurs after concurrent marking completes, when G1 collects a mix of young regions and old regions with low live ratio. During a mixed GC, G1 evacuates live objects from the collection set (which now includes old regions) to other regions, updating references in the process. The number of mixed GC cycles is controlled by parameters such as -XX:G1MixedGCCountTarget and -XX:G1HeapWastePercent. After enough mixed GC cycles, old gen has been cleaned up sufficiently and G1 returns to young-only collections until the next concurrent marking cycle.

19. How does G1 handle large objects larger than half a region size?

Objects larger than half a G1 region are called humongous objects. They are allocated into a humongous region of their own if they fit in one region, or into a run of contiguous regions if they do not. Humongous objects are reclaimed mainly via concurrent marking and cleanup, not by evacuation during normal young or mixed GC. G1 avoids moving them because copying such large objects is expensive, so outside of a full GC they stay where they were allocated. This makes them a fragmentation risk, since each one needs contiguous free regions.

20. What is the G1 Heap Region Size formula and how should you configure it for different heap sizes?

G1 Heap Region Size must be a power of 2 between 1MB and 32MB. The JVM derives it from the heap size, aiming for roughly 2048 regions and clamping the result to between 1MB and 32MB. For heaps under 4GB, 1MB or 2MB regions work well. For heaps between 4GB and 16GB, 4MB or 8MB regions. For heaps above 16GB, 16MB or 32MB regions reduce the remembered set overhead since each region has its own remembered set. Having too many small regions increases bookkeeping overhead.
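The calculation can be approximated as follows. This is a simplified sketch of the ergonomics described in this answer: the `RegionSizing` class is hypothetical, and the exact inputs and rounding (for example which heap size is used, and whether the value is rounded up or down) differ between JDK versions.

```java
/** Approximation of G1 region size ergonomics; rounding details vary by JDK version. */
final class RegionSizing {
    static final long MIN_REGION = 1L << 20;   // 1 MiB
    static final long MAX_REGION = 32L << 20;  // 32 MiB
    static final long TARGET_REGIONS = 2048;

    /** Heap size divided by the target region count, as a power of two within [1 MiB, 32 MiB]. */
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw);  // round down to a power of two (assumed)
        return Math.min(powerOfTwo, MAX_REGION);
    }

    /** Number of regions the heap is divided into at that region size. */
    static long regionCount(long heapBytes) {
        return heapBytes / regionSizeBytes(heapBytes);
    }

    public static void main(String[] args) {
        for (long gib : new long[] {1, 4, 8, 16, 64, 256}) {
            long heap = gib << 30;
            System.out.printf("heap=%dGiB region=%dMiB regions=%d%n",
                    gib, regionSizeBytes(heap) >> 20, regionCount(heap));
        }
    }
}
```

Under these assumptions a 4 GiB heap gets 2 MiB regions, while anything from 64 GiB upward is capped at 32 MiB and simply ends up with more than 2048 regions.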

Conclusion

CMS achieves low latency through concurrent marking and sweeping, but lacks compaction and is deprecated. G1 superseded it and became the default collector in Java 9 by dividing the heap into regions and performing incremental compaction: it collects garbage-first regions to meet pause time targets. CMS was deprecated in Java 9 and removed in Java 14; use G1 for low-latency workloads on Java 9 and later, and consider ZGC or Shenandoah for strict sub-millisecond SLA requirements.
