TCC: Try-Confirm-Cancel Pattern for Distributed Transactions

Learn the Try-Confirm-Cancel pattern for distributed transactions. Explore how TCC differs from 2PC and saga, with implementation examples and real-world use cases.

published: March 24, 2026 reading time: 16 min read

Two-phase commit blocks. Basic saga compensation gets messy when steps have complex dependencies. TCC (Try-Confirm-Cancel) offers a middle ground: more structure than plain saga, less blocking than 2PC.

The pattern splits every operation into three phases. Try reserves. Confirm commits. Cancel releases. Each service implements these three operations, and a coordinator orchestrates the flow.

Let me walk through how TCC actually works, where it fits, and the trade-offs that matter.

How TCC Works

Each participant exposes three operations. Try checks business constraints and sets aside what the transaction needs. Confirm turns that reservation into a permanent change. Cancel releases it. The coordinator calls Try on every participant first and moves on to Confirm only if all of them succeed; otherwise it sends Cancel.

sequence
    Coordinator->>ServiceA: Try(Reserve 5 units)
    ServiceA->>Coordinator: TryConfirmed
    Coordinator->>ServiceB: Try(Reserve $100)
    ServiceB->>Coordinator: TryConfirmed
    Coordinator->>ServiceA: Confirm
    ServiceA->>Coordinator: Confirmed
    Coordinator->>ServiceB: Confirm
    ServiceB->>Coordinator: Confirmed

When Try fails for any participant:

sequence
    Coordinator->>ServiceA: Try(Reserve 5 units)
    ServiceA->>Coordinator: TryConfirmed
    Coordinator->>ServiceB: Try(Reserve $100)
    ServiceB->>Coordinator: TryFailed(reason)
    Coordinator->>ServiceA: Cancel
    ServiceA->>Coordinator: Cancelled

Idempotency matters at every phase. Services must handle duplicate Try calls gracefully. Confirm and Cancel also need to be idempotent, since the coordinator may retry if calls fail or get lost.

TCC vs Two-Phase Commit

TCC and 2PC both run a preparatory step before the final decision, but they work differently. In 2PC, participants lock resources during Prepare and hold those locks until Commit or Rollback. This blocking is the price of atomicity. In TCC, Try records a business-level reservation and commits it locally, so no database lock is held between phases. Other operations can proceed against the same data, aware of pending reservations but not blocked by them.

The difference shows up under contention. Two competing transactions trying to reserve the same inventory: 2PC locks the rows and makes one wait or fail. TCC shows the second transaction a pending reservation and lets it decide what to do, whether that means queuing or picking an alternative.
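
To make the contrast concrete, here is a minimal sketch of an inventory that records pending reservations instead of locking rows. The `Inventory` class and its method names are hypothetical, and the sketch assumes a single process with no persistence.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """In-memory stock counter that records reservations instead of locking."""
    on_hand: int
    pending: dict = field(default_factory=dict)  # txn_id -> reserved quantity

    def available(self) -> int:
        # Stock not yet claimed by any pending reservation.
        return self.on_hand - sum(self.pending.values())

    def try_reserve(self, txn_id: str, quantity: int) -> bool:
        if txn_id in self.pending:
            return True  # duplicate Try for the same transaction
        # A competing transaction is never blocked: it gets an immediate
        # answer based on what is still unreserved.
        if quantity > self.available():
            return False
        self.pending[txn_id] = quantity
        return True

    def confirm(self, txn_id: str) -> None:
        # Permanently deduct the reserved quantity (no-op if already done).
        self.on_hand -= self.pending.pop(txn_id, 0)

    def cancel(self, txn_id: str) -> None:
        # Release the hold so the stock becomes available again.
        self.pending.pop(txn_id, None)


stock = Inventory(on_hand=5)
first = stock.try_reserve("txn-1", 4)
second = stock.try_reserve("txn-2", 3)  # only 1 unit unreserved: fails at once
stock.cancel("txn-1")
third = stock.try_reserve("txn-2", 3)   # succeeds after the release
```

The second transaction gets an immediate answer rather than waiting on a lock, so it can decide for itself whether to retry later or pick an alternative.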

2PC also assumes every participant speaks a common commit protocol (typically XA) under one transaction manager. TCC works across heterogeneous systems because each service implements its own Try/Confirm/Cancel logic. Payment service, inventory service, shipping service, all different stacks, all coordinatable.

For a deeper look at 2PC and its limitations, see Two-Phase Commit.

TCC vs Basic Saga

Saga and TCC both avoid blocking, but failure handling differs. In a basic saga, each step has a compensation that undoes what the step did. If Step 3 fails, run compensation for Step 2, then Step 1. The compensation logic must know how to reverse each step’s effects.

TCC takes a different angle. Confirm and Cancel are explicit and symmetrical. Confirm commits the tentative reservation. Cancel releases it. The complexity shifts from writing reverse logic to implementing a reservation system that tracks pending operations.

TCC fits naturally when you can model operations as reservations. Hotel booking, inventory allocation, credit holds, seat reservations. These have a clear notion of “tentatively take this” and “make it official or release it.”

Saga fits better when operations are transformations rather than reservations. Moving money from account A to account B, transforming an order into an invoice. These lack natural reservation semantics and saga works fine there.

For more on saga, see Saga Pattern.

TCC vs 2PC vs Saga Comparison

Aspect	2PC	Saga	TCC
Blocking	Yes - participants block during commit	No - no blocking	No - no blocking
Locking	Locks resources during prepare	No locks	Reservations, not locks
Atomicity	True atomic commit	Eventual atomicity	Eventual atomicity
Isolation	Provided by participant locks, not by 2PC itself	No isolation	No isolation
Coordination	Centralized coordinator	Centralized or choreographed	Centralized coordinator
Heterogeneous Systems	Requires shared TM	Yes	Yes
Compensation Model	Automatic rollback	Explicit compensations	Explicit Confirm/Cancel
Failure Handling	Blocking on coordinator crash	Compensations in reverse	Confirm/Cancel with retries
Latency	Two round trips minimum	Per-step latency	Two round trips minimum
Use Case Fit	Strong consistency required	Transformation operations	Reservation operations
Recovery Complexity	High (blocking states)	Medium (compensation chain)	Medium (tentative state cleanup)
Implementation Complexity	Medium (DB-supported)	Low-Medium	Medium-High (reservation design)

Implementing TCC in Practice

TCC requires a coordinator and participant implementations. Many frameworks handle the coordinator role. You focus on implementing Try, Confirm, and Cancel methods on your services.

A Flight Booking Example

Consider a flight booking system that coordinates an airline reservation, a hotel booking, and a car rental. All three must succeed or all three must be cancelled.

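from dataclasses import dataclass


@dataclass
class Reservation:
    flight_id: str
    passenger_id: str
    seats: int
    status: str = "TENTATIVE"


class FlightBooking:
    def __init__(self, reservations):
        # `reservations` is any repository exposing save(reservation) and
        # find(flight_id, passenger_id), which returns None when nothing matches
        self.reservations = reservations

    def try_reserve(self, flight_id, passenger_id, seats):
        # Tentatively hold seats. A real service would check availability
        # here and return "TryFailed" when the seats cannot be held.
        reservation = Reservation(
            flight_id=flight_id,
            passenger_id=passenger_id,
            seats=seats,
            status="TENTATIVE"
        )
        self.reservations.save(reservation)
        return "TryConfirmed"

    def confirm(self, flight_id, passenger_id):
        # Make the tentative reservation permanent
        reservation = self.reservations.find(flight_id, passenger_id)
        reservation.status = "CONFIRMED"
        self.reservations.save(reservation)
        return "Confirmed"

    def cancel(self, flight_id, passenger_id):
        # Release the tentative hold
        reservation = self.reservations.find(flight_id, passenger_id)
        if reservation is None:
            # Try never created a hold here, so there is nothing to release
            return "Cancelled"
        reservation.status = "CANCELLED"
        self.reservations.save(reservation)
        return "Cancelled"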

The coordinator orchestrates the three-phase flow:

class BookingCoordinator:
    def __init__(self, flight, hotel, car):
        self.flight = flight
        self.hotel = hotel
        self.car = car

    def book_trip(self, flight_id, hotel_id, car_id, passenger):
        # Try phase
        results = []
        results.append(self.flight.try_reserve(flight_id, passenger, 1))
        results.append(self.hotel.try_reserve(hotel_id, passenger, 1))
        results.append(self.car.try_reserve(car_id, passenger, 1))

        if all(r == "TryConfirmed" for r in results):
            # Confirm phase
            self.flight.confirm(flight_id, passenger)
            self.hotel.confirm(hotel_id, passenger)
            self.car.confirm(car_id, passenger)
        else:
            # Cancel phase
            self.flight.cancel(flight_id, passenger)
            self.hotel.cancel(hotel_id, passenger)
            self.car.cancel(car_id, passenger)

This example omits error handling, timeouts, and duplicate detection. A production implementation needs retry logic, idempotency keys, and timeout handlers for when participants fail to respond.
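
One way to close some of those gaps is sketched below. It treats an exception during Try as a failed Try, stops reserving after the first failure, and cancels every participant. `StubParticipant` is a hypothetical stand-in used only to make the sketch runnable; retries and idempotency keys are still left out.

```python
class StubParticipant:
    """Hypothetical stand-in for a remote service, used only for this demo."""

    def __init__(self, try_outcome="TryConfirmed"):
        self.try_outcome = try_outcome
        self.calls = []

    def try_reserve(self, resource_id, passenger, quantity):
        self.calls.append("try")
        if self.try_outcome == "raise":
            raise ConnectionError("participant unreachable")
        return self.try_outcome

    def confirm(self, resource_id, passenger):
        self.calls.append("confirm")
        return "Confirmed"

    def cancel(self, resource_id, passenger):
        self.calls.append("cancel")
        return "Cancelled"


def book_trip(steps, passenger):
    """steps is a list of (participant, resource_id) pairs.

    Returns True if every participant confirmed, False if the trip was cancelled.
    """
    all_reserved = True
    for participant, resource_id in steps:
        try:
            outcome = participant.try_reserve(resource_id, passenger, 1)
        except Exception:
            # Outcome unknown: the hold may or may not exist, so treat as failure.
            outcome = "TryFailed"
        if outcome != "TryConfirmed":
            all_reserved = False
            break  # no point reserving further resources

    if all_reserved:
        for participant, resource_id in steps:
            participant.confirm(resource_id, passenger)
        return True

    # Cancel everyone, including participants whose Try was never sent or whose
    # outcome is unknown. Cancel must therefore tolerate a missing reservation.
    for participant, resource_id in steps:
        participant.cancel(resource_id, passenger)
    return False


flight, hotel, car = StubParticipant(), StubParticipant("raise"), StubParticipant()
booked = book_trip([(flight, "F1"), (hotel, "H1"), (car, "C1")], "alice")
```

Here the hotel call raises, so the car is never tried, and all three participants receive Cancel.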

Handling Failures and Timeouts

TCC assumes participants will eventually respond to Try, Confirm, or Cancel calls. When a participant becomes unresponsive, the coordinator must decide what to do. This is where TCC implementations diverge.

Some frameworks use guaranteed delivery. They store the intended action in a log and retry until the participant acknowledges. Others use a maximum retry count and then flag the transaction as requiring manual intervention.

The tricky case is when Try succeeded but Confirm failed. The participant reserved the resource but never received the confirmation. From the participant’s perspective, it has a tentative reservation waiting to be confirmed or cancelled. The coordinator’s retry of Confirm should eventually clear this state. But if the coordinator crashed entirely, you need a recovery process that queries participants about their pending states.

graph TD
    A[Coordinator calls Confirm] --> B{Participant reachable?}
    B -->|Yes| C[Confirm succeeds]
    B -->|No| D[Store in retry queue]
    D --> E[Retry with backoff]
    E --> F{Participant responds?}
    F -->|Yes| C
    F -->|No| G[Max retries exceeded]
    G --> H[Flag for manual review]
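
The retry branch of this diagram could look roughly like the sketch below. The function name, the default limits, and the choice of `ConnectionError` as the "unreachable" signal are all assumptions made for illustration. The `sleep` parameter is injectable so the demo runs without waiting.

```python
import time


def confirm_with_retry(confirm, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call confirm() until it succeeds or the attempts run out.

    Returns "confirmed" on success and "manual_review" once retries are exhausted.
    """
    for attempt in range(max_attempts):
        try:
            confirm()
            return "confirmed"
        except ConnectionError:
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay, then 2x, 4x, ...
                sleep(base_delay * (2 ** attempt))
    return "manual_review"


attempts = {"count": 0}


def flaky_confirm():
    # Fails twice, then succeeds, to exercise the retry path.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("participant unreachable")


delays = []
outcome = confirm_with_retry(flaky_confirm, sleep=delays.append)
```

Because Confirm may run more than once on this path, the retry loop is only safe when the participant's Confirm is idempotent.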

Complete TCC Flow Diagram

flowchart TD
    Start[TCC Transaction] --> TryPhase[Coordinator sends<br/>Try to all participants]
    TryPhase --> TryResults{All Try succeed?}
    TryResults -->|No| CancelPhase[Coordinator sends<br/>Cancel to all participants]
    CancelPhase --> CancelDone[Resources released<br/>Transaction aborted]
    TryResults -->|Yes| ConfirmPhase[Coordinator sends<br/>Confirm to all participants]
    ConfirmPhase --> ConfirmResults{All Confirm succeed?}
    ConfirmResults -->|No| ConfirmRetry[Retry with backoff]
    ConfirmRetry --> ConfirmResults
    ConfirmResults -->|Yes| Success[Transaction committed<br/>All reservations finalized]
    TryPhase --> Timeout{Participant times out?}
    Timeout -->|Yes| CancelPhase
    Timeout -->|No| TryResults

Three Main Scenarios:

Scenario	Trigger	Coordinator Action	Outcome
Try -> Confirm success	All participants respond TryConfirmed	Send Confirm to all	All reservations become permanent
Try -> Cancel	Any participant responds TryFailed	Send Cancel to all	All tentative reservations released
Try timeout -> Cancel	Participant times out on Try	Send Cancel to all	All tentative reservations released

State Transitions for a Single Participant:

stateDiagram-v2
    [*] --> Idle: Transaction starts
    Idle --> Tentative: Try succeeds
    Tentative --> Confirmed: Confirm received
    Tentative --> Cancelled: Cancel received
    Tentative --> Tentative: Try timeout, waiting for Cancel
    Confirmed --> [*]
    Cancelled --> [*]
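
The same transitions can be encoded as a lookup table so that illegal moves are rejected in one place. This is a sketch with hypothetical names (`ALLOWED`, `next_state`). It adds duplicate deliveries as self-transitions and follows this article's later rule that a Cancel arriving after Confirm is a no-op.

```python
ALLOWED = {
    ("IDLE", "try"): "TENTATIVE",
    ("TENTATIVE", "confirm"): "CONFIRMED",
    ("TENTATIVE", "cancel"): "CANCELLED",
    # Duplicate deliveries leave the state unchanged.
    ("TENTATIVE", "try"): "TENTATIVE",
    ("CONFIRMED", "confirm"): "CONFIRMED",
    ("CANCELLED", "cancel"): "CANCELLED",
    # A late Cancel must not undo a confirmed reservation.
    ("CONFIRMED", "cancel"): "CONFIRMED",
}


def next_state(state, event):
    """Return the state after `event`, or raise ValueError if it is not allowed."""
    try:
        return ALLOWED[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} while {state}") from None


state = "IDLE"
for event in ["try", "try", "confirm", "confirm", "cancel"]:
    state = next_state(state, event)
```

Anything not in the table, such as Confirm on a cancelled reservation, raises instead of silently changing state.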

TCC Frameworks

Building TCC from scratch means managing coordinator state, retry logic, timeout handling, and recovery — all nontrivial. Several frameworks handle the heavy lifting.

Apache TCM (Transaction Coordinator Manager)

Apache TCM is the reference implementation for J2EE-style TCC. It integrates with application servers and provides declarative transaction boundaries. Best for Java/J2EE shops already invested in that ecosystem.

Narayana (JBossTS)

Narayana is an open-source transaction manager that supports 2PC (including the Last Resource Commit Optimization, LRCO) and compensation-based transactions in the TCC style. It provides both programmatic and declarative (annotation-based) approaches, and it works with Spring via Narayana’s Spring integration.

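// Simplified illustration: the annotation and attribute names shown here
// may not match Narayana's API exactly, so check its documentation.
@Compensable(compensationMethod = "cancelReservation")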
public void tryReserveSeats(ReservationRequest request) {
    // Tentatively reserve seats
    reservationService.createTentativeReservation(request);
}

public void cancelReservation(ReservationRequest request) {
    // Release the tentative hold
    reservationService.cancelReservation(request.getReservationId());
}

public void confirmReservation(ReservationRequest request) {
    // Finalize the reservation
    reservationService.confirmReservation(request.getReservationId());
}

ByteTCC

ByteTCC is a TCC implementation for Spring applications. It uses annotations to define Try/Confirm/Cancel methods and handles coordinator logic transparently. Lightweight and Spring-native, good for microservices running in Spring Boot.

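// Simplified illustration: the attribute names shown here may differ from
// ByteTCC's actual @Compensable annotation, so check its documentation.
@Compensable(confirmMethod = "confirm", cancelMethod = "cancel")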
public boolean tryReserveInventory(InventoryRequest request) {
    // Try logic: check availability, create tentative hold
    return inventoryService.tentativeHold(request.getItemId(), request.getQuantity());
}

public void confirm(InventoryRequest request) {
    inventoryService.confirmHold(request.getItemId(), request.getQuantity());
}

public void cancel(InventoryRequest request) {
    inventoryService.releaseHold(request.getItemId(), request.getQuantity());
}

Spring TCC (Spring-Cloud-tencent)

Spring TCC is part of the Tencent Spring Cloud stack. Integrates with Service Comb and provides distributed TCC transaction support for Spring Cloud microservices.

Framework Comparison

Framework	Language	Coordinator	Spring Integration	Recovery Support	Best For
Apache TCM	Java	Embedded	Yes (J2EE)	Yes	Enterprise Java apps
Narayana	Java/C	Both	Yes	Yes	JBoss/Spring ecosystem
ByteTCC	Java	External	Yes (Spring Boot)	Yes	Lightweight Spring microservices
Spring TCC	Java	External	Yes	Yes	Tencent/Spring Cloud stack

For most new projects, ByteTCC or Narayana are the practical choices. ByteTCC is simpler and more Spring Boot-friendly. Narayana has more enterprise features and a longer track record.

Confirm/Cancel Idempotency Implementation

Idempotency is not optional in TCC — it is load-bearing. The coordinator retries Confirm and Cancel calls until it gets a response. Your participant must handle duplicates gracefully.

The Idempotency Problem

Consider this scenario:

Coordinator calls confirm(reservation_id="abc")
Participant confirms successfully but the network drops before the response arrives
Coordinator retries confirm(reservation_id="abc")
If your confirm handler is not idempotent, the retry applies the confirmation a second time, which can double-apply side effects such as charging a card or decrementing stock twice

Idempotency Key Design

Use a dedicated idempotency key for each Try/Confirm/Cancel call. The key should be deterministic — the same operation always gets the same key.

import hashlib
from datetime import datetime

def make_idempotency_key(transaction_id, participant_id, phase):
    """Generate a deterministic idempotency key.

    Same transaction + participant + phase always produces same key.
    """
    raw = f"{transaction_id}:{participant_id}:{phase}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]

class TccParticipant:
    def confirm(self, transaction_id, participant_id, reservation_data):
        key = make_idempotency_key(transaction_id, participant_id, "confirm")

        # Idempotency check
        existing = self.confirm_log.find_by_idempotency_key(key)
        if existing:
            # Already confirmed — return success without re-confirming
            return ConfirmResult(
                success=True,
                already_confirmed=True,
                confirmed_at=existing.confirmed_at
            )

        # Actual confirmation logic
        reservation = self.reservations.find(reservation_data.id)
        reservation.status = "CONFIRMED"
        self.reservations.save(reservation)

        # Record this confirmation for future idempotency
        self.confirm_log.save(IdempotencyRecord(
            key=key,
            transaction_id=transaction_id,
            confirmed_at=datetime.utcnow()
        ))

        return ConfirmResult(success=True, already_confirmed=False)

    def cancel(self, transaction_id, participant_id, reservation_data):
        key = make_idempotency_key(transaction_id, participant_id, "cancel")

        existing = self.cancel_log.find_by_idempotency_key(key)
        if existing:
            return CancelResult(
                success=True,
                already_cancelled=True,
                cancelled_at=existing.cancelled_at
            )

        reservation = self.reservations.find(reservation_data.id)
        reservation.status = "CANCELLED"
        self.reservations.save(reservation)

        self.cancel_log.save(IdempotencyRecord(
            key=key,
            transaction_id=transaction_id,
            cancelled_at=datetime.utcnow()
        ))

        return CancelResult(success=True, already_cancelled=False)

Confirm Before Cancel Problem

A subtler ordering problem: what if Confirm has already succeeded (for example, the first attempt timed out and the retry went through) and a delayed or retried Cancel for the same transaction arrives afterwards? A naive Cancel would release a reservation that is already confirmed.

Track state transitions explicitly. Confirm transitions from TENTATIVE to CONFIRMED. Cancel transitions from TENTATIVE to CANCELLED. Once in CONFIRMED, Cancel should be a no-op, not a failure.

def cancel(self, transaction_id, participant_id, reservation_data):
    # Assumes find() returns None when no reservation exists
    reservation = self.reservations.find(reservation_data.id)

    if reservation is None:
        # Try never created a reservation here, so there is nothing to release
        return CancelResult(success=True, reason="no_reservation")

    if reservation.status == "CONFIRMED":
        # Already confirmed: Cancel is correctly a no-op
        return CancelResult(success=True, reason="already_confirmed")

    if reservation.status == "CANCELLED":
        # Already cancelled: still a no-op
        return CancelResult(success=True, reason="already_cancelled")

    # Actual cancellation from TENTATIVE state
    reservation.status = "CANCELLED"
    self.reservations.save(reservation)
    return CancelResult(success=True)

Timeout vs Permanent Failure

TCC distinguishes between transient failures (retry might succeed) and permanent failures (never going to succeed). In your Try handler:

Transient failure: Return a retryable error, coordinator retries
Permanent failure: Return TryFailed with a reason that means “do not retry, cancel everything”

def try_reserve(self, inventory_id, quantity):
    try:
        # Try logic
        reserved = self.inventory.tentative_hold(inventory_id, quantity)
        return TryResult(success=True, reservation_id=reserved.id)
    except InsufficientInventory:
        # Permanent failure: retrying will not help
        return TryResult(success=False, reason="INSUFFICIENT_INVENTORY")
    except TemporaryDatabaseError:
        # Transient failure — worth retrying
        raise TryRetryableError("Database temporarily unavailable")
    except CapacityExceeded:
        # Permanent failure — no amount of retry will fix this
        return TryResult(success=False, reason="CAPACITY_EXCEEDED")
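
On the coordinator side, the two failure kinds can be handled as in the sketch below. `TryResult` and `TryRetryableError` are defined here as hypothetical minimal versions of the names used above, and `call_try` is an assumed helper, not part of any framework. Backoff between attempts is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


class TryRetryableError(Exception):
    """Raised by a participant when a retry might succeed."""


@dataclass
class TryResult:
    success: bool
    reservation_id: Optional[str] = None
    reason: Optional[str] = None


def call_try(try_fn, max_attempts=3):
    """Retry transient failures; pass permanent outcomes straight through."""
    for _ in range(max_attempts):
        try:
            return try_fn()  # success or permanent failure: stop here
        except TryRetryableError:
            continue         # transient: try again
    # Still failing after all attempts: give up so the coordinator can cancel.
    return TryResult(success=False, reason="RETRIES_EXHAUSTED")


calls = {"n": 0}


def flaky_try():
    # Fails transiently twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TryRetryableError("Database temporarily unavailable")
    return TryResult(success=True, reservation_id="r-1")


result = call_try(flaky_try)
rejected = call_try(lambda: TryResult(success=False, reason="INSUFFICIENT_INVENTORY"))
```

A permanent failure comes back on the first call and is never retried, which lets the coordinator move to Cancel immediately.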

Advantages of TCC

The main advantage is that resources do not lock during the transaction. Other operations can read or modify the same data, aware of pending reservations but not blocked by them. This makes TCC more scalable than 2PC, particularly under high contention.

The three-phase structure is explicit. Every participant agrees to the contract: if you can reserve in Try, you guarantee you can confirm or cancel later.

TCC also works across heterogeneous systems. No shared transaction manager required. Each service implements its own semantics for Try, Confirm, and Cancel.

Limitations and Challenges

TCC is not a silver bullet.

The biggest challenge is designing Try/Confirm/Cancel for your specific domain. Not all operations map naturally to reservation semantics. Forcing a square peg into a round hole produces brittle implementations.

Idempotency trips people up. Confirm and Cancel must handle duplicate calls gracefully. If the coordinator retries a Confirm that actually succeeded, the participant needs to recognize this and return Confirmed, not try to confirm again.

The timeout case requires care. Try succeeds but the coordinator crashes before sending Confirm or Cancel. Resources sit in a tentative state. Without a resolution mechanism, you get resource leaks that pile up silently.
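
One possible resolution mechanism is a recovery pass over a durable coordinator log, sketched below. It assumes the coordinator writes its state before each phase, that the log is a simple mapping, and that participants expose idempotent `confirm(txn_id)` and `cancel(txn_id)` methods. All names here are hypothetical.

```python
def recover(log, participants):
    """Finish transactions that a crashed coordinator left open.

    log maps txn_id -> last recorded state: "TRYING", "CONFIRMING",
    "CANCELLING" or "DONE". participants maps name -> object with
    confirm(txn_id) and cancel(txn_id), both assumed idempotent.
    Returns a dict of txn_id -> action taken.
    """
    actions = {}
    for txn_id, state in list(log.items()):
        if state == "DONE":
            continue
        # A recorded decision to confirm must be honoured. Anything else is
        # safe to cancel because no participant has been told to confirm.
        action = "confirm" if state == "CONFIRMING" else "cancel"
        for participant in participants.values():
            getattr(participant, action)(txn_id)
        log[txn_id] = "DONE"
        actions[txn_id] = action
    return actions


class Recorder:
    """Hypothetical participant that just records the calls it receives."""

    def __init__(self):
        self.calls = []

    def confirm(self, txn_id):
        self.calls.append(("confirm", txn_id))

    def cancel(self, txn_id):
        self.calls.append(("cancel", txn_id))


log = {"t1": "CONFIRMING", "t2": "TRYING", "t3": "DONE"}
inventory = Recorder()
actions = recover(log, {"inventory": inventory})
```

The key design choice is writing the CONFIRMING decision to the log before the first Confirm is sent, so that recovery never cancels a transaction some participant may already have confirmed.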

Latency also increases. Every transaction needs at least two round trips to each participant.

When to Use TCC

TCC fits well when your business logic naturally supports reservation semantics. Inventory allocation, booking systems, credit reservations, seat holds. If you can model the operation as “tentatively take X and later either commit or release,” TCC gives you a clean structure.

TCC gets awkward when operations are transformations rather than reservations. If Step 2 depends on the output of Step 1 in a way that does not fit reservation semantics, you end up stuffing intermediate state into Try and carrying it forward to Confirm. This works but loses the elegance.

For an overview of distributed transaction patterns including TCC, see Distributed Transactions. For reliable message delivery in distributed systems, see the Outbox Pattern.

Observability Checklist

TCC transactions span multiple services and involve multiple round trips. Without observability, you cannot tell whether a failed transaction left dangling tentative reservations.

Metrics

TCC transaction completion rate (success vs try-fail vs confirm-fail vs cancel)
Try phase duration and success rate
Confirm phase duration and retry count
Cancel phase duration and how often it runs
Average number of participants per transaction
Timeout rate per phase (try timeout, confirm timeout)
Dangling tentative reservation count (reservations stuck in TENTATIVE state)

Logs

Log Try phase start with transaction ID, participant ID, and reservation data
Log Try phase outcome (confirmed, failed, timeout)
Log Confirm/Cancel phase starts and outcomes
Log retry attempts with attempt number and delay
Include idempotency key in all phase logs for correlation
Log participant state transitions (TENTATIVE → CONFIRMED, TENTATIVE → CANCELLED)

Alerts

Alert when dangling TENTATIVE reservations accumulate (cleanup is failing)
Alert when confirm retry count exceeds threshold
Alert when cancel phase runs frequently (indicates try phase instability)
Alert when participant times out repeatedly on try phase
Alert when transaction takes longer than expected threshold

Tracing

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

class TccTransaction:
    def execute(self):
        with tracer.start_as_current_span("tcc.transaction") as span:
            span.set_attribute("tcc.transaction_id", self.txn_id)
            span.set_attribute("tcc.participant_count", len(self.participants))

            # Try phase
            try_results = []
            with tracer.start_as_current_span("tcc.try_phase") as try_span:
                for participant in self.participants:
                    with tracer.start_as_current_span(f"tcc.try.{participant.name}") as p_span:
                        p_span.set_attribute("participant.name", participant.name)
                        result = participant.try_(self.request)
                        try_results.append(result)
                        # Span attributes must be primitive values, so record the flag
                        p_span.set_attribute("tcc.try_result", result.success)

            if all(r.success for r in try_results):
                # Confirm phase
                with tracer.start_as_current_span("tcc.confirm_phase") as confirm_span:
                    for participant in self.participants:
                        with tracer.start_as_current_span(f"tcc.confirm.{participant.name}") as p_span:
                            result = participant.confirm()
                            # Span attributes must be primitive values
                            p_span.set_attribute("tcc.confirm_result", str(result))
            else:
                # Cancel phase
                with tracer.start_as_current_span("tcc.cancel_phase") as cancel_span:
                    for participant in self.participants:
                        with tracer.start_as_current_span(f"tcc.cancel.{participant.name}") as p_span:
                            result = participant.cancel()
                            # Span attributes must be primitive values
                            p_span.set_attribute("tcc.cancel_result", str(result))

Security Checklist

TCC coordination involves multiple services making state changes. Security misconfigurations can lead to unauthorized reservations or data leakage.

Authenticate the coordinator-to-participant RPC calls (mutual TLS or JWT tokens)
Authorize participants — coordinator should only call Confirm/Cancel on registered participants
Validate reservation data in Try phase — do not trust coordinator-supplied quantities or IDs without validation
Audit log all state transitions on tentative reservations (created, confirmed, cancelled)
Encrypt coordinator-to-participant communication in transit
Do not expose internal transaction IDs in error responses (use correlation IDs instead)
Rate-limit Try requests per participant to prevent reservation exhaustion attacks
Set TTL on tentative reservations so abandoned transactions auto-expire

Reservation Exhaustion Attack

A subtle TCC security concern: an attacker triggers many Try operations that succeed but never Confirm or Cancel. If tentative reservations hold inventory, the attacker can exhaust available inventory without paying.

Mitigations:

TTL on tentative reservations: Auto-cancel after timeout
Per-entity locking: Lock the reservation entity itself, not just the inventory
Rate limiting Try: Limit how many Try requests a single client can make
Verification on Confirm: Check the original request is still valid before confirming
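
The TTL mitigation can be sketched as a periodic sweep over stored holds. The `Hold` record, the function name, and the 15-minute default are assumptions chosen for illustration. A real service would run this against its reservation store on a schedule.

```python
from dataclasses import dataclass


@dataclass
class Hold:
    reservation_id: str
    status: str        # "TENTATIVE", "CONFIRMED" or "CANCELLED"
    created_at: float  # seconds since the epoch


def expire_stale_holds(holds, now, ttl_seconds=900):
    """Cancel tentative holds older than ttl_seconds and return their IDs."""
    expired = []
    for hold in holds:
        if hold.status == "TENTATIVE" and now - hold.created_at > ttl_seconds:
            hold.status = "CANCELLED"
            expired.append(hold.reservation_id)
    return expired


holds = [
    Hold("r1", "TENTATIVE", created_at=0.0),    # older than the TTL
    Hold("r2", "TENTATIVE", created_at=800.0),  # still within the TTL
    Hold("r3", "CONFIRMED", created_at=0.0),    # never touched by the sweep
]
expired = expire_stale_holds(holds, now=1000.0)
```

The TTL should comfortably exceed the coordinator's retry window. Otherwise a legitimate Confirm can arrive after the hold has expired, and the participant has to reject it.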

Conclusion

TCC gives you a structured way to coordinate distributed transactions without blocking. The three-phase model makes the contract explicit: reserve, commit, release. When your domain fits the reservation pattern, you get clean separation of concerns and better scalability than 2PC.

The trade-offs are real. Idempotent operations, timeout handling, recovery logic for dangling reservations. For high-contention scenarios with natural reservation semantics, TCC is worth the implementation effort. For simpler saga flows or operations that do not fit reservation semantics, basic saga or choreography may be the better choice.

See also Event-Driven Architecture for patterns that complement TCC in microservices ecosystems.
