Synchronous Communication: REST, gRPC, and When to Use Each

Explore synchronous communication patterns in microservices including REST APIs, gRPC, when to use each protocol, and their trade-offs.


Microservices do not automatically solve your problems. They just move the difficulty around. One of the first design decisions you face is how services communicate. Synchronous communication is the most straightforward approach: a client sends a request, waits for the reply, then continues. Here I will explore the two main protocols for synchronous communication, where each one makes sense, and how to avoid turning a simple request into a cascade failure.

What is Synchronous Communication in Microservices?

The idea is straightforward. Service A calls Service B. Service A stops and does nothing until Service B responds. Only then does Service A resume.

This is the request-response model that software has used for decades before anyone used the word “microservice.” The appeal is predictability. You know immediately whether an operation succeeded or failed. Your business logic stays linear and easy to follow.

The problem is coupling. Service A depends on Service B being available and responsive. If Service B slows down, Service A waits. If Service B fails, Service A fails. This tight coupling is why synchronous systems fail so spectacularly when things go wrong.

Synchronous communication works well when operations are fast, services are reliable, and you need immediate consistency. The moment any of those assumptions break, you start dealing with timeouts, retries, and cascading errors.

REST over HTTP: When to Use It

REST is nearly everywhere. HTTP is not going anywhere, and every tool, language, and framework speaks it. You can test a REST endpoint with curl in your terminal. The format is human-readable JSON. No code generation required.

This ubiquity is REST’s main advantage. For public APIs consumed by third-party developers, it is the obvious choice. An external team can integrate without installing special tooling or learning a new protocol. The documentation writes itself because REST endpoints map naturally to resource-based URL structures.

Browser-based clients work naturally with REST. Browsers understand HTTP methods and status codes without help; you do not need a proxy layer or special configuration. That is not true of gRPC, which matters if you ever need to call services from browser JavaScript directly.

REST is also the better choice when schema flexibility matters. JSON accepts extra fields without breaking. You can evolve your API gradually without forcing all clients to update simultaneously. This matters in organizations where coordinating contract changes across teams takes time.

The tradeoff is that REST offers no compile-time checking. Rename a JSON field and you will not know something broke until runtime—probably in production, when a client sends the old field name and your code ignores it silently.
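A tiny Python sketch of this failure mode (the field names are illustrative):

```python
# A client still sending the old field name after a server-side rename.
old_request = {"customer_id": 42}     # what the client sends

# The server now reads "customerId"; the lookup quietly returns None,
# so nothing fails loudly until the missing value causes damage later.
value = old_request.get("customerId")
print(value)  # None
```

A schema-enforced contract would have turned this into a build failure instead of a silent `None`.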

gRPC: When to Use It

Google built gRPC because REST left certain problems unsolved. It uses HTTP/2 for transport and Protocol Buffers for serialization. The combination handles higher throughput than JSON over HTTP/1.1 and enables patterns REST cannot support.

For internal services you control end-to-end, gRPC shines. You define your API contract once in a Protocol Buffer file. Code generation produces client libraries for every language your services use. When you change a field name, every consumer fails at compile time rather than runtime. Your CI pipeline catches breaking changes before they reach production.

Protocol Buffers produce smaller messages than JSON. HTTP/2 multiplexes multiple requests over a single connection, avoiding the head-of-line blocking that HTTP/1.1 suffers from. For high-throughput services handling millions of requests per second, these optimizations add up.

Bi-directional streaming is gRPC’s most powerful feature. A client can send a stream of requests while receiving a stream of responses. This suits real-time data pipelines, collaborative editing, and monitoring systems that push continuous updates. REST has no equivalent without workarounds like WebSockets or server-sent events.

The catch is browser support. You cannot call gRPC from browser JavaScript directly. You need a proxy like grpc-web that translates between the browser’s limited HTTP semantics and gRPC’s full capabilities. This is a real constraint if browser clients are part of your architecture.

Comparing REST and gRPC

The two approaches make different trade-offs. Here is how they compare on the factors that matter:

| Aspect | REST | gRPC |
| --- | --- | --- |
| Serialization | JSON (human-readable) | Protocol Buffers (binary, compact) |
| Transport | HTTP/1.1 or HTTP/2 | HTTP/2 only |
| Schema Enforcement | None | Strong (generated code) |
| Browser Support | Native | Limited (needs grpc-web proxy) |
| Streaming | Server-sent events, polling | Native bi-directional streaming |
| Tooling | Universal | Requires code generation |
| Debugging | Plain text in logs | Binary needs decoders |
| Contract Evolution | Flexible but risky | Versioned schemas |

REST wins on debugging. You can paste a request into your terminal and read the response directly. gRPC payloads are binary. You need tooling to decode them. Early in development when you are iterating quickly, this matters.

gRPC wins on safety. Schema enforcement catches entire classes of bugs that JSON allows. When your CI pipeline fails because someone renamed a Protocol Buffer field, that is a feature. Runtime surprises are harder to debug than compile-time failures.

Browser clients tip the balance toward REST. If you need to call services from web browsers, gRPC requires additional infrastructure. Plan for this constraint early.

When to Use / When Not to Use Synchronous Communication

Trade-off Table

| Scenario | Use Synchronous | Use Asynchronous Instead |
| --- | --- | --- |
| Need immediate consistency | REST or gRPC | Message queues, events |
| Operations complete in < 100 ms | REST or gRPC | Consider async overhead |
| Long-running operations (seconds+) | Avoid sync | Webhooks, callbacks, polling |
| Multiple services in a chain | Add timeouts, circuit breakers | Decompose or use async |
| Fault isolation required | Avoid deep chains | Fire-and-forget events |
| High availability requirement | Add resilience patterns | Inherently more available |
| Cross-service transactions | Avoid, use sagas | Use saga pattern |

When to Use REST

Use REST when:

  • Building public APIs consumed by external developers
  • Browser-based clients need direct service access
  • JSON schema flexibility is needed for API evolution
  • Human readability matters for debugging
  • Rapid prototyping and iteration are priorities
  • Team lacks experience with code generation tools

Avoid REST when:

  • You need bi-directional streaming
  • Compile-time type safety is critical
  • Message size and performance are paramount
  • Internal services with shared contracts benefit from schema enforcement

When to Use gRPC

Use gRPC when:

  • Internal service-to-service communication you control
  • High throughput and low latency are requirements
  • Bi-directional streaming is needed (real-time pipelines, collaborative editing)
  • You want compile-time contract enforcement
  • Multiple languages need consistent client libraries

Avoid gRPC when:

  • Browser clients need to call services directly
  • Human debugging in transit is important
  • JSON-based legacy integration is required
  • Team lacks familiarity with Protocol Buffers

Request-Response Patterns

Synchronous calls follow patterns that determine how your services interact.

Point-to-point is the simplest case. One service calls another, waits, and continues. Fast operations that do not involve multiple services work well with this pattern.

Chained requests span multiple services in sequence. Service A calls Service B, which calls Service C. Latency accumulates across each hop. If any service slows down, the entire chain slows. If any service fails, the failure propagates back up. Deep call chains are fragile.
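The latency arithmetic of a chain is worth making explicit; the hop numbers below are illustrative:

```python
# Each hop's latency adds to the total the client observes.
hop_latency_ms = {"A->B": 20, "B->C": 35, "C->database": 50}
total_ms = sum(hop_latency_ms.values())
print(total_ms)  # 105: three individually fast hops still add up
```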

Scatter-gather fans out to multiple services simultaneously. A request goes to Service B, C, and D at the same time. The caller waits for all responses. This reduces total latency compared to chaining, but requires more infrastructure to manage the fan-out and handle partial failures.
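A minimal scatter-gather sketch using `asyncio.gather` (the service names and delays stand in for real HTTP calls):

```python
import asyncio

async def fetch(service: str, delay: float) -> str:
    # Stand-in for an HTTP call to one downstream service.
    await asyncio.sleep(delay)
    return f"{service}: ok"

async def scatter_gather() -> list:
    # Fan out concurrently; total latency tracks the slowest single call,
    # not the sum. return_exceptions captures partial failures as values
    # instead of cancelling the whole fan-out.
    return await asyncio.gather(
        fetch("service-b", 0.01),
        fetch("service-c", 0.02),
        fetch("service-d", 0.015),
        return_exceptions=True,
    )

results = asyncio.run(scatter_gather())
```

Inspecting `results` for exception instances is how the caller handles partial failure.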

Understanding these patterns helps you design APIs that match your reliability requirements. Not every operation belongs in a long chain.

Synchronous Failure Flow

Synchronous systems fail in predictable but dangerous ways. Here is what happens when Service C experiences latency.

sequenceDiagram
    participant C as Client
    participant SA as Service A
    participant SB as Service B
    participant SC as Service C
    participant DB as Database

    C->>SA: GET /order/123
    SA->>SB: Verify customer
    SB->>SC: Check credit limit
    SC-->>SB: (slow) Waiting...
    SB-->>SA: Timeout after 5s
    SA-->>C: 504 Gateway Timeout

    Note over SC: Service C recovers
    SC-->>SB: Credit OK
    SB-->>SA: Customer OK
    C->>SA: (retry) GET /order/123
    SA->>DB: Fetch order
    DB-->>SA: Order data
    SA-->>C: 200 OK

In this cascade failure, Service C slows down and causes timeouts all the way back to the client. The client eventually retries and succeeds, but only after experiencing a failure.

Circuit Breaker Failure Flow

Circuit breakers prevent cascade failures by failing fast when a downstream service is unhealthy.

stateDiagram-v2
    [*] --> Closed: Normal operation
    Closed --> Open: Failure threshold exceeded
    Open --> HalfOpen: Timeout expired
    HalfOpen --> Closed: Probe succeeds
    HalfOpen --> Open: Probe fails

    state Closed {
        [*] --> Normal
        Normal --> HighLatency: Slow responses
        HighLatency --> Normal: Latency recovers
        HighLatency --> Failing: Failure threshold
        Failing --> Normal: Recovery succeeds
    }

Circuit breakers wrap synchronous calls and monitor failure rates. When failures exceed a threshold, the circuit opens and calls fail immediately without hitting the unhealthy service. After a cooldown period, a probe call tests whether the service has recovered.

Timeouts and Retry Considerations

Networks fail. Services crash. Load spikes cause timeouts. Your synchronous code must handle these cases explicitly.

Every synchronous call needs a timeout. Without one, a slow service can block your service indefinitely. Setting timeouts requires knowing your SLAs and typical response times. Too short and you fail spuriously on requests that would have succeeded. Too long and you defeat the purpose of failing fast.

Start conservative and adjust based on production data. Monitor your p99 response times. If p99 is 200ms, a 500ms timeout gives room for spikes without waiting forever on genuine failures.
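One rough way to derive a timeout from observed latency (the sample values and the 2.5x headroom multiplier are illustrative, not a recommendation):

```python
import math

def p99(samples_ms: list) -> float:
    # Nearest-rank percentile: dependency-free and good enough for sizing.
    ordered = sorted(samples_ms)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]

latencies = [180, 150, 200, 160, 140, 170, 190]
timeout_ms = p99(latencies) * 2.5  # headroom multiplier is a judgment call
```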

Retries recover from transient failures, but they amplify problems if not handled carefully. Exponential backoff prevents overwhelming a struggling service. Circuit breakers stop retry storms when a service is genuinely down.
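Exponential backoff with full jitter can be sketched in a few lines (the base and cap values are illustrative):

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 10.0) -> float:
    # Delay doubles each attempt, up to a cap; full jitter spreads
    # concurrent clients out so they do not retry in lockstep.
    return random.uniform(0, min(cap, base * (2 ** attempt)))

delays = [backoff_delay(a) for a in range(5)]
```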

Retries are not free. They consume resources on both sides. They can turn one service’s problem into a system-wide outage. When you retry, the same request may execute multiple times. Idempotency is essential—the operation must produce the same result regardless of how many times it runs.

# Timeout and retry example
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=1, max=10))
async def call_service_with_retry(url: str) -> dict:
    async with httpx.AsyncClient(timeout=5.0) as client:
        response = await client.get(url)
        response.raise_for_status()
        return response.json()

Circuit Breaker Implementation

import time
from enum import Enum

class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = CircuitState.CLOSED

    async def call(self, func, *args, **kwargs):
        if self.state == CircuitState.OPEN:
            if time.time() - self.last_failure_time > self.recovery_timeout:
                self.state = CircuitState.HALF_OPEN
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = await func(*args, **kwargs)
            self._on_success()
            return result
        except Exception as e:
            self._on_failure()
            raise e

    def _on_success(self):
        self.failure_count = 0
        self.state = CircuitState.CLOSED

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()
        if self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN

Point-to-Point Client Implementation

import asyncio
import httpx
from typing import Optional

class ServiceClient:
    def __init__(self, base_url: str, timeout: float = 5.0, max_retries: int = 3):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self.max_retries = max_retries

    async def get(self, path: str, params: Optional[dict] = None) -> dict:
        url = f"{self.base_url}/{path.lstrip('/')}"
        async with httpx.AsyncClient(timeout=self.timeout) as client:
            for attempt in range(self.max_retries):
                try:
                    response = await client.get(url, params=params)
                    response.raise_for_status()
                    return response.json()
                except httpx.TimeoutException:
                    if attempt == self.max_retries - 1:
                        raise
                except httpx.HTTPStatusError as e:
                    # Retry only server errors; a 4xx will not improve on retry.
                    if e.response.status_code < 500 or attempt == self.max_retries - 1:
                        raise
                # Back off between attempts instead of hammering a
                # struggling service immediately.
                await asyncio.sleep(0.1 * (2 ** attempt))

    async def post(self, path: str, json: Optional[dict] = None) -> dict:
        url = f"{self.base_url}/{path.lstrip('/')}"
        async with httpx.AsyncClient(timeout=self.timeout) as client:
            response = await client.post(url, json=json)
            response.raise_for_status()
            return response.json()

When Synchronous is the Wrong Choice

Synchronous communication couples services by availability and latency. That coupling has costs.

High latency operations do not fit synchronous patterns. If an operation takes seconds to complete, blocking the caller is impractical. A user staring at a spinner for thirty seconds is a bad experience. Asynchronous patterns—initiate the operation, poll for completion, or receive a callback—work better for long-running operations.

Cascade failures spread when one service failure propagates to others. Service A calls Service B, which calls Service C. If Service C slows down, Service B waits. Service A times out. Users see errors across the system even though only one service has a problem. Without circuit breakers and bulkheads, synchronous systems amplify failures.

Distributed transactions across multiple services are notoriously difficult with synchronous calls. When an operation spans services and must be atomic, synchronous rollback is messy. Asynchronous saga patterns handle this better, though they introduce their own complexity.

Loose coupling sometimes matters more than simplicity. If services need to evolve independently, adding a message broker decouples release cycles. Service A does not need to know when Service B deploys a new version. Asynchronous events let services communicate without direct knowledge of each other.

Evaluate these factors before defaulting to synchronous communication. The simplicity of request-response has hidden costs in the right scenarios.

Synchronous Request Flow

Here is what a synchronous request looks like in practice.

sequenceDiagram
    participant C as Client
    participant G as API Gateway
    participant S1 as Service A
    participant S2 as Service B
    participant DB as Database

    C->>G: HTTP Request
    G->>S1: Forward Request
    S1->>DB: Query Data
    DB-->>S1: Return Results
    S1->>S2: Call Service B
    S2-->>S1: Return Response
    S1-->>G: HTTP Response
    G-->>C: Return to Client

Each arrow in this diagram is a potential failure point and a source of latency. Monitoring helps identify where bottlenecks occur.

Conclusion

Synchronous communication is not obsolete. It is the right tool for problems where simplicity and immediate consistency matter more than loose coupling. REST remains the standard for external APIs and browser-facing services. gRPC delivers performance and type safety for internal service communication where you control both ends.

Most organizations end up using both: REST for external-facing APIs, gRPC for internal service-to-service calls. The key is matching the protocol to the constraints of each interaction.

Build resilience into synchronous systems from the start. Timeouts, retries, and circuit breakers are not optional add-ons. Without them, small failures become large outages.

Observability Hooks

Synchronous systems require explicit observability instrumentation. Unlike async systems where failures are queued, sync failures are immediate and visible.

Request Correlation

Every synchronous request should carry a correlation ID through the call chain.

import httpx
import uuid
from contextvars import ContextVar

correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")

async def correlated_get(url: str, headers: dict = None) -> httpx.Response:
    cid = correlation_id.get()
    if not cid:
        cid = str(uuid.uuid4())
        correlation_id.set(cid)

    request_headers = {**(headers or {}), "X-Correlation-ID": cid}
    async with httpx.AsyncClient() as client:
        return await client.get(url, headers=request_headers)

Key Metrics to Track

| Metric | Purpose | Alert Threshold |
| --- | --- | --- |
| Request latency p50/p95/p99 | Baseline performance | p99 > SLA |
| Error rate by endpoint | Service health | > 1% for 5 min |
| Timeout rate | Downstream health | > 10% |
| Circuit breaker state | Resilience activation | OPEN state |
| Retry rate | Transient failures | > 20% |

Logging Structured Data

import structlog
import time
from contextvars import ContextVar

logger = structlog.get_logger()

# The correlation ID context variable from the snippet above.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")

async def logged_service_call(service: str, operation: str, func, *args, **kwargs):
    start = time.time()
    # Read into a differently named local: assigning to `correlation_id`
    # here would shadow the ContextVar and raise UnboundLocalError.
    cid = correlation_id.get()

    logger.info(
        "service_call_started",
        service=service,
        operation=operation,
        correlation_id=cid
    )

    try:
        result = await func(*args, **kwargs)
        duration = time.time() - start
        logger.info(
            "service_call_completed",
            service=service,
            operation=operation,
            duration_ms=int(duration * 1000),
            correlation_id=cid
        )
        return result
    except Exception as e:
        duration = time.time() - start
        logger.error(
            "service_call_failed",
            service=service,
            operation=operation,
            duration_ms=int(duration * 1000),
            error=str(e),
            correlation_id=cid
        )
        raise

Health Check Pattern

from fastapi import FastAPI
import httpx

app = FastAPI()

# Map of dependencies to probe; populate with your real service URLs.
downstream_services = {
    "service-b": "http://service-b:8000",
    "service-c": "http://service-c:8000",
}

@app.get("/health")
async def health_check():
    checks = {}
    healthy = True

    # Check downstream services
    for service_name, service_url in downstream_services.items():
        try:
            async with httpx.AsyncClient(timeout=2.0) as client:
                response = await client.get(f"{service_url}/health")
                response.raise_for_status()
                latency_ms = response.elapsed.total_seconds() * 1000
                checks[service_name] = {"status": "up", "latency_ms": latency_ms}
        except Exception as e:
            checks[service_name] = {"status": "down", "error": str(e)}
            healthy = False

    return {"status": "healthy" if healthy else "unhealthy", "checks": checks}

Quick Recap

graph LR
    Client -->|HTTP Request| Gateway
    Gateway -->|REST/gRPC| ServiceA
    ServiceA -->|Sync Call| ServiceB
    ServiceB -->|Response| ServiceA
    ServiceA -->|Response| Gateway
    Gateway -->|HTTP Response| Client

Key Points

  • Synchronous communication provides immediate consistency but creates tight coupling
  • REST offers human-readable debugging and universal browser support
  • gRPC provides type safety, bi-directional streaming, and compile-time contract enforcement
  • Always configure timeouts to prevent blocking on slow or failed services
  • Circuit breakers prevent cascade failures from spreading through the call chain
  • Retries amplify problems if not combined with idempotency and backoff
  • Correlation IDs enable tracing requests through the entire call chain

When to Choose Synchronous

  • Operations complete in under 100ms with predictable latency
  • You need immediate consistency between services
  • The call chain is shallow (2-3 services maximum)
  • Your team can manage resilience patterns consistently
  • Debugging simplicity outweighs loose coupling benefits

Production Checklist

# Synchronous Communication Production Readiness

- [ ] Timeouts configured for all outbound calls
- [ ] Retry logic with exponential backoff implemented
- [ ] Circuit breakers protecting downstream calls
- [ ] Correlation IDs propagated through call chains
- [ ] Health check endpoints on all services
- [ ] Structured logging with latency metrics
- [ ] Alerting configured for timeout and error rates
- [ ] Graceful degradation patterns in place
- [ ] Load shedding when downstream services are slow
- [ ] Request budgets limiting retry amplification
