Server-Side Discovery: Load Balancer-Based Service Routing

Learn how server-side discovery uses load balancers and reverse proxies to route service requests in microservices architectures.

Reading time: 27 min read | Author: GeekWorkBench

In a microservices architecture, something has to figure out where requests should go. Every service instance has its own IP address, and those addresses change as containers spin up and crash. Figuring out the current location of healthy service instances is the service discovery problem.

If you are coming from the client-side approach, check out Client-Side Discovery for the alternative pattern. Many systems also rely on a Service Registry to track where instances live.

Server-side discovery moves the routing logic out of your clients and into a centralized component. Instead of clients querying a registry and deciding which address to call, they send requests to a known endpoint. A load balancer or reverse proxy sits in the middle, knows about available instances, and handles the routing.

This approach makes clients simpler. They skip the discovery libraries, skip the caching logic, skip the reconnection handling when instances change. The infrastructure takes care of all of that.

Introduction

The flow goes like this: a client sends a request to a well-known address, the load balancer receives it, looks up which service instances are currently available, picks one based on the configured algorithm, and forwards the request there.

graph TD
    Client[Client Application] --> LB[Load Balancer]
    LB --> Registry[Service Registry]
    LB --> S1[Service Instance 1]
    LB --> S2[Service Instance 2]
    LB --> S3[Service Instance 3]
    Registry -.->|health checks| S1
    Registry -.->|health checks| S2
    Registry -.->|health checks| S3

The load balancer maintains a connection to the service registry, either pulling data periodically or receiving updates when instances register and deregister. It uses this information to keep its routing table current.
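
As a rough illustration of that routing table, the sketch below keeps a snapshot of instances and hands them out in round-robin order. The `RoutingTable` class and the `fetch_instances` hook are hypothetical names chosen for this example; a real load balancer would refresh on a timer or on registry notifications rather than by an explicit call.

```python
import itertools
from typing import Callable, List, Tuple

Instance = Tuple[str, int]  # (host, port)


class RoutingTable:
    """Keeps a snapshot of instances and hands them out round-robin."""

    def __init__(self, fetch_instances: Callable[[], List[Instance]]):
        # fetch_instances is a hypothetical hook that queries the registry.
        self._fetch = fetch_instances
        self._instances: List[Instance] = []
        self._counter = itertools.count()

    def refresh(self) -> None:
        """Replace the snapshot with the registry's current instance list."""
        self._instances = list(self._fetch())

    def pick(self) -> Instance:
        """Return the next instance in round-robin order."""
        if not self._instances:
            raise RuntimeError("no instances available")
        index = next(self._counter) % len(self._instances)
        return self._instances[index]


# Stand-in for a registry: a plain list that the lambda reads on refresh.
registry = [("10.0.1.10", 8080), ("10.0.1.11", 8080)]
table = RoutingTable(lambda: registry)
table.refresh()
picks = [table.pick() for _ in range(3)]
```

New instances only become routable after the next `refresh()`, which is why the polling interval or push latency directly bounds how stale the routing table can be.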

Health checks are critical here. The registry or the load balancer itself periodically pings instances to verify they are still responding correctly. Unhealthy instances get removed from the routing pool automatically.
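
One common way to avoid flapping is to require several consecutive failures before removing an instance and several consecutive successes before restoring it. The sketch below shows that idea under assumed thresholds; `ActiveHealthChecker`, `probe`, `unhealthy_after`, and `healthy_after` are illustrative names, not the API of any particular load balancer.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class HealthState:
    healthy: bool = True
    consecutive_failures: int = 0
    consecutive_successes: int = 0


class ActiveHealthChecker:
    """Flips an instance's status only after several consecutive probe results."""

    def __init__(self, probe: Callable[[str], bool],
                 unhealthy_after: int = 3, healthy_after: int = 2):
        self.probe = probe  # hypothetical callable: True if the instance responds
        self.unhealthy_after = unhealthy_after
        self.healthy_after = healthy_after
        self.states: Dict[str, HealthState] = {}

    def check(self, instance: str) -> bool:
        """Run one probe and return the instance's resulting health status."""
        state = self.states.setdefault(instance, HealthState())
        if self.probe(instance):
            state.consecutive_failures = 0
            state.consecutive_successes += 1
            if not state.healthy and state.consecutive_successes >= self.healthy_after:
                state.healthy = True
        else:
            state.consecutive_successes = 0
            state.consecutive_failures += 1
            if state.healthy and state.consecutive_failures >= self.unhealthy_after:
                state.healthy = False
        return state.healthy


# Scripted probe results: three failures followed by two successes.
results = iter([False, False, False, True, True])
checker = ActiveHealthChecker(probe=lambda _instance: next(results))
history = [checker.check("10.0.1.10:8080") for _ in range(5)]
# history == [True, True, False, False, True]
```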

Load Balancer as the Discovery Point

Traditional load balancers like HAProxy, NGINX, or cloud-provided options (AWS ALB, Google Cloud Load Balancing) fit the server-side discovery role naturally. You configure them with a virtual IP for a service, and they route requests to healthy backends. For more on load balancing fundamentals, see Load Balancing.

The service registry keeps track of available instances. When an instance starts, it registers itself with the registry. When it shuts down cleanly, it deregisters. If it crashes, health checks detect the failure and the registry marks it as unavailable.

Your client code does not care about any of that. It connects to the load balancer address and sends requests. The load balancer decides which backend gets the traffic.

graph LR
    A[Client] -->|1 request| B[Load Balancer]
    B -->|2 lookup| C[Service Registry]
    C -->|3 healthy instances| B
    B -->|4 forward| D[Selected Instance]

This decoupling has a practical benefit: clients become thinner. They do not need to know the location of every service, just the load balancer addresses for the services they use. Service addresses can change without touching client configuration.

For systems already using load balancers for traffic distribution, adding service discovery on top requires minimal new infrastructure.

Ingress Controllers in Kubernetes

Kubernetes ingress controllers implement server-side discovery for HTTP traffic. An ingress controller watches for Ingress resources in the cluster. When you create an Ingress that routes traffic to a service, the controller configures itself to forward requests to the appropriate pod endpoints. For a full walkthrough of Kubernetes concepts, see Kubernetes.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80

The ingress controller handles load balancing across all pods backing a service. It watches the service's Endpoints (or EndpointSlices) to track which pods exist and whether they are ready to receive traffic. When you scale a deployment, the controller automatically routes to the new pods once they pass their readiness checks.
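
To make the readiness filtering concrete, here is a small sketch that turns an EndpointSlice-shaped dictionary into a list of routable backends. The sample data is hand-written for illustration, and `ready_backends` is a hypothetical helper; a real controller would receive these objects through the Kubernetes watch API instead.

```python
from typing import Dict, List


def ready_backends(endpoint_slice: Dict) -> List[str]:
    """Return "ip:port" strings for every endpoint that is ready for traffic."""
    ports = [p["port"] for p in endpoint_slice.get("ports", [])]
    backends = []
    for endpoint in endpoint_slice.get("endpoints", []):
        # Assumption in this sketch: only an explicit ready=False excludes an endpoint.
        if endpoint.get("conditions", {}).get("ready") is False:
            continue
        for address in endpoint.get("addresses", []):
            for port in ports:
                backends.append(f"{address}:{port}")
    return backends


# Hand-written sample shaped like an EndpointSlice, not output from a real cluster.
sample_slice = {
    "ports": [{"name": "http", "port": 8080}],
    "endpoints": [
        {"addresses": ["10.1.0.5"], "conditions": {"ready": True}},
        {"addresses": ["10.1.0.6"], "conditions": {"ready": False}},
        {"addresses": ["10.1.0.7"], "conditions": {"ready": True}},
    ],
}
backends = ready_backends(sample_slice)
```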

Most ingress controllers also handle TLS termination, request rewriting, and canary deployments. They are the main entry point for external traffic into a Kubernetes cluster, solving the discovery problem for north-south traffic.

For east-west traffic within the cluster, Kubernetes services provide similar functionality through cluster IPs. The kube-proxy component handles routing, but it works at the network level rather than the HTTP level.

Python Implementation: NGINX Configuration Generator

In server-side discovery, the load balancer needs to dynamically update its configuration as service instances change:

import requests
from typing import Dict, List

class NginxConfigGenerator:
    """Generates NGINX upstream configuration from service registry."""

    def __init__(self, registry_url: str, nginx_config_path: str):
        self.registry_url = registry_url
        self.nginx_config_path = nginx_config_path

    def fetch_instances(self, service_name: str) -> List[Dict]:
        """Fetch healthy instances from service registry."""
        try:
            response = requests.get(
                f"{self.registry_url}/services/{service_name}/instances",
                timeout=5
            )
            response.raise_for_status()
            return [
                inst for inst in response.json()
                if inst.get("healthy", True)
            ]
        except requests.RequestException:
            return []

    def generate_upstream_block(self, service_name: str, instances: List[Dict]) -> str:
        """Generate NGINX upstream block for a service."""
        lines = [f"upstream {service_name} {{"]
        lines.append("    least_conn;")
        for inst in instances:
            host = inst.get("host", inst.get("ip", "127.0.0.1"))
            port = inst.get("port", 8080)
            lines.append(f"    server {host}:{port};")
        lines.append("}")
        return "\n".join(lines)

    def generate_location_block(self, service_name: str, public_path: str) -> str:
        """Generate NGINX location block routing to upstream."""
        return f"""
location {public_path} {{
    proxy_pass http://{service_name};
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}}
"""

    def update_nginx_config(self, services: Dict[str, str]):
        """Update NGINX configuration for all services.

        Args:
            services: Dict mapping service_name to public URL path
        """
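        # Note: upstream blocks belong in NGINX's http context and location
        # blocks in a server context, so a real deployment would write them
        # to separate include files. They share one file here for brevity.
        config_parts = ["# Auto-generated upstream blocks"]
        updated = 0

        for service_name, public_path in services.items():
            instances = self.fetch_instances(service_name)
            if instances:
                config_parts.append(self.generate_upstream_block(service_name, instances))
                config_parts.append(self.generate_location_block(service_name, public_path))
                updated += 1

        config_content = "\n\n".join(config_parts)

        with open(self.nginx_config_path, "w") as f:
            f.write(config_content)

        # Services with no healthy instances are skipped, so report both counts.
        print(f"Updated NGINX config with {updated} of {len(services)} services")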


class LoadBalancerHealthMonitor:
    """Monitors backend health and removes unhealthy instances."""

    def __init__(self, upstream_config_path: str):
        self.upstream_config_path = upstream_config_path

    def check_backend_health(self, host: str, port: int, path: str = "/health") -> bool:
        """Check if a backend is healthy."""
        try:
            response = requests.get(
                f"http://{host}:{port}{path}",
                timeout=3
            )
            return response.status_code == 200
        except requests.RequestException:
            return False

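    def get_unhealthy_backends(self, config_path: str) -> List[Dict]:
        """Parse `server host:port;` lines and health-check each backend."""
        # Simplified parsing: only plain `server host:port;` entries are
        # recognized. In production, dump the full config with `nginx -T`.
        unhealthy = []
        with open(config_path) as f:
            for line in f:
                line = line.strip()
                if not line.startswith("server ") or not line.endswith(";"):
                    continue
                host, _, port = line[len("server "):-1].strip().rpartition(":")
                if host and port.isdigit() and not self.check_backend_health(host, int(port)):
                    unhealthy.append({"host": host, "port": int(port)})
        return unhealthy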
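
Writing the configuration file is only half of an update: NGINX keeps serving the old configuration until it is reloaded, and a half-written file can break the next reload. The sketch below shows one way to handle both, assuming the `nginx` binary is on the PATH and that the generated file is already included from the main configuration. `write_atomically` and `validate_and_reload` are hypothetical helper names; `nginx -t` and `nginx -s reload` are the standard test and reload commands.

```python
import os
import subprocess
import tempfile


def write_atomically(path: str, content: str) -> None:
    """Write to a temp file in the target directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
        # Rename within one filesystem, so readers never see a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def validate_and_reload(nginx_binary: str = "nginx") -> bool:
    """Run `nginx -t` and reload only if the configuration test passes."""
    test = subprocess.run([nginx_binary, "-t"], capture_output=True)
    if test.returncode != 0:
        return False
    reload_result = subprocess.run([nginx_binary, "-s", "reload"], capture_output=True)
    return reload_result.returncode == 0
```

A caller would typically keep the previous file contents around and restore them if `validate_and_reload()` returns False, so a bad registry response cannot leave NGINX with an unusable configuration.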

AWS ALB and NLB Integration Patterns

Amazon Web Services offers two main load balancing options for service discovery: the Application Load Balancer (ALB) and the Network Load Balancer (NLB). Both can serve as discovery points in a server-side architecture.

The ALB works at layer 7 and understands HTTP semantics. You can route based on URL paths, host headers, or query parameters. Target groups let you group instances or IP addresses behind the ALB. As you register and deregister targets, the ALB updates its routing automatically.
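
The rule evaluation can be pictured as an ordered list of conditions, each pointing at a target group, with a default when nothing matches. The sketch below is a simplified model of that idea, not the ALB API: `Rule`, `select_target_group`, and the target group names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    priority: int                      # lower numbers are evaluated first
    target_group: str
    host: Optional[str] = None         # None means "any host"
    path_prefix: Optional[str] = None  # None means "any path"

    def matches(self, host: str, path: str) -> bool:
        if self.host is not None and host != self.host:
            return False
        if self.path_prefix is not None and not path.startswith(self.path_prefix):
            return False
        return True


def select_target_group(rules: List[Rule], host: str, path: str, default: str) -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, path):
            return rule.target_group
    return default


rules = [
    Rule(priority=10, target_group="user-service", path_prefix="/users"),
    Rule(priority=20, target_group="order-service", path_prefix="/orders"),
    Rule(priority=5, target_group="admin-service", host="admin.example.com"),
]
users_tg = select_target_group(rules, "api.example.com", "/users/42", "default-tg")
admin_tg = select_target_group(rules, "admin.example.com", "/users/42", "default-tg")
fallback_tg = select_target_group(rules, "api.example.com", "/health", "default-tg")
```

Because rules are checked in priority order, the host-based rule wins for `admin.example.com` even though the path also matches `/users`.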

graph TD
    User[Internet User] --> ALB[AWS ALB]
    ALB --> TG1[Target Group: user-service]
    ALB --> TG2[Target Group: order-service]
    TG1 --> I1[Instance]
    TG1 --> I2[Instance]
    TG2 --> I3[Instance]
    TG2 --> I4[Instance]

The NLB works at layer 4 and handles TCP/UDP traffic. It has lower latency and higher throughput than the ALB. For non-HTTP protocols or when raw performance matters most, the NLB is the better choice.

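AWS also provides Amazon ECS Service Discovery and AWS Cloud Map for registering service instances. Cloud Map is a registry in its own right: services register their IP addresses and ports, and callers look them up through DNS or the Cloud Map API. For load balancer routing, orchestrators such as ECS services and Auto Scaling groups register and deregister targets in ALB and NLB target groups automatically as instances start and stop, so the load balancer always routes to the current set.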

This tight integration between discovery and load balancing is a real advantage of cloud-managed solutions. You get health checking, routing, and registration without running your own infrastructure.

Service Mesh as a Discovery Layer

Service meshes like Istio and Linkerd implement server-side discovery through sidecar proxies. Every service instance gets a proxy sidecar that intercepts all incoming and outgoing traffic. The control plane manages routing rules, and the mesh handles service discovery without your application noticing. Read more in Service Mesh.

graph TD
    subgraph Cluster
        Client[Client Pod] --> Sidecar1[Envoy Sidecar]
        Sidecar1 --> Mesh[Service Mesh Control Plane]
        Sidecar1 --> S1[Service A]
        Sidecar1 --> S2[Service B]
        Sidecar2[Envoy Sidecar] --> S2
    end
    Mesh -.->|configures| Sidecar1
    Mesh -.->|configures| Sidecar2

The client sends requests to a logical service name. The sidecar intercepts the request, selects one of the endpoints that the control plane has pushed to it, and forwards the request there. From the developer’s perspective, calling another service works exactly like making a regular network call.

Service meshes give you more than basic load balancers. They support circuit breaking, retries with budgets, traffic shifting for canary deployments, and mTLS between services. The cost is added complexity and resource overhead from running a sidecar proxy alongside every service instance.

If you are already running Kubernetes, a service mesh fits naturally into the discovery picture. If you are on VMs or want simpler infrastructure, a traditional load balancer setup might be more appropriate.

Advantages of Server-Side Discovery

Client simplicity is the big win. Clients connect to a fixed address and do not worry about where services live. They skip discovery libraries, skip caching registry data, skip retry logic for registry unavailability. This makes client code easier to write and maintain.

Centralized routing logic means you can change how traffic flows without updating dozens of client applications. Want to shift 10% of traffic to a new version? Update the load balancer configuration. With client-side discovery, you would need to deploy updated clients to every service that calls that endpoint.
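
A weighted split like the 10% example can be sketched in a few lines. The version names and weights below are assumptions for illustration, and the seeded `random.Random` is only there to make the demonstration repeatable; real load balancers implement weighting in their own configuration rather than in application code.

```python
import random
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Choose a backend version with probability proportional to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


rng = random.Random(42)            # seeded so repeated runs give the same sample
weights = {"v1": 90, "v2": 10}     # hypothetical 90/10 split
sample = [pick_version(weights, rng) for _ in range(10_000)]
share_v2 = sample.count("v2") / len(sample)
print(f"v2 received {share_v2:.1%} of requests")
```

Changing the split means editing one weights table in the routing layer; no client needs to be redeployed.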

Consistency across clients follows from this. Every client uses the same routing algorithm, the same health check configuration, the same failover behavior. You get uniform behavior without relying on each team getting discovery right.

Observability improves when routing happens in a central place. You see all traffic through the load balancer, making it easier to spot anomalies, measure latency, and debug issues. Distributed tracing still helps, but you have a single point for metrics.

Disadvantages of Server-Side Discovery

Server-side discovery has real drawbacks.

An additional network hop adds latency. Every request goes through the load balancer, which then forwards to a backend. For low-latency systems, this overhead matters. Direct client-to-service communication cuts out the middleman.

The load balancer can become a bottleneck or a single point of failure. If the load balancer itself goes down, no requests get through. High availability configurations mitigate this, but they add complexity. You need redundant load balancers, floating IP addresses, and health checking for the load balancers themselves.

Operational complexity increases. You are now operating infrastructure that clients depend on. That infrastructure needs monitoring, capacity planning, and incident response. For small teams, this overhead can outweigh the benefits.

Less flexibility for clients matters in some scenarios. If you need very specific routing behavior that does not fit the load balancer model, you end up fighting the tooling. Client-side discovery lets each client implement exactly the logic it needs.

Comparing with Client-Side Discovery

Client-side discovery flips the model. Instead of sending requests to a load balancer, clients query a service registry directly, cache the results, and pick an instance themselves. See Client-Side Discovery for a detailed comparison.

graph TD
    Client[Client Application] --> Registry[Service Registry]
    Registry --> Client
    Client --> S1[Service Instance 1]
    Client --> S2[Service Instance 2]
    Client --> S3[Service Instance 3]

Netflix’s Eureka works this way. Services register with Eureka, and clients poll Eureka periodically to get the current list of instances. The client uses a round-robin or random algorithm to pick one.

Client-side discovery removes the load balancer hop, which can reduce latency. It also means there is no single component that blocks all traffic if it fails.

But client-side discovery puts more burden on each client. Libraries like Eureka client need to be integrated into every service. Caching logic needs to handle staleness. Reconnection logic needs to handle registry outages gracefully. Different clients might implement the same logic differently, leading to inconsistent behavior.

Server-side discovery centralizes that complexity. You write the routing logic once in the load balancer and let all clients benefit. The cost is accepting the load balancer as a required dependency.

For simple systems with few clients, server-side discovery might add unnecessary components. For large systems with many teams and hundreds of services, server-side discovery provides consistency and reduces per-client complexity.

When to Use / When Not to Use

When to Use Server-Side Discovery

Server-side discovery works well in these scenarios:

  • Kubernetes environments where ingress controllers and service meshes provide built-in server-side discovery
  • Multi-language service ecosystems where you want consistent routing without maintaining client libraries in each language
  • Centralized policy enforcement where you need uniform handling of canary deployments, blue-green releases, and traffic shaping
  • Operational teams handling routing where dedicated infrastructure teams can manage load balancers and proxies
  • Standard HTTP services where layer 7 load balancing with URL-based routing adds value

When Not to Use Server-Side Discovery

Server-side discovery has trade-offs. Consider alternatives when:

  • Latency is critical - every request goes through an additional hop (client to load balancer to service instead of client to service)
  • You have single-language microservices - client-side discovery may reduce infrastructure complexity
  • Small teams with simple needs - the operational overhead of managing load balancers may not be justified
  • High-throughput low-latency systems - the load balancer can become a bottleneck under extreme load
  • You need per-request routing logic - client-side discovery can make more nuanced routing decisions based on local state

Decision Flow

graph TD
    A[Service Discovery Approach] --> B{Running Kubernetes?}
    B -->|Yes| C[Ingress Controller / Service Mesh]
    B -->|No| D{AWS Environment?}
    D -->|Yes| E[ALB/NLB + Cloud Map]
    D -->|No| F{Multi-Language Services?}
    F -->|Yes| J[Self-Managed Load Balancer / Reverse Proxy]
    F -->|No| G{Latency Critical?}
    G -->|Yes| H[Client-Side Discovery]
    G -->|No| I{Team Manages LB?}
    I -->|Yes| J
    I -->|No| H

Quick Recap Checklist

  • Server-side discovery uses load balancers and reverse proxies to handle routing, keeping clients simple
  • The load balancer queries the service registry and routes requests to healthy instances
  • In Kubernetes, ingress controllers provide HTTP-level routing; service meshes add east-west traffic management
  • On AWS, ALB/NLB combined with Cloud Map or ECS Service Discovery provides managed server-side discovery
  • Advantages: simpler clients, centralized policy enforcement, consistent routing behavior across all services
  • Disadvantages: additional network hop adds latency, load balancer can become a bottleneck or SPOF
  • Combine with health checks and high availability configurations for resilient routing

Real-world Failure Scenarios

| Scenario | What Happens | Root Cause | Mitigation |
| --- | --- | --- | --- |
| Load balancer SPOF | All traffic stops | Single LB instance fails | Run HA configuration with floating IP |
| Registry out of sync | LB routes to terminated instances | Registry update not propagated | Increase health check frequency |
| Config reload race | New connections fail during reload | NGINX reloads drop connections | Use connection draining |
| Target group limit | Cannot register new instances | AWS limit on targets per group | Pre-configure larger limits |
| Cross-zone routing | Unnecessary latency and cost | AZ failure forces cross-zone | Configure AZ affinity |

Trade-off Comparison: Server-Side Discovery Approaches

| Aspect | NGINX/HAProxy | AWS ALB/NLB | Envoy Proxy | Kubernetes Ingress |
| --- | --- | --- | --- | --- |
| Protocol Support | HTTP/TCP | HTTP/TCP/UDP | HTTP/TCP/gRPC | HTTP/HTTPS |
| Service Discovery Integration | Via template | Native Cloud Map | EDS API | Native K8s API |
| Load Balancing Algorithms | Basic (RR, leastconn) | 3 algorithms | Many | Basic |
| Health Checking | Passive + active | Active | Active | Active |
| mTLS Support | Via config | Native | Native | Via annotations |
| Circuit Breaking | No | No | Yes | No |
| Operational Complexity | Medium | Low | High | Low |
| Cost | Software cost (or free) | AWS usage fees | Open source | Included in K8s |
| Best For | Traditional deployments | AWS-native deployments | Service mesh | Kubernetes deployments |

For continued learning, explore the Microservices Architecture Roadmap and System Design fundamentals.


Interview Questions

1. How does server-side discovery differ from client-side discovery in terms of network architecture and failure modes?

Server-side discovery adds a load balancer or reverse proxy between the client and service instances. The client sends requests to a well-known address (the proxy), and the proxy queries the service registry and routes to healthy instances. Client-side discovery has the client query the registry directly and pick an instance itself.

Failure modes differ: server-side discovery has a single point of failure in the load balancer (mitigated by HA configurations), while client-side discovery fails when the registry is unavailable but often recovers using cached data.

2. What are the advantages of using an ingress controller for server-side discovery in Kubernetes?

Ingress controllers provide HTTP-level routing with URL path and host-based routing. They integrate natively with Kubernetes—the controller watches Ingress resources and updates routing automatically as you scale or update deployments.

They also handle TLS termination, request rewriting, and canary deployments without additional components. The main entry point for external traffic into a cluster, ingress controllers solve discovery for north-south traffic while Kubernetes services handle east-west traffic.

3. How does a service mesh like Istio implement server-side discovery through sidecar proxies?

Service meshes deploy an Envoy sidecar alongside every service instance. The sidecar intercepts all incoming and outgoing traffic. The control plane configures routing rules, and the mesh handles service discovery without the application knowing.

The application calls other services by name (like making a regular HTTP call). The sidecar intercepts the call, picks one of the endpoints that the control plane has pushed to it, and forwards the request there. This gives you server-side discovery benefits without application changes.

4. What is the role of target groups in AWS ALB-based service discovery and how do they integrate with Cloud Map?

Target groups are groups of instances that receive traffic from an ALB. You register instances with a target group, and the ALB routes to instances in the group based on the configured algorithm.

AWS Cloud Map provides service registry functionality, with lookups through DNS or its API. Target registration itself is usually automated by the orchestrator: an ECS service or Auto Scaling group attached to a target group registers instances as targets when they start and deregisters them when they stop. No manual target management needed.

5. What health check mechanisms are commonly used in server-side discovery, and how do they differ?

TCP checks verify the port is accepting connections. HTTP checks call a health endpoint and verify the response code. HTTPS/TLS checks verify the TLS handshake completes and the certificate is valid. Custom script checks run application-specific verification logic.

Active health checks run periodically from the load balancer. Passive health checks analyze actual traffic patterns and mark instances unhealthy after error rate thresholds. Most load balancers support configuring thresholds for failure detection and recovery.

6. How does connection draining work in server-side discovery, and why is it important for deployments?

Connection draining allows in-flight requests to complete on an instance being removed, while stopping new requests from routing to it. The load balancer stops sending new traffic but keeps existing connections open until they finish or a timeout expires.

This matters for deployments because you can take instances offline for updates without dropping active requests. Without connection draining, users experience errors during deployments as their in-progress requests fail.
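
The draining behavior can be modeled as a small state holder per backend: once draining starts, new requests are refused, and the backend becomes removable when in-flight work finishes or the timeout expires. This is a sketch with hypothetical names (`DrainingBackend`, `drain_timeout`), not the implementation of any specific load balancer.

```python
class DrainingBackend:
    """Tracks in-flight requests so a backend can be removed gracefully."""

    def __init__(self, name: str):
        self.name = name
        self.in_flight = 0
        self.draining = False

    def accepts_new_requests(self) -> bool:
        return not self.draining

    def start_request(self) -> None:
        if self.draining:
            raise RuntimeError(f"{self.name} is draining")
        self.in_flight += 1

    def finish_request(self) -> None:
        self.in_flight -= 1

    def start_draining(self) -> None:
        """Stop routing new requests here; existing ones keep running."""
        self.draining = True

    def safe_to_remove(self, waited_seconds: float, drain_timeout: float) -> bool:
        """Removable once idle, or once the drain timeout has elapsed."""
        return self.draining and (self.in_flight == 0 or waited_seconds >= drain_timeout)


backend = DrainingBackend("10.0.1.10:8080")
backend.start_request()              # one request is in progress
backend.start_draining()             # deployment begins
still_busy = backend.safe_to_remove(waited_seconds=1, drain_timeout=30)
backend.finish_request()             # the in-flight request completes
now_idle = backend.safe_to_remove(waited_seconds=2, drain_timeout=30)
```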

7. What is the relationship between server-side discovery and API gateways in microservices architecture?

API gateways combine server-side discovery with additional functionality: authentication, rate limiting, request transformation, and protocol translation. The gateway discovers services through the registry and routes requests.

You can think of API gateways as a superset of server-side discovery with added API management features. For simpler architectures, a basic load balancer handles discovery. For more complex API management needs, an API gateway provides more capabilities.

8. How does NGINX integrate with service registries like Consul to dynamically update upstream configuration?

Consul Template is the typical integration mechanism. Consul Template watches the Consul registry and regenerates NGINX configuration files when instance lists change. When an instance registers or deregisters, the template regenerates the upstream block and signals NGINX to reload.

The flow: Consul detects instance change → Consul Template generates new NGINX config → NGINX reloads with updated upstream. This provides near-real-time discovery without manual configuration changes.

9. What are the latency implications of server-side discovery compared to client-side discovery?

Server-side discovery adds an extra network hop: client → load balancer → service instead of client → service. For low-latency applications, this overhead matters.

However, a load balancer in the same data center or availability zone typically adds a small and fairly constant delay, often a fraction of a millisecond to a few milliseconds depending on the proxy and the network path. For most applications, this difference is imperceptible. For high-frequency trading or real-time gaming, client-side discovery might be preferred.

10. When should you prefer server-side discovery over client-side discovery for a new microservices deployment?

Prefer server-side discovery when you run Kubernetes and want native integration with ingress controllers. Prefer it for multi-language environments where maintaining client libraries in each language is difficult. Prefer it when you need centralized policy enforcement for canary deployments, blue-green releases, and traffic shaping.

Server-side discovery adds operational complexity but reduces per-service complexity. It is simpler for most teams to manage a central load balancer or ingress controller than to coordinate client library updates across many services.

11. How does server-side discovery handle service versioning and backward compatibility during deployments?

Service versioning in server-side discovery is typically handled at the routing layer. The load balancer or ingress controller maintains multiple versions of a service simultaneously and routes traffic based on request headers, URL paths, or weight percentages.

For example, you might route requests with an X-API-Version: v2 header to the v2 version while defaulting all other requests to v1. This enables gradual rollouts, A/B testing, and instant rollbacks without requiring client changes.

Service meshes extend this with traffic splitting rules that can split percentage-based traffic between versions, enabling controlled canary releases where a small fraction of users see the new version first.
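
The header-based rule from this answer can be sketched as a lookup with a safe default. The header name comes from the example above; `select_version`, the set of known versions, and the fallback behavior for unknown values are assumptions of this sketch.

```python
from typing import Mapping, Tuple


def select_version(headers: Mapping[str, str],
                   known: Tuple[str, ...] = ("v1", "v2"),
                   default: str = "v1") -> str:
    """Pick a backend version from the X-API-Version header, else the default."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    requested = normalized.get("x-api-version", "").strip().lower()
    return requested if requested in known else default


explicit_v2 = select_version({"X-API-Version": "v2"})
no_header = select_version({"Accept": "application/json"})
unknown = select_version({"x-api-version": "v9"})
```

Falling back to the default for unknown values keeps a typo in a client header from producing an error instead of a response.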

12. What are the operational trade-offs between client-side and server-side service discovery in terms of deployment complexity?

Client-side discovery shifts complexity to service libraries—each language and framework in your ecosystem needs its own client library. Updating the discovery logic requires updating and redeploying every service. This works when you control all service code but becomes burdensome in polyglot environments.

Server-side discovery centralizes complexity in infrastructure. You deploy and maintain load balancers, ingress controllers, or service mesh control planes once. Services remain discovery-agnostic and simply connect to well-known addresses. The infrastructure team manages the discovery layer separately from application code.

The operational trade-off: client-side discovery gives services more control over instance selection but multiplies the maintenance burden. Server-side discovery sacrifices fine-grained client control for simpler service code and centralized management.

13. How does service mesh-based discovery compare to traditional load balancer-based discovery in terms of observability?

Service meshes provide superior observability because every service-to-service call is intercepted by the sidecar proxy. This enables automatic collection of metrics like request rates, latencies, and error rates without requiring changes to application code.

Traditional load balancers offer limited observability—they can report on ingress traffic to a service but cannot see traffic between internal services. You would need to instrument each service individually to get the same visibility.

Service meshes also support distributed tracing: the sidecars generate spans and trace headers (such as W3C Trace Context) for every hop. Applications still need to forward those headers from inbound to outbound requests so the spans join into a single trace, but no further instrumentation is required. This is significantly harder to achieve with traditional load balancer-based architectures.

14. What role does a service registry play in server-side discovery, and what happens when it becomes temporarily unavailable?

The service registry is the source of truth for instance locations. It tracks which instances are healthy and available. In server-side discovery, the load balancer or proxy queries the registry to build its routing table.

When the registry becomes unavailable, the discovery mechanism falls back to cached data. Most load balancers maintain a snapshot of healthy instances from the last successful query. This allows traffic to continue flowing to known-good instances even if new instances cannot be registered or unhealthy instances cannot be deregistered.

The risk: stale cached data means the load balancer might route traffic to instances that have already failed. Mitigations include aggressive health checks, short cache TTLs, and registry high availability configurations with multiple replicas.
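
The fallback described in this answer might look like the following sketch: try the registry, keep the last good snapshot on failure, and refuse to serve a snapshot that is older than a configured limit. `CachedInstanceList`, `max_staleness_seconds`, and the `FakeRegistry` stand-in are all hypothetical; the injected clock exists only so the behavior can be demonstrated without waiting.

```python
import time
from typing import Callable, List, Optional


class CachedInstanceList:
    """Serves the last good registry snapshot while the registry is down."""

    def __init__(self, fetch: Callable[[], List[str]],
                 max_staleness_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._max_staleness = max_staleness_seconds
        self._clock = clock
        self._snapshot: Optional[List[str]] = None
        self._fetched_at = 0.0

    def get(self) -> List[str]:
        try:
            self._snapshot = list(self._fetch())
            self._fetched_at = self._clock()
        except ConnectionError:
            if self._snapshot is None:
                raise  # nothing cached yet, so there is nothing to fall back to
            if self._clock() - self._fetched_at > self._max_staleness:
                raise RuntimeError("cached instance list is too stale to trust")
        return self._snapshot


class FakeRegistry:
    """Test double that can be switched to simulate an outage."""

    def __init__(self) -> None:
        self.available = True

    def fetch(self) -> List[str]:
        if not self.available:
            raise ConnectionError("registry unreachable")
        return ["10.0.1.10:8080", "10.0.1.11:8080"]


now = [0.0]  # mutable fake clock
registry = FakeRegistry()
cache = CachedInstanceList(registry.fetch, max_staleness_seconds=30.0,
                           clock=lambda: now[0])
fresh = cache.get()
registry.available = False
now[0] = 10.0
stale_but_usable = cache.get()   # served from the snapshot during the outage
```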

15. How does geographic routing work with server-side discovery for multi-region deployments?

Geographic routing directs requests to the closest available instance based on client location. In server-side discovery, this is implemented through DNS-based routing or anycast IP addresses at the load balancer level.

AWS Global Accelerator and similar services provide anycast IPs that route to the nearest AWS region. Within a region, server-side discovery through ALB or Route 53 routing policies ensures traffic stays within that region to the nearest healthy instance.

For global deployments, the pattern is: client resolves a global endpoint, DNS routes to the nearest region, then server-side discovery within that region selects the optimal instance. This minimizes latency from client to first hop while maintaining all the benefits of server-side discovery within each region.

16. What is circuit breaking in the context of server-side discovery, and which implementations support it?

Circuit breaking prevents cascading failures by stopping requests to unhealthy instances. When error rates exceed a threshold, the circuit "opens" and subsequent requests fail fast rather than timing out waiting for failed backends.

Among server-side discovery tools, Envoy Proxy supports circuit breaking natively through its circuit breakers configuration. NGINX and HAProxy handle this through upstream configuration but lack built-in circuit breaker semantics. Service meshes like Istio provide circuit breaking through Envoy sidecar configuration.
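
For readers unfamiliar with the pattern, here is a generic closed/open/half-open circuit breaker. It trips on consecutive failures rather than an error rate to keep the example short, and it does not mirror Envoy's or Istio's configuration model; the class, thresholds, and injected clock are assumptions of this sketch.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Closed -> open after repeated failures -> half-open after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # allow a trial request through
        return "open"

    def allow_request(self) -> bool:
        """Fail fast while open; let traffic through when closed or half-open."""
        return self.state != "open"

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        # A failed trial request, or too many failures in a row, (re)opens the circuit.
        if self.state == "half-open" or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


now = [0.0]  # mutable fake clock for the demonstration
breaker = CircuitBreaker(failure_threshold=3, reset_timeout=10.0, clock=lambda: now[0])
for _ in range(3):
    breaker.record_failure()
state_after_failures = breaker.state      # "open"
now[0] = 10.0
state_after_cooldown = breaker.state      # "half-open"
breaker.record_success()
state_after_success = breaker.state       # "closed"
```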

17. How does mTLS work in service mesh-based server-side discovery, and what are its security benefits?

Mutual TLS authenticates both client and server using certificates. In a service mesh, the sidecar proxy terminates mTLS for all service-to-service communication. The control plane issues and rotates certificates automatically.

Benefits include: encrypted traffic between services, automatic authentication without application code changes, and certificate rotation without service restarts. This solves the "zero trust" requirement where every service must verify its communication partners.

18. What are the differences between active and passive health checks in server-side discovery?

Active health checks run periodically from the load balancer or sidecar proxy. The proxy sends requests or TCP probes to each instance on a defined interval and marks it unhealthy after consecutive failures.

Passive health checks analyze actual traffic patterns. If an instance returns too many errors or times out frequently, the proxy marks it unhealthy. Passive checks catch failures that might not be detected by periodic probing, such as slow responses or intermittent errors.
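
A passive check can be approximated with a sliding window over recent request outcomes. The window size, error-rate threshold, and minimum sample count below are arbitrary values picked for the example, and `PassiveHealthTracker` is a hypothetical name rather than a feature of a specific proxy.

```python
from collections import deque


class PassiveHealthTracker:
    """Marks an instance unhealthy when recent real traffic fails too often."""

    def __init__(self, window: int = 10, max_error_rate: float = 0.5,
                 min_samples: int = 5):
        self._outcomes = deque(maxlen=window)  # True = success, False = error
        self.max_error_rate = max_error_rate
        self.min_samples = min_samples

    def record(self, success: bool) -> None:
        """Record the outcome of one proxied request."""
        self._outcomes.append(success)

    def is_healthy(self) -> bool:
        # Too little traffic to judge: give the instance the benefit of the doubt.
        if len(self._outcomes) < self.min_samples:
            return True
        errors = self._outcomes.count(False)
        return errors / len(self._outcomes) < self.max_error_rate


tracker = PassiveHealthTracker(window=10, max_error_rate=0.5, min_samples=5)
for outcome in (True, True, True, False, False):
    tracker.record(outcome)
healthy_at_40_percent = tracker.is_healthy()   # 2 errors out of 5
tracker.record(False)
healthy_at_50_percent = tracker.is_healthy()   # 3 errors out of 6
```

Unlike an active probe, this tracker sees nothing when an instance receives no traffic, which is one reason the two kinds of check are usually combined.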

19. How does canary deployment work with server-side discovery, and what are the common traffic splitting strategies?

Canary deployments route a small percentage of traffic to new versions while most traffic goes to the stable version. With server-side discovery, the load balancer or sidecar proxy splits traffic based on weight rules.

Common strategies include: weighted routing (e.g., 95% stable, 5% canary), header-based routing (requests with a specific header go to canary), and geographic routing (certain regions get canary traffic). Most ingress controllers and service meshes support these patterns through traffic splitting configuration.
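
Weighted and header-based routing combine naturally, as this small sketch shows. The x-canary header name and the function itself are hypothetical; ingress controllers and meshes express the same rules declaratively rather than in code.

```python
import random


def choose_version(headers, weights, rng=random):
    """Return the version that should serve this request."""
    # Header-based routing: an explicit opt-in header always wins.
    if headers.get("x-canary") == "true":
        return "canary"
    # Weighted routing: pick a version in proportion to its weight.
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


weights = {"stable": 95, "canary": 5}
rng = random.Random(42)  # seeded only to make the demo repeatable
sample = [choose_version({}, weights, rng) for _ in range(10_000)]
canary_share = sample.count("canary") / len(sample)  # close to 0.05
```

Raising the canary weight step by step (5, 25, 50, 100) turns the same rule into a gradual rollout.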

20. What happens during a rolling deployment with server-side discovery, and how does the system handle traffic during instance transitions?

During a rolling deployment, new instances start and register with the service registry. The load balancer detects them through health checks and adds them to the routing pool. Old instances deregister or fail health checks, and the load balancer stops routing traffic to them.

With connection draining enabled, in-flight requests complete on old instances before they shut down. Without it, some requests fail, requiring client-side retry logic. Blue-green deployments, by contrast, achieve zero downtime by keeping the old version running until the new version is fully validated, then switching all traffic at once.
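
Connection draining boils down to two rules: stop sending new requests to an instance, but only remove it once its in-flight count reaches zero. The sketch below models that bookkeeping; the class and method names are made up for illustration and do not correspond to any proxy's API.

```python
class DrainingPool:
    """Tracks in-flight requests so an instance is removed only after it drains."""

    def __init__(self):
        self.in_flight = {}    # instance -> number of active requests
        self.draining = set()  # instances that no longer accept new requests

    def add(self, instance):
        self.in_flight.setdefault(instance, 0)

    def routable(self):
        return [i for i in self.in_flight if i not in self.draining]

    def start_request(self, instance):
        if instance not in self.routable():
            raise ValueError(f"{instance} is not accepting new requests")
        self.in_flight[instance] += 1

    def finish_request(self, instance):
        self.in_flight[instance] -= 1
        self._remove_if_drained(instance)

    def drain(self, instance):
        self.draining.add(instance)
        self._remove_if_drained(instance)

    def _remove_if_drained(self, instance):
        if instance in self.draining and self.in_flight[instance] == 0:
            del self.in_flight[instance]
            self.draining.discard(instance)
```

Real load balancers also apply a drain timeout so a stuck connection cannot block shutdown forever; that is omitted here to keep the sketch short.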

Server-side discovery typically uses reverse proxies to handle traffic routing. Understanding the options helps you choose the right approach.

NGINX as a Discovery Point

NGINX can dynamically update its upstream configuration based on registry data:

# Upstream block rendered by Consul template from registry data
upstream backend {
    least_conn;
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}

# Consul template regenerates this block when instances change

The typical flow: Consul template watches the service registry and rewrites the NGINX config when instances change, then reloads NGINX.
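
The render-and-reload loop is easy to picture in code. Consul template itself uses Go templates, so the Python below is only a stand-in for the idea: the function names are hypothetical, and the actual reload (for example, signaling NGINX) is left out.

```python
def render_upstream(name, instances, policy="least_conn"):
    """Render an NGINX upstream block from (host, port) pairs."""
    lines = [f"upstream {name} {{", f"    {policy};"]
    # Sort so the output is stable and changes only when membership changes.
    lines += [f"    server {host}:{port};" for host, port in sorted(instances)]
    lines.append("}")
    return "\n".join(lines) + "\n"


def needs_reload(current_config, instances, name="backend"):
    """Return the new config if it differs from the current one, else None."""
    rendered = render_upstream(name, instances)
    return rendered if rendered != current_config else None


# Hypothetical registry snapshot; order does not matter.
instances = [("10.0.1.11", 8080), ("10.0.1.10", 8080)]
config = render_upstream("backend", instances)
```

Comparing the rendered output against the current file before reloading avoids needless reloads when the registry reports the same membership in a different order.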

Envoy as a Service Proxy

Envoy provides sophisticated server-side load balancing with built-in service discovery:

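static_resources:
  listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                route_config:
                  name: service_route
                  virtual_hosts:
                    - name: my_service
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route: { cluster: my_service }
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  # Clusters belong at the static_resources level, not inside the listener
  clusters:
    - name: my_service
      type: EDS # Endpoint Discovery Service
      lb_policy: LEAST_REQUEST
      eds_cluster_config:
        eds_config:
          resource_api_version: V3
          api_config_source:
            api_type: GRPC
            transport_api_version: V3
            grpc_services:
              - envoy_grpc: { cluster_name: xds_cluster } # assumed: management server cluster, defined separately
      health_checks:
        - timeout: 5s
          interval: 10s
          unhealthy_threshold: 3
          healthy_threshold: 2
          http_health_check: { path: /health } # assumed health endpoint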

Envoy’s EDS (Endpoint Discovery Service) integrates with service registries to dynamically update cluster membership.

AWS ALB Target Groups

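ALB routes to the targets registered in a target group. Targets can be registered manually, as below, or automatically by services such as ECS and Auto Scaling groups:

# Register IP targets with an ALB target group (the target group must use target type "ip").
# The ARN is a placeholder.
aws elbv2 register-targets \
    --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-service/abc123 \
    --targets "Id=10.0.1.10,Port=8080" "Id=10.0.1.11,Port=8080"

Once targets are registered, ALB routes only to those that pass its health checks. Cloud Map can provide the registry for name-based lookups alongside it.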

Service Router Patterns

A service router sits in front of multiple services and routes based on request attributes:

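import random


class ServiceRouter:
    def __init__(self, registry, routes, forward, select=random.choice):
        self.registry = registry  # exposes get_healthy(service_name)
        self.routes = routes      # maps request path -> service name
        self.forward = forward    # callable(request, instance) -> response
        self.select = select      # load-balancing strategy; random by default

    def route(self, request):
        # Look up which service handles this path
        service_name = self.routes.get(request.path)
        if not service_name:
            return 404  # no service is mapped to this path

        instances = self.registry.get_healthy(service_name)
        if not instances:
            return 503  # service is known but has no healthy instances

        # Server-side load balancing
        selected = self.select(instances)
        return self.forward(request, selected)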

Health Check Integration

Server-side discovery requires the proxy to perform health checks:

| Check Type | What It Verifies | Configuration |
| --- | --- | --- |
| TCP connect | Port accepting connections | host:port, interval |
| HTTP GET | Health endpoint returns 200 | URL, expected status |
| HTTPS/TLS | Certificate valid, service responding | TLS handshake |
| Custom script | Application-specific health | script path |

The proxy marks an instance unhealthy after consecutive failures and stops routing traffic until it passes health checks again.
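
The consecutive-failure rule works like a small state machine with separate thresholds for leaving and re-entering the pool. Here is a minimal sketch; the class name is hypothetical, and the defaults mirror the unhealthy_threshold of 3 and healthy_threshold of 2 used in the Envoy example earlier in this article.

```python
class ActiveHealthState:
    """Applies unhealthy/healthy thresholds to a stream of probe results."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self.streak = 0  # consecutive probes that contradict the current state

    def observe(self, probe_ok: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            self.streak = 0  # result agrees with the current state; reset the counter
            return self.healthy
        self.streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self.streak >= needed:
            self.healthy = not self.healthy
            self.streak = 0
        return self.healthy
```

Requiring several consecutive results in both directions keeps a single dropped probe from ejecting an instance, and a single lucky probe from readmitting a flapping one.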

Conclusion

Server-side discovery centralizes routing logic in load balancers, reverse proxies, and service meshes, simplifying client implementations at the cost of an additional network hop. The load balancer queries the service registry on behalf of clients and handles retry logic, health routing, and policy enforcement in one place.

Ingress controllers in Kubernetes handle north-south traffic discovery natively, while service meshes manage east-west traffic between services. AWS ALB/NLB integrated with Cloud Map provides managed server-side discovery without building your own infrastructure.

The tradeoff: server-side discovery adds latency through the intermediary hop and creates a potential single point of failure. Mitigate both with high availability configurations, health check integration, and appropriately aggressive timeouts.

For most microservice architectures, server-side discovery through an API gateway or service mesh strikes the right balance. Client-side discovery suits ultra-low-latency paths where every millisecond matters and your team can manage the added complexity.
