System Design: Netflix Architecture for Global Streaming

Netflix serves over 250 million subscribers across 190 countries, streaming billions of hours of content monthly. The technical challenges span content delivery, real-time encoding, recommendation systems, and building a resilient microservices platform.

This case study examines Netflix’s architecture and how to design a streaming platform at scale.

Requirements Analysis

Functional Requirements

Users need to:

  • Browse and search a catalog of movies and TV shows
  • Stream video content on various devices
  • Create and manage profiles
  • Continue watching across devices
  • Rate and review content

Non-Functional Requirements

The platform must:

  • Stream in 4K HDR with surround sound
  • Start playback in under 2 seconds
  • Handle 15+ million concurrent streams
  • Maintain 99.99% availability
  • Work on 1000+ device types

Capacity Estimation

| Metric | Value |
| --- | --- |
| Subscribers | 250 million |
| Peak concurrent streams | 15+ million |
| Content library | 50,000+ titles |
| Average stream bitrate | 8 Mbps |
| Peak bandwidth | 120 Tbps |
| CDN edge locations | 100+ |
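
The peak bandwidth figure falls straight out of the other two rows. A quick back-of-envelope check (a sketch; the inputs are the table values above):

```python
# Back-of-envelope check of the capacity table.
peak_streams = 15_000_000    # peak concurrent streams
avg_bitrate_mbps = 8         # average stream bitrate (Mbps)

# 1 Tbps = 1,000,000 Mbps
peak_bandwidth_tbps = peak_streams * avg_bitrate_mbps / 1_000_000
print(peak_bandwidth_tbps)  # 120.0
```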

Content Delivery Architecture

Netflix’s content delivery is the core of its engineering. The company built its own CDN, called Open Connect.

Open Connect CDN

graph TB
    A[Netflix Origin] --> B[ISP Interconnects]
    B --> C[Open Connect Appliances]
    C --> D[Residential Routers]
    D --> E[Devices]

    subgraph "ISP Network"
        B
        C
    end

    subgraph "Customer Premise"
        D
        E
    end

Open Connect appliances are custom-built servers deployed at ISP data centers worldwide. They cache popular content close to users.
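
The fill decision can be pictured as a greedy knapsack over predicted popularity. The sketch below is an illustration only — `fill_cache`, the catalog, and the capacity numbers are hypothetical, not the real Open Connect fill algorithm (which predicts per-region demand and fills during off-peak windows):

```python
def fill_cache(titles, capacity_tb):
    """Greedily pin the most popular titles until appliance storage is full.

    titles: list of (title_id, popularity, size_tb) tuples (hypothetical schema).
    """
    pinned, used = [], 0.0
    # Consider titles in descending popularity order
    for title_id, _pop, size_tb in sorted(titles, key=lambda t: t[1], reverse=True):
        if used + size_tb <= capacity_tb:
            pinned.append(title_id)
            used += size_tb
    return pinned

catalog = [("A", 0.9, 1.2), ("B", 0.7, 0.8), ("C", 0.95, 2.5), ("D", 0.3, 0.5)]
print(fill_cache(catalog, capacity_tb=3.0))  # ['C', 'D']
```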

Request Flow

sequenceDiagram
    participant D as Device
    participant OCA as Open Connect Appliance
    participant CS as Control Plane
    participant LS as License Server

    D->>CS: GET /manifest.m3u8
    CS->>D: Return manifest with URL to nearest OCA
    D->>OCA: GET /video/segment1.ts
    OCA-->>D: Video segment
    D->>LS: GET /license for DRM
    LS-->>D: License key
    D->>OCA: GET /video/segment2.ts
    OCA-->>D: Video segment

Video Encoding Pipeline

graph LR
    A[Source Video] --> B[Transcode Cluster]
    B --> C{HD or 4K?}
    C -->|4K| D[4K Encode Farm]
    C -->|HD| E[HD Encode Farm]
    D --> F[Output Profiles]
    E --> F
    F --> G[CDN Origins]
    G --> H[Edge Cache]

Netflix encodes each title in multiple resolutions and bitrates for adaptive streaming:

| Profile | Resolution | Bitrate | Codec |
| --- | --- | --- | --- |
| 4K HDR | 3840x2160 | 16 Mbps | H.265/VP9 |
| 1080p | 1920x1080 | 5 Mbps | H.264/H.265 |
| 720p | 1280x720 | 2.5 Mbps | H.264 |
| 480p | 854x480 | 1 Mbps | H.264 |
| Audio | Stereo/5.1/Atmos | 192-768 kbps | AAC |
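
A client picking from this ladder simply takes the highest profile whose bitrate fits its measured bandwidth, usually with a safety margin. A minimal sketch using the video bitrates from the table (the 0.8 headroom factor is an assumption, not a Netflix constant):

```python
# Bitrate ladder from the table above (profile, required Mbps), best first.
LADDER = [("4K HDR", 16), ("1080p", 5), ("720p", 2.5), ("480p", 1)]

def highest_profile(available_mbps: float, headroom: float = 0.8) -> str:
    """Return the best profile whose bitrate fits within a safety margin."""
    budget = available_mbps * headroom
    for name, mbps in LADDER:
        if mbps <= budget:
            return name
    return LADDER[-1][0]  # below the ladder: serve the lowest rung anyway

print(highest_profile(25))  # 4K HDR (25 * 0.8 = 20 >= 16)
print(highest_profile(4))   # 720p   (4 * 0.8 = 3.2 >= 2.5)
```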

Microservices Architecture

Netflix decomposed their monolith into hundreds of microservices.

Service Decomposition

graph TB
    subgraph "Edge Layer"
        GW[API Gateway]
        GS[Gateway Service]
    end

    subgraph "Backend"
        M[Metadata Service]
        R[Recommendation Engine]
        P[Playback Service]
        S[Search Service]
        A[Auth Service]
    end

    subgraph "Data"
        EV[EVCache]
        DB[(Cassandra)]
        ES[Elasticsearch]
    end

    GW --> GS
    GS --> M
    GS --> R
    GS --> P
    GS --> S
    GS --> A
    M --> DB
    M --> EV
    R --> EV
    S --> ES

Key Microservices

| Service | Responsibility | Data Store |
| --- | --- | --- |
| API Gateway | Request routing, aggregation | None |
| Metadata | Titles, episodes, images | Cassandra |
| Playback | Streaming session management | EVCache |
| Recommendations | Personalized suggestions | Elasticsearch |
| Search | Full-text search | Elasticsearch |
| User Profile | Account, profiles, settings | Cassandra |
| Billing | Subscriptions, payments | PostgreSQL |

API Gateway

public class ApiGatewayApplication {
    public static void main(String[] args) {
        // Zuul routes configuration
        addRequestThreadFilters();
        addResponseFilters();

        // Route definitions
        configureRoutes(new ZuulRouteBuilder()
            .route("/api/v1/metadata/**", "metadata-service")
            .route("/api/v1/playback/**", "playback-service")
            .route("/api/v1/recommendations/**", "recommendation-service")
            .route("/api/v1/search/**", "search-service")
        );
    }
}

Recommendation System

Netflix’s recommendation engine drives 80% of content consumption.

Recommendation Pipeline

graph LR
    A[User Events] --> B[Event Pipeline]
    B --> C[Feature Store]
    C --> D{Ranking Models}
    D --> E[Personalized Ranking]
    D --> F[Similar Titles]
    D --> G[Top Picks]
    E --> H[API Response]
    F --> H
    G --> H

Ranking Model Features

from typing import Dict

class RankingFeatures:
    def __init__(self, user_id: int, title_id: int):
        self.user_features = self._get_user_features(user_id)
        self.title_features = self._get_title_features(title_id)
        self.context_features = self._get_context_features()

    def compute_features(self) -> Dict[str, float]:
        return {
            # User attributes
            "user_age_days": self.user_features.age_days,
            "user_avg_watch_time": self.user_features.avg_watch_time,
            "user_rating_avg": self.user_features.avg_rating,
            "user_genre_preferences": self.user_features.genre_scores,

            # Title attributes
            "title_popularity_score": self.title_features.popularity,
            "title_recency": self.title_features.release_days_ago,
            "title_rating": self.title_features.avg_rating,
            "title_match_score": self._genre_match(),

            # Context
            "time_of_day": self.context_features.hour,
            "day_of_week": self.context_features.day,
            "device_type": self.context_features.device
        }

Ranking Service

from typing import List

class RecommendationService:
    def __init__(self, model: RankingModel, cache: RedisCache):
        self.model = model
        self.cache = cache

    async def get_ranked_list(
        self,
        user_id: int,
        row_count: int = 20,
        evidence_count: int = 5
    ) -> List[RankedTitle]:
        # Check cache
        cache_key = f"recs:{user_id}:{row_count}"
        cached = await self.cache.get(cache_key)
        if cached:
            return self._deserialize(cached)

        # Get candidate titles
        candidates = await self._get_candidates(user_id, 500)

        # Compute features for each
        ranked = []
        for title in candidates:
            features = self._compute_features(user_id, title)
            score = await self.model.predict(features)
            ranked.append((score, title))

        # Sort and return top N
        ranked.sort(key=lambda x: x[0], reverse=True)
        results = [title for _, title in ranked[:row_count]]

        # Cache for 5 minutes
        await self.cache.setex(cache_key, 300, self._serialize(results))

        return results

Streaming Protocol

Netflix uses adaptive bitrate streaming for optimal viewing experience.

HLS/DASH Manifest

sequenceDiagram
    participant D as Device
    participant CDN as CDN
    participant L as License Server

    Note over D: Initial playback request

    D->>CDN: GET /title/1234/manifest.m3u8
    CDN-->>D: M3U8 with quality levels
    Note over D: Parse quality levels

    D->>CDN: GET /title/1234/video_4k.m3u8
    CDN-->>D: Segment list

    D->>CDN: GET /title/1234/video_4k/segment1.ts
    CDN-->>D: Video segment
    D->>L: GET /widevine/license
    L-->>D: DRM license

    Note over D: Decode and display

    D->>CDN: GET /title/1234/video_4k/segment2.ts
    CDN-->>D: Video segment

Adaptive Bitrate Logic

class AdaptiveBitrateController:
    def __init__(self, bandwidth_calculator: BandwidthCalculator):
        self.bandwidth_calculator = bandwidth_calculator
        self.quality_levels = ["4k", "1080p", "720p", "480p", "360p"]
        # Start at a concrete rung: "auto" is not in quality_levels, so it
        # would break the index() lookups in _downgrade/_upgrade
        self.current_quality = "720p"

    def select_quality(self, buffer_level: float, throughput: float) -> str:
        # Rules-based adaptation
        if buffer_level < 10:  # Buffer running low
            return self._downgrade()
        elif buffer_level > 60:  # Buffer healthy
            return self._upgrade(throughput)
        else:
            return self.current_quality

    def _downgrade(self) -> str:
        current_idx = self.quality_levels.index(self.current_quality)
        if current_idx < len(self.quality_levels) - 1:
            return self.quality_levels[current_idx + 1]
        return self.current_quality

    def _upgrade(self, throughput: float) -> str:
        # Choose highest quality that fits throughput
        for quality in self.quality_levels:
            if self._required_throughput(quality) < throughput * 0.8:
                return quality
        return self.quality_levels[-1]

Data Storage

Cassandra for Metadata

CREATE TABLE titles (
    title_id UUID PRIMARY KEY,
    title_type TEXT,  -- 'movie' or 'show'
    title_name TEXT,
    synopsis TEXT,
    release_year INT,
    duration_secs INT,
    rating TEXT,
    genres LIST<TEXT>,
    -- Denormalized for query performance
    genres_sorted SET<TEXT>,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

CREATE TABLE episodes (
    show_id UUID,
    season_num INT,
    episode_num INT,
    episode_id UUID,
    title_name TEXT,
    duration_secs INT,
    synopsis TEXT,
    PRIMARY KEY ((show_id), season_num, episode_num)
);

CREATE TABLE title_by_genre (
    genre TEXT,
    release_year INT,
    popularity_score DOUBLE,
    title_id UUID,
    -- title_id in the clustering key so titles with equal scores don't collide
    PRIMARY KEY ((genre), popularity_score, release_year, title_id)
) WITH CLUSTERING ORDER BY (popularity_score DESC, release_year DESC, title_id ASC);

EVCache for Playback State

import asyncio
import json

# Playback state cached in EVCache
PLAYBACK_STATE_TTL = 7200  # 2 hours

async def get_playback_state(user_id: int, title_id: int) -> PlaybackState:
    key = f"playback:{user_id}:{title_id}"
    cached = await evcache.get(key)

    if cached:
        return PlaybackState(**json.loads(cached))

    # Fetch from database
    state = await db.fetch_playback_state(user_id, title_id)

    if state:
        await evcache.setex(key, PLAYBACK_STATE_TTL, json.dumps(vars(state)))

    return state

async def save_playback_position(user_id: int, title_id: int, position: int):
    # Write to EVCache immediately
    key = f"playback:{user_id}:{title_id}"
    state = PlaybackState(user_id=user_id, title_id=title_id, position=position)

    await evcache.setex(key, PLAYBACK_STATE_TTL, json.dumps(vars(state)))

    # Persist to Cassandra async
    asyncio.create_task(
        db.save_playback_state(user_id, title_id, position)
    )

Global Architecture

Multi-Region Setup

graph TB
    subgraph "US-East (Primary)"
        ZU[Zuul Gateway]
        SVCS_E[Services]
        DB_E[(Cassandra)]
    end

    subgraph "EU-West (Replica)"
        ZU2[Zuul Gateway]
        SVCS_W[Services]
        DB_W[(Cassandra)]
    end

    subgraph "Asia-Pacific"
        ZU3[Zuul Gateway]
        SVCS_A[Services]
        DB_A[(Cassandra)]
    end

    SVCS_E <--> DB_E
    SVCS_W <--> DB_W
    SVCS_A <--> DB_A

    ZU --> SVCS_E
    ZU2 --> SVCS_W
    ZU3 --> SVCS_A

Traffic Routing

Netflix uses latency-based routing to direct users to the nearest region:

from typing import Dict

class LatencyRouter:
    def route_request(self, user_id: int, service: str) -> str:
        # Get user's last known region
        user_region = self._get_user_region(user_id)

        # Check if region is healthy
        if self._is_region_healthy(user_region):
            return f"{service}.{user_region}.netflix.com"

        # Fallback to lowest latency
        latencies = self._measure_all_regions(service)
        return min(latencies, key=latencies.get)

    def _measure_all_regions(self, service: str) -> Dict[str, float]:
        return {
            "us-east-1": self._ping(f"{service}.us-east-1.netflix.com"),
            "eu-west-1": self._ping(f"{service}.eu-west-1.netflix.com"),
            "ap-northeast-1": self._ping(f"{service}.ap-northeast-1.netflix.com")
        }

Resilience Patterns

Circuit Breaker

import time
from typing import Callable

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, timeout: int = 60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = "closed"

    async def call(self, func: Callable, *args, **kwargs):
        if self.state == "open":
            if time.time() - self.last_failure_time > self.timeout:
                self.state = "half-open"
            else:
                raise CircuitOpenException()

        try:
            result = await func(*args, **kwargs)
            self._on_success()
            return result
        except Exception as e:
            self._on_failure()
            raise e

    def _on_success(self):
        self.failure_count = 0
        self.state = "closed"

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()
        if self.failure_count >= self.failure_threshold:
            self.state = "open"
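
To make the state machine concrete, here is a standalone driver. It inlines a trimmed synchronous copy of the breaker (same transitions as the async class above) so it runs on its own:

```python
import time

class CircuitOpenException(Exception):
    pass

class CircuitBreaker:
    """Trimmed synchronous copy of the breaker above, for a standalone demo."""
    def __init__(self, failure_threshold=3, timeout=0.1):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = "closed"

    def call(self, func):
        if self.state == "open":
            # Probe again only after the cool-down period
            if time.time() - self.last_failure_time > self.timeout:
                self.state = "half-open"
            else:
                raise CircuitOpenException()
        try:
            result = func()
        except Exception:
            self.failure_count += 1
            self.last_failure_time = time.time()
            if self.failure_count >= self.failure_threshold:
                self.state = "open"
            raise
        self.failure_count = 0
        self.state = "closed"
        return result

breaker = CircuitBreaker(failure_threshold=3, timeout=0.1)

def flaky():
    raise RuntimeError("metadata service timed out")

# Three consecutive failures trip the breaker open ...
for _ in range(3):
    try:
        breaker.call(flaky)
    except RuntimeError:
        pass
assert breaker.state == "open"

# ... and after the timeout it half-opens; one success closes it again.
time.sleep(0.15)
assert breaker.call(lambda: "ok") == "ok"
assert breaker.state == "closed"
```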

Bulkheads

Isolate service dependencies to prevent cascading failures:

import asyncio
from typing import Callable

class BulkheadExecutor:
    def __init__(self, max_concurrent: int = 100):
        self.semaphore = asyncio.Semaphore(max_concurrent)

    async def execute(self, func: Callable, *args, **kwargs):
        async with self.semaphore:
            return await func(*args, **kwargs)

# Different bulkheads for different service calls
metadata_bulkhead = BulkheadExecutor(max_concurrent=50)
recommendation_bulkhead = BulkheadExecutor(max_concurrent=20)
playback_bulkhead = BulkheadExecutor(max_concurrent=30)

API Design

Streaming Endpoints

| Endpoint | Description |
| --- | --- |
| GET /api/v1/browse | Get personalized rows |
| GET /api/v1/titles/{id}/metadata | Get title details |
| GET /api/v1/titles/{id}/similars | Similar titles |
| GET /api/v1/playback/session | Initialize playback |
| POST /api/v1/playback/position | Update position |
| GET /api/v1/search?q={query} | Search titles |

Response Example

{
  "data": {
    "id": "81239481",
    "title": "Stranger Things",
    "type": "show",
    "poster_url": "https://cdn.netflix.com/poster.jpg",
    "backdrop_url": "https://cdn.netflix.com/backdrop.jpg",
    "rating": "TV-14",
    "year": 2024,
    "duration": "4 seasons",
    "synopsis": "When a young boy vanishes...",
    "genres": ["Drama", "Horror", "Sci-Fi"],
    "seasons": [
      {
        "season_num": 1,
        "episodes": [
          {
            "num": 1,
            "title": "The Vanishing",
            "duration_secs": 3600,
            "thumbnail": "https://cdn.netflix.com/s1e1.jpg"
          }
        ]
      }
    ]
  },
  "meta": {
    "request_id": "abc123",
    "version": "2.1"
  }
}

Conclusion

Netflix’s architecture demonstrates how to build a globally distributed streaming platform:

  • Open Connect CDN places content at ISP facilities worldwide
  • Adaptive bitrate streaming optimizes quality for each connection
  • Microservices enable independent scaling and deployment
  • Recommendation algorithms drive content discovery
  • Multi-region deployment ensures global availability

DRM and Entitlement Systems

Digital Rights Management (DRM) protects content from unauthorized copying. Netflix uses multiple DRM schemes for different platforms.

DRM Architecture

graph TB
    D[Device] -->|1. Initialize| L[License Server]
    L -->|2. Device ID + Content Key Request| E[Entitlement Service]
    E -->|3. Check subscription| DB[(User DB)]
    DB -->|4. Entitlement OK| E
    E -->|5. Grant license| L
    L -->|6. Encrypted License| D
    D -->|7. Decrypt with device key| K[Content Key]
    K -->|8. Play| V[Video Decrypt]

Multi-DRM Support

from typing import List

class DRMManager:
    """Handle multiple DRM schemes per platform"""

    DRM_SCHEMES = {
        "widevine": "com.widevine.alpha",      # Android, Chrome, many devices
        "playready": "com.microsoft.playready", # Windows, Xbox
        "fairplay": "com.apple.fairplay",      # iOS, Safari
        "clearkey": "org.w3.clearkey"           # Web fallback
    }

    def get_supported_drm(self, device_type: str) -> List[str]:
        """Return DRM schemes supported by device"""
        capabilities = {
            "android": ["widevine"],
            "ios": ["fairplay"],
            "web_safari": ["fairplay", "clearkey"],
            "web_chrome": ["widevine", "clearkey"],
            "windows": ["playready", "widevine", "clearkey"],
            "smarttv": ["widevine", "playready"]
        }
        return capabilities.get(device_type, ["widevine"])
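
Given those capability lists, scheme selection is an ordered intersection with whatever the content was packaged for. A minimal sketch (`negotiate_drm` is illustrative, not a Netflix API):

```python
def negotiate_drm(device_schemes, content_schemes):
    """Pick the first device-supported scheme the content is packaged for.

    device_schemes is assumed ordered by preference (highest security first),
    matching the lists returned by get_supported_drm() above.
    """
    for scheme in device_schemes:
        if scheme in content_schemes:
            return scheme
    return None  # no common scheme: playback must be refused

print(negotiate_drm(["playready", "widevine", "clearkey"], {"widevine", "fairplay"}))
# widevine
```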

Entitlement Checking

class EntitlementService:
    """Verify user can access specific content"""

    async def check_entitlement(
        self,
        user_id: int,
        title_id: str,
        device_type: str
    ) -> EntitlementResult:
        # Get user's subscription tier
        subscription = await self.user_service.get_subscription(user_id)

        # Get title's required tier
        title = await self.metadata_service.get_title(title_id)

        if subscription.tier < title.required_tier:
            return EntitlementResult(
                allowed=False,
                reason="subscription_tier_too_low",
                upgrade_to=title.required_tier
            )

        # Check concurrent stream limit
        active_streams = await self.playback_service.count_active_streams(user_id)

        if active_streams >= subscription.max_streams:
            return EntitlementResult(
                allowed=False,
                reason="max_streams_exceeded",
                active_streams=active_streams
            )

        return EntitlementResult(allowed=True)

Multi-Device Session Management

Users watch Netflix on multiple devices. Sessions must sync playback position and handle concurrent playback limits.

Session State

from datetime import datetime

class PlaybackSession:
    """Represent an active playback session"""

    def __init__(
        self,
        session_id: str,
        user_id: int,
        title_id: str,
        device_type: str,
        position_seconds: int,
        quality: str
    ):
        self.session_id = session_id
        self.user_id = user_id
        self.title_id = title_id
        self.device_type = device_type
        self.position_seconds = position_seconds
        self.quality = quality
        self.started_at = datetime.utcnow()
        self.last_heartbeat = datetime.utcnow()

    async def update_position(self, position_seconds: int):
        """Update playback position"""
        self.position_seconds = position_seconds
        self.last_heartbeat = datetime.utcnow()

        # Persist to storage
        await self.playback_store.save_position(
            self.session_id,
            position_seconds
        )

        # Invalidate device position cache
        await self.cache.delete(f"position:{self.user_id}:{self.title_id}")

Continue Watching Sync

from uuid import uuid4

class ContinueWatchingService:
    """Sync playback position across devices"""

    async def get_resume_position(
        self,
        user_id: int,
        title_id: str,
        requesting_device: str
    ) -> ResumePosition:
        # Check if another device has more recent position
        active_sessions = await self.session_manager.get_active_sessions(
            user_id,
            exclude_device=requesting_device
        )

        # Find session with this title
        for session in active_sessions:
            if session.title_id == title_id:
                return ResumePosition(
                    position_seconds=session.position_seconds,
                    device=session.device_type,
                    updated_at=session.last_heartbeat
                )

        # Fallback to database
        return await self.playback_store.get_position(user_id, title_id)

    async def handle_playback_start(
        self,
        user_id: int,
        title_id: str,
        device_type: str
    ) -> Session:
        # Create new session
        session = PlaybackSession(
            session_id=uuid4(),
            user_id=user_id,
            title_id=title_id,
            device_type=device_type,
            position_seconds=0,
            quality="auto"
        )

        # Register session
        await self.session_manager.register(session)

        # Enforce concurrent stream limit
        await self._enforce_stream_limit(user_id)

        return session

    async def _enforce_stream_limit(self, user_id: int):
        """Ensure user hasn't exceeded stream limit"""
        subscription = await self.user_service.get_subscription(user_id)
        active = await self.session_manager.count_active(user_id)

        if active > subscription.max_streams:
            # Force oldest session to stop
            oldest = await self.session_manager.get_oldest_session(user_id)
            await self.session_manager.terminate(oldest.session_id)

Adaptive Bitrate Streaming Deep Dive

ABR Algorithm Details

class AdaptiveBitrateController:
    """Netflix's adaptive bitrate selection algorithm"""

    def __init__(self):
        # Quality levels (lowest to highest)
        self.levels = [
            {"name": "auto", "min_bandwidth": 0},
            {"name": "360p", "min_bandwidth": 0.7, "max_bandwidth": 1.5},
            {"name": "480p", "min_bandwidth": 1.5, "max_bandwidth": 3},
            {"name": "720p", "min_bandwidth": 3, "max_bandwidth": 5},
            {"name": "1080p", "min_bandwidth": 5, "max_bandwidth": 10},
            {"name": "4K", "min_bandwidth": 10, "max_bandwidth": float('inf')}
        ]

        # State
        self.current_level = 0
        self.buffer_levels = []  # Ring buffer of recent buffer levels
        self.throughput_samples = []  # Ring buffer of recent throughput

    def calculate_throughput(self, segments: List[Segment]) -> float:
        """Estimate effective throughput in Mbps from recent segments"""
        if not segments:
            return 0.0

        # Weight recent samples higher; align weights with however many
        # of the last four segments we actually have
        recent = segments[-4:]
        weights = [0.1, 0.15, 0.25, 0.5][-len(recent):]  # Oldest to newest
        weighted = sum(
            (s.size_mb * 8 / s.download_time) * w  # MB and seconds -> Mbps
            for s, w in zip(recent, weights)
        )
        return weighted / sum(weights)

    def select_quality(self, throughput: float, buffer_level: float) -> str:
        """Select optimal quality based on conditions"""
        # Determine if we're in startup, steady, or buffer-depleted state
        state = self._classify_state(buffer_level)

        if state == "startup":
            # During startup, download multiple qualities in parallel
            return self.levels[0]["name"]  # "auto"

        elif state == "buffer_depleted":
            # Buffer running low - switch to lower quality
            return self._select_lower_quality(throughput)

        elif state == "steady":
            # Try to improve quality if buffer is healthy
            return self._select_quality_for_throughput(throughput)

        return self.levels[self.current_level]["name"]

    def _classify_state(self, buffer_level: float) -> str:
        if buffer_level < 10:
            return "buffer_depleted"
        elif buffer_level < 60:
            return "steady"
        else:
            return "startup"  # Large buffer, can experiment

CDN Cache Invalidation

graph TB
    A[Content Update] --> B{New encode or metadata change?}
    B -->|Metadata| C[Update metadata service]
    B -->|New encode| D[Push to origin]
    D --> E[Invalidate edge caches]
    E --> F[CDN propagates in < 30s]

    subgraph "Cache Invalidation Strategy"
        C
        E
    end

class CDNInvalidationService:
    """Handle content cache invalidation across CDN"""

    async def invalidate_title(self, title_id: str, reason: str):
        """Invalidate all cached data for a title"""
        # Invalidate manifest files
        await self.cdn.invalidate(f"/title/{title_id}/*.m3u8")

        # Invalidate metadata
        await self.cdn.invalidate(f"/api/metadata/{title_id}")

        # Invalidate thumbnails
        await self.cdn.invalidate(f"/images/{title_id}/*")

        # Log invalidation for audit
        await self.audit_log.record({
            "event": "cache_invalidation",
            "title_id": title_id,
            "reason": reason,
            "timestamp": datetime.utcnow()
        })

Production Failure Scenarios

| Failure Scenario | Impact | Mitigation |
| --- | --- | --- |
| CDN origin failure | Video segments unavailable | Multi-CDN; fallback to direct streaming |
| License server down | No new streams can start | Cache licenses; graceful degradation |
| Recommendation service slow | Homepage takes longer to load | Cache recommendations; show stale content |
| Encoding pipeline backlog | New content delayed | Priority encoding; capacity headroom |
| User exceeds stream limit | New streams rejected | Clear error message; upgrade prompt |

Observability Checklist

Metrics to Capture

  • stream_startup_time_seconds (histogram) - Time to first frame
  • bitrate_selected (histogram) - Quality distribution
  • rebuffer_ratio (gauge) - Time spent rebuffering vs playing
  • cdn_cache_hit_ratio (gauge) - Cache efficiency
  • license_request_latency_ms (histogram) - DRM overhead
  • concurrent_streams (gauge) - Active stream count
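
Most of these are emitted directly by the player; rebuffer_ratio has to be derived per session. One way to compute it (a sketch; the parameter names are illustrative):

```python
def rebuffer_ratio(play_seconds: float, rebuffer_seconds: float) -> float:
    """Fraction of total session time spent stalled rather than playing."""
    total = play_seconds + rebuffer_seconds
    return rebuffer_seconds / total if total else 0.0

# A 30-minute session that stalled for 45 seconds in total:
print(round(rebuffer_ratio(1800, 45), 4))  # 0.0244
```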

Alerts to Configure

| Alert | Threshold | Severity |
| --- | --- | --- |
| Startup time P99 > 3s | 3000ms | Warning |
| Rebuffer ratio > 5% | 5% | Warning |
| CDN cache hit < 90% | 90% | Warning |
| License latency P99 > 200ms | 200ms | Critical |
| Active streams < expected | < 50% baseline | Warning |

Security Checklist

  • DRM encryption for all premium content
  • Device attestation before issuing licenses
  • HDCP enforcement for high-definition outputs
  • Screen capture detection and blocking
  • Concurrent stream enforcement
  • Geographic restrictions per title
  • Secure token exchange for session management
  • Content signing to prevent tampering

Common Pitfalls / Anti-Patterns

Pitfall 1: Aggressive Quality Switching

Problem: Switching quality too frequently creates a “flutter” effect that’s visually jarring.

Solution: Implement quality stability windows. Once you switch up or down, stay at that level for at least 30 seconds.
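
The stability window can be enforced with a small wrapper around whatever the ABR controller proposes (a sketch; the 30-second window is the figure suggested above, and the injectable clock is only there to make the example testable):

```python
import time

class StableQualitySwitcher:
    """Suppress quality switches that arrive inside the stability window."""

    def __init__(self, window_seconds=30, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self.current = None
        self.last_switch = None

    def propose(self, quality):
        now = self.clock()
        if self.current is None:
            self.current, self.last_switch = quality, now
        elif quality != self.current and now - self.last_switch >= self.window_seconds:
            self.current, self.last_switch = quality, now
        return self.current  # proposals inside the window are ignored

# Simulated clock: a switch inside the window is suppressed.
t = [0.0]
sw = StableQualitySwitcher(window_seconds=30, clock=lambda: t[0])
assert sw.propose("1080p") == "1080p"
t[0] = 10
assert sw.propose("720p") == "1080p"   # too soon, keep 1080p
t[0] = 35
assert sw.propose("720p") == "720p"    # window elapsed, switch allowed
```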

Pitfall 2: Ignoring Network Variability

Problem: Using average throughput misses spikes and drops.

Solution: Use weighted average that emphasizes recent samples. Build in safety margins (use 70-80% of measured throughput for decisions).

Pitfall 3: Not Testing on Real Networks

Problem: Lab testing does not capture real-world variability (WiFi interference, cellular handoffs).

Solution: A/B test ABR algorithms on real users. Monitor quality distributions and rebuffer ratios in production.


Interview Q&A

Q: Why does Netflix build its own CDN (Open Connect) instead of using existing CDNs?

A: Netflix streams billions of hours monthly. At that scale, commercial CDN costs become prohibitive. Open Connect appliances are custom-built for Netflix’s workload (video streaming, not general web content). They are deployed at ISP facilities worldwide, placing content close to users while reducing Netflix’s backbone costs. The economics only work at Netflix’s scale.

Q: How does adaptive bitrate streaming work?

A: Netflix encodes each title in multiple quality levels (4K, 1080p, 720p, etc.). The device downloads an HLS/DASH manifest listing available quality levels. The client measures download speed and buffer fullness. If buffer runs low, it switches to lower quality. If buffer is healthy and bandwidth is high, it switches up. This happens every few seconds during playback.

Q: How does Netflix enforce concurrent stream limits?

A: The entitlement service tracks active playback sessions per user. When a user starts playback, it counts existing sessions. If the count exceeds the subscription tier limit (e.g., 4 streams for premium), the oldest session is terminated. The device receives an error with an upgrade prompt.

Q: What is the difference between Widevine, PlayReady, and FairPlay?

A: These are DRM schemes for different platforms. Widevine (Google) runs on Android, Chrome, most smart TVs. PlayReady (Microsoft) runs on Windows, Xbox, some smart TVs. FairPlay (Apple) runs on iOS, Safari, Apple TV. Netflix negotiates with the device to select the highest security level the device supports. Content is encrypted once and licenses are delivered via the device’s preferred DRM.


Scenario Drills

Scenario 1: CDN Origin Server Failure

Situation: The CDN origin serving video segments goes down during peak viewing hours.

Analysis:

  • Devices requesting segments get errors
  • Playback stalls, rebuffering begins
  • Millions of concurrent streams affected

Solution: Multi-CDN deployment with automatic failover. If one CDN has issues, traffic routes to another. Open Connect has multiple origin clusters geographically distributed. Devices can switch CDN transparently if segment requests fail.

Scenario 2: New Show Releases Simultaneously Worldwide

Situation: A highly anticipated show releases globally at midnight UTC. 10 million users try to start playback simultaneously.

Analysis:

  • Encoding pipeline must complete all quality levels before release
  • CDN edge caches start cold for a new title
  • License servers receive burst of requests

Solution: Pre-encode content days before release. Pre-position popular titles at ISP locations. License server scales horizontally; licenses are cached for their validity period to reduce server load.

Scenario 3: ABR Algorithm Causes Quality Flutter

Situation: Users report constantly changing video quality, creating a jarring viewing experience.

Analysis:

  • ABR switches quality too frequently
  • Bandwidth measurements fluctuate (wireless networks)
  • Buffer thresholds trigger rapid up/down switching

Solution: Implement stability windows. Once you switch to a quality level, stay there for at least 30 seconds. Use weighted throughput averages that emphasize recent samples. Build in safety margins (use 70% of measured throughput for decisions).


Failure Flow Diagrams

Stream Playback Initialization

graph TD
    A[User Clicks Play] --> B[Get Manifest from CDN]
    B --> C[Parse Quality Levels]
    C --> D[Select Initial Quality]
    D --> E[Request Video Segment]
    E --> F{Segment Available?}
    F -->|No| G[Try Alternative CDN]
    F -->|Yes| H[Download Segment]
    G --> E
    H --> I[Request DRM License]
    I --> J[License Server]
    J --> K{Check Entitlement?}
    K -->|No| L[Return Error]
    K -->|Yes| M[Return License Key]
    M --> N[Decrypt Segment]
    N --> O[Decode and Display]
    O --> P[Buffer Next Segment]
    P --> E

ABR Quality Selection

graph TD
    A[Monitor Buffer Level] --> B{Buffer < 10s?}
    B -->|Yes| C[Select Lower Quality]
    B -->|No| D{Buffer > 60s?}
    D -->|Yes| E[Select Higher Quality]
    D -->|No| F[Maintain Current Quality]
    C --> G[Download at New Quality]
    E --> G
    F --> H[Continue Current Quality]
    G --> H
    H --> A

Multi-CDN Failover

graph TD
    A[Request Segment] --> B[Primary CDN]
    B --> C{Response OK?}
    C -->|Yes| D[Deliver to Device]
    C -->|No| E[Try Secondary CDN]
    E --> F{Response OK?}
    F -->|Yes| G[Deliver to Device]
    F -->|No| H[Fallback to Origin Direct]
    H --> I{Origin Available?}
    I -->|Yes| J[Deliver to Device]
    I -->|No| K[Show Error]
    D --> L[Report CDN Health]
    G --> L
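
The failover ladder above collapses into a simple loop over delivery tiers. A sketch under stated assumptions (the callables stand in for real HTTP clients; names are illustrative):

```python
def fetch_segment(url_path, cdns, origin):
    """Try each CDN in order, then the origin, mirroring the flow above.

    Each entry in cdns (and origin) is a callable that returns segment
    bytes or raises IOError on failure.
    """
    for fetch in list(cdns) + [origin]:
        try:
            return fetch(url_path)
        except IOError:
            continue  # report CDN health, then fall through to the next tier
    raise IOError(f"all delivery tiers failed for {url_path}")

def down(_path):
    raise IOError("cdn unreachable")

primary, secondary = down, lambda path: b"segment-bytes"
print(fetch_segment("/video/segment1.ts", [primary, secondary], origin=down))
# b'segment-bytes'
```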

Quick Recap

  • Netflix’s Open Connect CDN places servers at ISP locations worldwide.
  • Adaptive bitrate streaming selects quality based on bandwidth and buffer health.
  • DRM (Widevine, PlayReady, FairPlay) protects content on each platform.
  • Entitlement service enforces subscription tier and concurrent stream limits.
  • Session sync allows “continue watching” across devices.

Copy/Paste Checklist

- [ ] Implement ABR with buffer-aware quality selection
- [ ] Use multi-CDN with failover
- [ ] Cache CDN responses aggressively
- [ ] Enforce concurrent stream limits per subscription
- [ ] Implement session sync for continue watching
- [ ] Monitor stream startup time and rebuffer ratio
- [ ] Test ABR on real networks, not just labs

For more on CDN design, see our CDN Deep Dive guide. For database strategies, see NoSQL Databases. For caching patterns, see Distributed Caching.
