CDN Deep Dive: Content Delivery Networks and Edge Computing

Understand CDNs from PoPs and anycast routing to cache headers and edge computing. Configure Cloudflare for performance.

Reading time: 17 min

A CDN sits between your server and your users. It caches your content at edge locations worldwide so users fetch from a server down the street instead of halfway across the world. The result: faster page loads, less bandwidth cost, and protection against traffic spikes.

But CDNs are more than just static file caches. Modern CDNs do request routing, DDoS protection, image optimization, edge computing, and more.


How CDNs Work

The Basic Idea

graph LR
    A[User in Tokyo] --> B[CDN Edge<br/>Tokyo]
    A -->|Without CDN| C[Origin Server<br/>Virginia, USA]

    B --> D[Cache HIT<br/>Fast response]
    C --> E[Cache MISS<br/>Slow response]

Without CDN: User in Tokyo fetches from Virginia. Round-trip time: 200ms+.

With CDN: User in Tokyo fetches from Tokyo edge. Round-trip time: 5ms.

Points of Presence (PoPs)

CDNs maintain a network of servers called Points of Presence (PoPs). Each PoP has edge servers that serve cached content and handle requests.

A large CDN like Cloudflare or Fastly has hundreds of PoPs globally. When a user makes a request, the CDN routes it to the nearest PoP using anycast routing.

graph TD
    A[User Request] --> B[DNS Resolution]
    B --> C[Anycast Routing]
    C --> D[Nearest PoP]
    D --> E{Content Cached?}
    E -->|Yes| F[Return from cache]
    E -->|No| G[Fetch from origin]
    G --> H[Cache at PoP]
    H --> F

Anycast Routing

CDNs use anycast for routing. Multiple PoPs announce the same IP address, and BGP delivers each packet to the PoP with the best network path. That is usually, but not always, the geographically closest one.

# How anycast works (simplified)
# All PoPs announce: 104.16.100.1 is here
# User's router sees multiple paths, picks shortest

# BGP routes
# Tokyo PoP: 104.16.100.1 via AS12345
# Virginia PoP: 104.16.100.1 via AS12345
# User in Tokyo gets routed to Tokyo PoP
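
The selection step can be sketched in a few lines. This is a toy model rather than real BGP, which also weighs local preference, MED, and operator policy. The `Announcement` type, the PoP names, and the AS numbers (taken from the documentation range) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    pop: str       # PoP that originated the announcement
    prefix: str    # the shared anycast prefix
    as_path: tuple # ASes between this router and the PoP


def pick_route(announcements):
    """Prefer the shortest AS path; ties go to the first announcement seen."""
    return min(announcements, key=lambda a: len(a.as_path))


# Routes as seen by a hypothetical router in Tokyo
routes = [
    Announcement("Virginia", "104.16.100.0/24", (64500, 64501, 64502, 12345)),
    Announcement("Tokyo", "104.16.100.0/24", (64500, 12345)),
]
print(pick_route(routes).pop)  # Tokyo
```

Note that the winner is the PoP with the shortest network path, which is why anycast usually, but not always, lands on the geographically nearest PoP.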

Cache Headers

CDNs respect HTTP cache headers. Getting these right is essential for CDN performance.

Cache-Control

# Don't cache at all (private content)
Cache-Control: private, no-store

# Cache everywhere for 1 hour
Cache-Control: public, max-age=3600

# Cache at edge only (not in browsers), for 1 day
Cache-Control: public, s-maxage=86400, max-age=0

# Stale-while-revalidate: serve stale while fetching update
Cache-Control: public, max-age=3600, stale-while-revalidate=60

# Immutable: content never changes (perfect for versioned assets)
Cache-Control: public, max-age=31536000, immutable
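
To see how a cache might read these directives, here is a minimal sketch that parses a Cache-Control value and computes the freshness lifetime for a shared cache (the CDN) versus a private cache (the browser). The function names are illustrative, and the logic is simplified: it treats `no-cache` as a zero TTL and ignores `Expires` and heuristic freshness.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header into {directive: argument-or-None}."""
    directives: dict = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        cleaned: Optional[str] = arg.strip().strip('"') or None
        directives[name.strip().lower()] = cleaned
    return directives


def ttl_seconds(value: str, shared_cache: bool) -> int:
    """Freshness lifetime in seconds for a CDN (shared) or browser (private) cache."""
    d = parse_cache_control(value)
    if "no-store" in d or "no-cache" in d:
        return 0
    if shared_cache and "private" in d:
        return 0
    if shared_cache and d.get("s-maxage") is not None:
        return int(d["s-maxage"])  # s-maxage overrides max-age for shared caches
    return int(d.get("max-age") or 0)


header = "public, s-maxage=86400, max-age=0"
print(ttl_seconds(header, shared_cache=True))   # 86400
print(ttl_seconds(header, shared_cache=False))  # 0
```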

When to Use What

| Header | Use Case |
| --- | --- |
| private, no-store | User-specific data, credentials, payment info |
| public, max-age=3600 | API responses that change hourly |
| public, s-maxage=86400, max-age=0 | Static content, cached at edge only |
| public, max-age=31536000, immutable | Versioned assets (JS bundles, images with hashes) |

Vary Header

Tells the CDN that the response varies with the listed request headers, so each variant must be cached separately.

# Cache different versions based on Accept-Encoding
Vary: Accept-Encoding

# Cache different versions based on Authorization
Vary: Authorization

# Common combination
Vary: Accept-Encoding, Accept-Language

Warning: Every distinct combination of values for the headers listed in Vary creates a separate cache entry. Too many variations fragment your CDN cache and lower the hit rate.
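
A small sketch makes the multiplication visible. It assumes a simplified cache key made of the URL plus the value of each request header named in Vary; real CDNs normalize some headers (notably Accept-Encoding) before keying, which this sketch does not do.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus each request header named in Vary."""
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)


url = "https://example.com/app.js"
requests_seen = [
    {"Accept-Encoding": "gzip", "Accept-Language": "en"},
    {"Accept-Encoding": "br", "Accept-Language": "en"},
    {"Accept-Encoding": "gzip", "Accept-Language": "ja"},
    {"Accept-Encoding": "gzip", "Accept-Language": "en"},  # same as the first
]
keys = {cache_key(url, h, "Accept-Encoding, Accept-Language") for h in requests_seen}
print(len(keys))  # 3 distinct cache entries for one URL
```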


Cache Invalidation

Sometimes you need to force the CDN to discard cached content before its TTL expires.

Purge Methods

# Cloudflare API purge
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE/purge_cache" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"files":["https://example.com/style.css"]}'

# Purge everything
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE/purge_cache" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"purge_everything": true}'

Cache Tags (Surrogate Keys)

Some CDNs support cache tagging for granular invalidation.

# Set cache tag on response
Cache-Tag: product, product-123, category-electronics

# Purge by tag
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE/purge_cache" \
  -H "Authorization: Bearer $TOKEN" \
  --data '{"tags":["product-123"]}'

This lets you invalidate all pages containing product 123 without purging the entire cache.
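
Conceptually, tag-based purging is a reverse index from tag to cached URLs. The sketch below is a toy in-memory model, not how any particular CDN implements it; `TaggedCache` and its methods are hypothetical names.

```python
from collections import defaultdict


class TaggedCache:
    """Toy in-memory cache with a reverse index from tag to URLs."""

    def __init__(self):
        self.entries = {}                    # url -> body
        self.urls_by_tag = defaultdict(set)  # tag -> set of urls

    def put(self, url: str, body: bytes, cache_tag_header: str) -> None:
        """Store a response and index it under every tag in its Cache-Tag header."""
        self.entries[url] = body
        for tag in cache_tag_header.split(","):
            if tag.strip():
                self.urls_by_tag[tag.strip()].add(url)

    def purge_tag(self, tag: str) -> int:
        """Drop every entry carrying the tag; return how many were removed."""
        removed = 0
        for url in self.urls_by_tag.pop(tag, set()):
            if self.entries.pop(url, None) is not None:
                removed += 1
        return removed


cache = TaggedCache()
cache.put("/products/123", b"...", "product, product-123, category-electronics")
cache.put("/category/electronics", b"...", "category-electronics, product-123")
cache.put("/products/456", b"...", "product, product-456")
print(cache.purge_tag("product-123"))  # 2
print(sorted(cache.entries))           # ['/products/456']
```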


CDN Configuration for Static Sites

Typical Configuration

# HTML: short cache, revalidate
Cache-Control: public, max-age=60, stale-while-revalidate=300

# Static assets with hashes (immutable)
Cache-Control: public, max-age=31536000, immutable

# Images: medium cache
Cache-Control: public, max-age=86400, stale-while-revalidate=3600
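
One way to apply these three policies is a small path-to-header mapping at the origin. This is a sketch under assumptions: the regular expressions, the six-character minimum hash length, and the function name are illustrative choices, not a standard.

```python
import re

# Filenames carrying a hex content hash, e.g. /assets/app.a3f5d8.js
HASHED_ASSET = re.compile(r"\.[0-9a-f]{6,}\.(js|css|woff2?|png|jpe?g|svg)$")
IMAGE = re.compile(r"\.(png|jpe?g|gif|webp|svg)$")


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value based on what kind of file the path names."""
    if HASHED_ASSET.search(path):
        return "public, max-age=31536000, immutable"
    if IMAGE.search(path):
        return "public, max-age=86400, stale-while-revalidate=3600"
    # Everything else is treated as HTML: short cache, revalidate in background
    return "public, max-age=60, stale-while-revalidate=300"


for path in ("/assets/app.a3f5d8.js", "/images/hero.png", "/blog/post/"):
    print(path, "->", cache_control_for(path))
```

Checking the hashed-asset pattern first matters: a hashed image such as /images/logo.3fa9c2d1.png should get the immutable policy, not the one-day image policy.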

Pretty URLs vs File Paths

Static site generators produce file paths like /blog/post/index.html, while visitors request pretty URLs like /blog/post/. The same page can then end up cached under more than one URL, which makes cache invalidation tricky.

Better: use an origin-pull CDN, where the CDN fetches from your origin on a cache miss and caches the result. Your HTML points to the CDN URLs.

<!-- Instead of -->
<img src="/assets/image.png" />

<!-- Use CDN URL -->
<img src="https://cdn.example.com/assets/image.png" />

Cloudflare Configuration

Cloudflare is one of the most popular CDNs. Here’s how to configure it.

Page Rules

Page Rules control caching behavior per URL pattern.

# Rule 1: Cache everything static
# Pattern: *example.com/static/*
# Settings:
#   - Cache Level: Cache Everything
#   - Edge Cache TTL: 1 month
#   - Browser Cache TTL: 1 year

# Rule 2: Don't cache API
# Pattern: *example.com/api/*
# Settings:
#   - Cache Level: Bypass

# Rule 3: HTML cache with auto-revalidate
# Pattern: *example.com/blog/*
# Settings:
#   - Cache Level: Cache Everything (Standard does not cache HTML)
#   - Edge Cache TTL: 1 hour
#   - Browser Cache TTL: 30 minutes

Caching Level

# Standard (caches static file types; the query string is part of the cache key)
# Cache Everything (treats all content, including HTML, as cacheable)
# Bypass (never cache)

Edge Caching

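Cloudflare’s Tiered Cache (originally offered as part of Argo) adds an upper cache tier: on a miss, an edge PoP asks a designated upper-tier PoP before going to your origin.

# Enable Tiered Cache
# Dashboard: Caching -> Tiered Cache (menu names may differ between dashboard versions)

# This improves cache hit rate
# because fewer PoPs need to contact the origin directly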

Edge Computing

Modern CDNs support edge computing: running code at edge locations instead of just caching.

Cloudflare Workers

// Cloudflare Worker: runs at edge
addEventListener("fetch", (event) => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  const url = new URL(request.url);

  // For API requests, forward to origin and add a CORS header to the response
  if (url.pathname.startsWith("/api/")) {
    // Add CORS headers
    const response = await fetch(request);
    const headers = new Headers(response.headers);
    headers.set("Access-Control-Allow-Origin", "*");
    return new Response(response.body, {
      status: response.status,
      headers: headers,
    });
  }

  // Pass through to origin
  return fetch(request);
}

Edge Use Cases

| Use Case | Description |
| --- | --- |
| A/B testing | Route users to variants at edge |
| Geolocation | Serve different content based on user location |
| Authentication | Validate tokens before hitting origin |
| Rate limiting | Limit requests at edge, before they reach origin |
| Personalization | Modify content per user at edge |
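
As an example of the first row, A/B routing at the edge usually needs a deterministic assignment so a user sees the same variant on every request without a lookup. The sketch below shows the hashing idea in Python for readability; on Cloudflare Workers the same logic would be written in JavaScript. The function name and experiment label are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants: list) -> str:
    """Map a user to a variant deterministically by hashing experiment + user."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


variants = ["control", "treatment"]
first = assign_variant("user-42", "homepage-hero", variants)
second = assign_variant("user-42", "homepage-hero", variants)
print(first == second)  # True: the same user always gets the same variant
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user in "treatment" for one test is not automatically in "treatment" for the next.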

Performance Impact

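Here’s what a CDN typically does for page load times. The figures are illustrative; actual numbers depend on provider, region, and content.

| Metric | Without CDN | With CDN |
| --- | --- | --- |
| TTFB | 200-500ms | 5-50ms |
| Page Load | 3-5s | 1-2s |
| Bandwidth Cost | $1.00/GB | $0.08/GB |

TTFB (Time To First Byte) drops dramatically on cache hits because the edge answers from its local cache instead of making a round trip to the origin.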


Common CDN Mistakes

1. Caching Private Content

Never cache user-specific content. Use private or no-store.

# WRONG: caches user data
Cache-Control: public

# RIGHT: don't cache private content
Cache-Control: private, no-store

2. Ignoring Query Strings

CDNs treat /page?v=1 and /page?v=2 as different URLs by default. If your framework appends a new query string on every build, each build creates a fresh set of cache entries while the old ones linger until they expire. Put the version in the filename instead.

# Instead of ?v= query strings
# Use content-hashed filenames
# /app.a3f5d8.js (hash in filename)
# /blog/post (no query string)
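
Build tools normally generate these names for you, but the idea fits in a few lines. This is a sketch assuming SHA-256 and an eight-character prefix; `hashed_name` is a hypothetical helper, and real bundlers choose their own hash function and length.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, length: int = 8) -> str:
    """Insert a prefix of the content's SHA-256 digest before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


v1 = hashed_name("/static/app.js", b"console.log('v1');")
v2 = hashed_name("/static/app.js", b"console.log('v2');")
print(v1 != v2)  # True: changed content yields a new URL, so no purge is needed
```

Because the URL changes only when the bytes change, the old and new versions can both sit in cache with a one-year TTL without ever serving the wrong one.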

3. Over-Caching HTML

HTML changes frequently. Don’t set long cache times on HTML unless you have a solid invalidation strategy.

# Conservative HTML caching
Cache-Control: public, max-age=60, stale-while-revalidate=300
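
The decision a cache makes under this header can be sketched as a three-way branch on the response's age. This is a simplified reading of the stale-while-revalidate semantics; the function and the returned labels are illustrative.

```python
def cache_decision(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Decide how to answer a request given the cached response's age in seconds."""
    if age < max_age:
        return "serve_fresh"
    if age < max_age + stale_while_revalidate:
        # Answer immediately from cache and refresh in the background
        return "serve_stale_and_revalidate"
    return "fetch_from_origin"


# max-age=60, stale-while-revalidate=300
print(cache_decision(30, 60, 300))   # serve_fresh
print(cache_decision(120, 60, 300))  # serve_stale_and_revalidate
print(cache_decision(400, 60, 300))  # fetch_from_origin
```

With these values, a page is at most about six minutes out of date, yet users wait on the origin only when nobody has requested the page for that long.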

4. Not Using immutable

Versioned assets (hashed filenames) can be cached forever.

# For hashed assets: 1 year cache, immutable
Cache-Control: public, max-age=31536000, immutable

When to Use / When Not to Use

| Use Case | When to Use CDN | When Not to Use CDN |
| --- | --- | --- |
| Static Assets | Versioned JS/CSS bundles, images, fonts | Rarely updated content needing instant purging |
| HTML Pages | Mostly static blog/documentation sites | Frequently updated real-time dashboards |
| API Responses | Public APIs with identical responses for all users | User-specific, authenticated, or dynamic content |
| Video/Streaming | VOD, large file distribution | Live streaming (use specialized streaming CDN) |
| Global User Base | Users distributed across geographies | Localized single-region user base |
| Traffic Spikes | Expecting sudden popularity surges | Predictable steady traffic patterns |

When CDN is Essential

  • Public website serving users globally
  • Static-first architecture with versioned assets
  • Cost optimization for bandwidth-heavy content
  • DDoS protection and security layer needed
  • SEO improvement via fast global load times

When CDN is Overkill

  • Fully dynamic, personalized content on every request
  • Internal/private applications behind VPN
  • Real-time data (stock prices, live sports)
  • Content that changes constantly (search results)
  • Applications with very small, localized user bases

Production Failure Scenarios

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Cache doesn’t purge | Users see stale content after update | Use cache tags; implement versioning; purge on deploy |
| CDN origin pull storm | Cache miss wave hits origin | Implement origin shield; gradual cache warming; rate limiting |
| SSL certificate expiry | CDN serves no content over HTTPS | Automated cert renewal (Let’s Encrypt); monitoring |
| CDN goes down globally | All traffic fails or falls back to slow origin | Configure origin fallback; multi-CDN strategy |
| Regional PoP failure | Users in region experience latency/outage | CDN redundancy; anycast routing |
| Cache poisoning | Malicious content cached and served | Validate origin responses; signed URLs; integrity checks |
| Header misconfiguration | Private content cached publicly | Audit cache headers; use private for user data |
| Query string abuse | Different cache entry per query param | Ignore or normalize query strings; use cache busting |

Capacity Estimation: Edge Cache Sizing, PoP Count, and Bandwidth Planning

Sizing a CDN deployment requires understanding how much cache lives at the edge and how much origin bandwidth you actually need.

Edge cache sizing formula:

edge_cache_per_PoP = disk_size_per_edge_node × number_of_edge_nodes_per_PoP
total_edge_capacity = edge_cache_per_PoP × number_of_PoPs
bytes_from_cache = bytes_delivered × cache_hit_ratio

For a CDN with 100 PoPs, each having 10 edge nodes with 1TB SSD each:

  • Per PoP: 10TB
  • Total: 1PB raw capacity (PoPs hold overlapping copies of popular objects, so the unique content cached is far smaller)
  • At a 90% hit rate: of every 100TB delivered, 90TB is served from cache and 10TB from origin

Origin bandwidth planning:

origin_bandwidth = total_bandwidth × (1 - cache_hit_ratio)
peak_origin_bandwidth = origin_bandwidth × peak_factor

If you start from request counts instead, total_bandwidth = requests_per_second × avg_response_size.

If your site serves 10Gbps total traffic with an 80% CDN hit rate, 2Gbps hits origin. With a 3× peak factor, origin must handle 6Gbps. For a typical origin server with a 10GbE NIC, this is manageable. For 50Gbps total with a 70% hit rate, origin sees 15Gbps on average and 45Gbps at a 3× peak, which requires load balancers and multiple origin servers.
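
Working the numbers in code keeps the average and the peak apart. This is a minimal sketch of the arithmetic only; `origin_gbps` is a hypothetical helper, and the 3× peak factor is the assumption used in the text.

```python
def origin_gbps(total_gbps: float, hit_ratio: float, peak_factor: float = 1.0) -> float:
    """Bandwidth that reaches the origin: the cache-miss share, scaled for peak."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_gbps * (1.0 - hit_ratio) * peak_factor


print(f"{origin_gbps(10, 0.80):.1f}")                 # 2.0  (average)
print(f"{origin_gbps(10, 0.80, peak_factor=3):.1f}")  # 6.0  (peak)
print(f"{origin_gbps(50, 0.70):.1f}")                 # 15.0 (average)
print(f"{origin_gbps(50, 0.70, peak_factor=3):.1f}")  # 45.0 (peak)
```

The second scenario shows how sensitive origin load is to hit ratio: dropping from 80% to 70% raises the miss share from 20% to 30%, a 50% increase in origin traffic for the same total.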

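PoP count planning: Latency grows with the distance to the nearest PoP, so the PoP count follows from the area you need to cover. As rough approximations:

round_trip_latency ≈ 2 × distance_to_nearest_PoP / signal_speed_in_fiber   (about 1ms per 100km)
number_of_PoPs_needed ≈ coverage_area / (π × avg_PoP_radius²)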

Cloudflare has 300+ PoPs globally, Fastly has 80+. For most companies, using an existing CDN means you inherit their PoP count. If you are evaluating multi-CDN, measure your user geographic distribution and match to CDN PoP coverage in those regions.

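Cache hit rate estimation: Every unique object has to miss at least once, which caps the hit rate a PoP can reach:

max_hit_rate ≈ 1 - unique_objects_per_day / total_requests_per_day
(valid while unique_objects_per_day × avg_object_size ≤ cache_size_per_PoP; a smaller cache evicts objects and lowers the rate)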

Static sites with 1000 unique objects and 1M daily requests have high hit rate potential. Dynamic APIs with millions of unique query strings per day will always have low hit rate regardless of cache size.
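
A simple upper bound makes the contrast concrete. The sketch assumes every object fits in cache, never expires during the day, and misses exactly once per PoP that serves it; the `pops` parameter models the pessimistic case where every PoP requests every object. `max_hit_rate` is a hypothetical helper.

```python
def max_hit_rate(unique_objects: int, total_requests: int, pops: int = 1) -> float:
    """Upper bound on hit rate when each object misses once per PoP."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    compulsory_misses = unique_objects * pops
    return max(0.0, 1.0 - compulsory_misses / total_requests)


print(f"{max_hit_rate(1_000, 1_000_000):.3f}")            # 0.999 static site, one PoP
print(f"{max_hit_rate(1_000, 1_000_000, pops=100):.3f}")  # 0.900 same site, 100 cold PoPs
print(f"{max_hit_rate(900_000, 1_000_000):.3f}")          # 0.100 API with unique query strings
```

The middle line is the case an intermediate cache tier addresses: the same 1000 objects cost 100,000 origin fetches when each of 100 PoPs has to warm up independently.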

Real-World Case Study: Cloudflare Outage 2022

On June 21, 2022, Cloudflare suffered an outage affecting 19 of its data centers. Those 19 locations carry a large share of Cloudflare’s global traffic, so the impact was felt worldwide. According to Cloudflare’s post-incident report, the root cause was a network configuration change, made as part of a longer-running project to increase resilience in those locations, that altered how BGP prefixes were advertised. Critical prefixes were withdrawn and the affected data centers became unreachable.

The impact: many sites and services behind Cloudflare failed to load or returned errors. The outage lasted roughly an hour and a quarter before all locations were back online.

The lesson for CDN-dependent infrastructure: a CDN is a single point of failure, even when it promises 100% uptime. The correct mitigations:

  1. Multi-CDN strategy: Route different percentages of traffic to different CDNs. If one fails, you fail over to the other. This adds complexity but eliminates single CDN dependency.

  2. Origin fallback: Configure your origin as a direct fallback. During a CDN outage, some users will get slower responses from origin, but the site stays up.

  3. Health checks and automatic failover: Use a load balancer or DNS-based failover that detects CDN unavailability and routes traffic elsewhere. Run those health checks, and your status page, on infrastructure that does not depend on the CDN, so that monitoring does not fail along with the thing it monitors.

The irony: the outage was caused by a change intended to make the network more resilient. This is a reminder that deploys to critical infrastructure deserve extra scrutiny, staged rollouts, and the ability to roll back immediately.
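
The first three mitigations combine naturally into one routing decision: weighted selection among healthy CDNs, with the origin as the last resort. The sketch below shows only that decision; the CDN names and weights are hypothetical, and in practice the logic lives in a DNS provider or load balancer that runs the health checks.

```python
import random


def pick_cdn(weights: dict, healthy: set, rng: random.Random) -> str:
    """Weighted choice among healthy CDNs; fall back to origin if none are healthy."""
    candidates = {name: w for name, w in weights.items() if name in healthy and w > 0}
    if not candidates:
        return "origin"
    names = list(candidates)
    return rng.choices(names, weights=[candidates[n] for n in names], k=1)[0]


weights = {"cdn_a": 0.7, "cdn_b": 0.3}  # hypothetical 70/30 traffic split
rng = random.Random(0)

print(pick_cdn(weights, {"cdn_b"}, rng))  # cdn_b: cdn_a is down, all traffic shifts
print(pick_cdn(weights, set(), rng))      # origin: both CDNs are down
```

Keeping some steady traffic on the secondary CDN, rather than using it only in emergencies, means its caches are warm and its configuration is exercised before you need it.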


Observability Checklist

Metrics to Track

CDN Provider Metrics:

  • cdn.bandwidth_bytes - Total bandwidth served
  • cdn.cache_hit_ratio - Percentage served from cache vs origin
  • cdn.requests_total - Total requests
  • cdn.latency_p95_p99 - Edge response times
  • cdn.origin_fetch_time - Time spent fetching from origin when cache miss

Application Metrics:

# Check cache headers on responses
curl -I https://example.com/assets/app.js

# Look for:
# X-Cache: HIT/MISS/EXPIRED
# CF-Cache-Status: HIT/MISS/EXPIRED/REVALIDATED
# Age: seconds since cached at edge

Logs to Capture

# Log CDN cache status for debugging
import structlog

logger = structlog.get_logger()

def log_cdn_response(response, url):
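    # Assumes a requests-style response. Keep its own headers mapping, which is
    # case-insensitive; converting to a plain dict would lose that.
    headers = response.headers

    cache_status = headers.get('X-Cache', headers.get('CF-Cache-Status', 'Unknown'))
    age = headers.get('Age', '0')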

    logger.info("cdn_response",
        url=url,
        cache_status=cache_status,
        age_seconds=age,
        content_type=headers.get('Content-Type'),
        content_length=headers.get('Content-Length'))

# Cloudflare Analytics API
# curl "https://api.cloudflare.com/client/v4/zones/$ZONE/analytics/dashboard" \
#   -H "Authorization: Bearer $TOKEN"

Alert Rules

# CDN-specific alerts
- alert: CDNCacheHitRateLow
  expr: sum(rate(cdn_cache_hits_total[15m])) / (sum(rate(cdn_cache_hits_total[15m])) + sum(rate(cdn_origin_fetches_total[15m]))) < 0.7
  for: 15m
  labels:
    severity: warning
  annotations:
    summary: "CDN cache hit rate below 70%"

- alert: CDNOriginLatencyHigh
  expr: histogram_quantile(0.95, sum by (le) (rate(cdn_origin_fetch_duration_seconds_bucket[5m]))) > 2
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "CDN origin fetch latency above 2 seconds"

- alert: CDNBandwidthAnomaly
  expr: rate(cdn_bandwidth_bytes_total[5m]) > 1.1 * avg_over_time(rate(cdn_bandwidth_bytes_total[1h])[7d:1h])
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "CDN bandwidth significantly above normal"

Security Checklist

  • Set appropriate Cache-Control headers - Never cache private/personal content
  • Use private for user-specific data - User profiles, auth tokens, payment info
  • Implement signed URLs - For premium content that shouldn’t be shared
  • Validate Host headers - Prevent host header injection attacks
  • Enable DDoS protection - Most CDNs provide this; ensure it’s configured
  • Monitor for cache poisoning - Unexpected content in cache
  • Use WAF rules - Block malicious requests at CDN edge
  • Certificate management - Automated renewals; prevent expiry
  • Rotate credentials on compromise - If API keys leak, revoke and reissue them immediately
  • Respect Vary headers - Cache different versions appropriately
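
// Security headers via CDN
// Cloudflare Workers example: headers on a fetched response are immutable,
// so copy the response before modifying it
const secured = new Response(response.body, response);
secured.headers.set('X-Content-Type-Options', 'nosniff');
secured.headers.set('X-Frame-Options', 'DENY');
// Legacy header that modern browsers ignore; rely on Content-Security-Policy instead
secured.headers.set('X-XSS-Protection', '1; mode=block');
secured.headers.set('Referrer-Policy', 'strict-origin-when-cross-origin');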

# Signed URL for premium content
# Cloudflare Stream: signed tokens for time-limited access
# https://example.com/video.mp4?exp=1699999999&token=abc123
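
The general shape of a signed URL is an expiry timestamp plus an HMAC over the path and that timestamp. The sketch below reuses the `exp` and `token` parameter names from the example URL above, but it is a generic illustration, not Cloudflare Stream's actual token format. It assumes the unsigned URL has no query string of its own.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def sign_url(url: str, secret: bytes, expires_at: int) -> str:
    """Append exp and an HMAC-SHA256 token covering the path and expiry."""
    payload = f"{urlsplit(url).path}:{expires_at}".encode()
    token = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return f"{url}?{urlencode({'exp': expires_at, 'token': token})}"


def verify_url(signed_url: str, secret: bytes, now: int) -> bool:
    """Accept only unexpired URLs whose token matches the path and expiry."""
    parts = urlsplit(signed_url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["exp"][0])
        token = query["token"][0]
    except (KeyError, ValueError):
        return False
    if now >= expires_at:
        return False
    payload = f"{parts.path}:{expires_at}".encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(token, expected)  # constant-time comparison


secret = b"demo-secret-do-not-use"  # placeholder; load from a secret store in practice
signed = sign_url("https://example.com/video.mp4", secret, expires_at=1699999999)
print(verify_url(signed, secret, now=1699990000))  # True: before expiry
print(verify_url(signed, secret, now=1700000000))  # False: expired
```

Because the expiry is part of the signed payload, a client cannot extend access by editing the `exp` parameter, and because the path is included, a token for one file does not unlock another.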

Common Pitfalls / Anti-Patterns

1. Caching Personalized Content

Don’t cache content that varies per user.

# BAD: Personalized page cached publicly
Cache-Control: public, max-age=3600
# User sees another user's data!

# GOOD: User-specific pages should not be cached
Cache-Control: private, no-store
# Or don't cache at CDN at all for auth-required pages

2. Query String Cache Key Duplication

CDNs treat /page?session=123 and /page?session=456 as different URLs.

# BAD: Unique query string on every build
# /app.js?v=1.0.0.12345  (changes every build)
# /app.js?v=1.0.0.12346  (new cache entry each time!)

# GOOD: Content-hashed filenames (cache forever)
# /app.a3f5d8.js  (hash in filename)
# Cache-Control: public, max-age=31536000, immutable
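
For URLs you cannot rename, such as pages that arrive with session or tracking parameters, the usual fix is to normalize the cache key. Most CDNs expose this as a configuration option; the sketch below shows the transformation itself. The ignore list is an assumption and must only contain parameters that do not change the response.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters assumed not to affect the response body
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "session"}


def normalize_for_cache(url: str) -> str:
    """Drop ignored parameters and sort the rest so equivalent URLs share a key."""
    parts = urlsplit(url)
    kept = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(normalize_for_cache("https://example.com/page?session=123"))
# https://example.com/page
print(normalize_for_cache("https://example.com/page?b=2&utm_source=x&a=1"))
# https://example.com/page?a=1&b=2
```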

3. Setting Long TTLs Without Invalidation Plan

Long cache + no purge = stale content.

# BAD: Long TTL, no invalidation strategy
Cache-Control: public, max-age=31536000
# Content updated but CDN serves old version for a year

# GOOD: Reasonable TTL (1 day) with purging
Cache-Control: public, max-age=86400
# On deploy: purge the old version via API

4. Over-Caching HTML

HTML changes often. Don’t cache it long unless you’re certain.

# BAD: Long TTL on HTML
Cache-Control: public, max-age=31536000
# Homepage updated, users see old version for a year

# GOOD: Short TTL with revalidation
Cache-Control: public, max-age=60, stale-while-revalidate=300
# Serve stale briefly while fetching update

5. Not Using Edge Computing Wisely

Edge functions add complexity; don’t overuse them.

// BAD: Heavy computation at edge
addEventListener("fetch", (event) => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  // This runs on EVERY request - expensive!
  const result = await doHeavyComputation(request);
  return new Response(result);
}

// GOOD: Simple routing, lightweight transforms
async function handleRequest(request) {
  const url = new URL(request.url);

  // Lightweight: Add headers, modify path
  // Heavy: Pass through to origin
  if (url.pathname.startsWith("/api/")) {
    return fetch(request); // Pass through
  }

  // Simple edge logic for static content
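  const response = await fetch(request);
  const headers = new Headers(response.headers);
  // request.cf.colo is the airport code of the PoP handling the request
  headers.set("X-Edge-Location", request.cf?.colo ?? "unknown");
  return new Response(response.body, { status: response.status, headers });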
}

6. Missing Origin Shield

Without origin shield, every cache miss hits your origin directly.

graph TD
    A[User] --> B[CDN Edge]
    B -->|miss| C[Origin Shield]
    C -->|miss| D[Origin Server]

    E[User] --> F[CDN Edge]
    F -->|miss| G[Origin Server]

    C --> H[Cache warm]
    D --> C

# Enable origin shield in Cloudflare (the feature is called Tiered Cache)
# Dashboard: Caching -> Tiered Cache
# Other CDNs expose a comparable origin shield setting per origin

# This reduces origin load by caching at an intermediate tier
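
A small simulation shows the effect. It models cold caches right after a deploy, with each PoP filling its own cache and, optionally, a single shared shield tier in front of the origin. The PoP and asset names are made up, and the model ignores TTLs, eviction, and concurrent request coalescing.

```python
def origin_fetches(requests: list, use_shield: bool) -> int:
    """Count origin fetches for a stream of (pop, url) requests with cold caches."""
    edge_cache = set()    # (pop, url) pairs cached at each edge
    shield_cache = set()  # urls cached at the shared shield tier
    fetches = 0
    for pop, url in requests:
        if (pop, url) in edge_cache:
            continue                # edge hit
        edge_cache.add((pop, url))  # edge miss: the edge is filled afterwards
        if use_shield:
            if url in shield_cache:
                continue            # shield hit: the origin never sees it
            shield_cache.add(url)
        fetches += 1                # request reaches the origin
    return fetches


# 50 PoPs each request the same 10 objects after a deploy
traffic = [(f"pop-{p}", f"/asset-{a}.js") for p in range(50) for a in range(10)]
print(origin_fetches(traffic, use_shield=False))  # 500
print(origin_fetches(traffic, use_shield=True))   # 10
```

Without the shield, origin load after a purge scales with the number of PoPs; with it, origin load scales with the number of unique objects.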

Quick Recap

Key Bullets

  • CDNs reduce latency by serving from geographically close edge locations
  • Cache-Control headers control what and how long CDN caches content
  • Static assets with content hashes should use immutable for maximum caching
  • HTML should have short TTL with stale-while-revalidate for smoothness
  • Always use private for user-specific, authenticated, or sensitive content
  • Cache invalidation (purge) is the hardest problem; design around it
  • Edge computing adds capability but adds complexity - use sparingly
  • Origin shield prevents cache stampede from overwhelming your origin

Copy/Paste Checklist

# CDN header checklist for static site
# HTML (short cache, revalidate)
Cache-Control: public, max-age=60, stale-while-revalidate=300

# Versioned static assets (immutable)
Cache-Control: public, max-age=31536000, immutable

# Images (medium cache)
Cache-Control: public, max-age=86400, stale-while-revalidate=3600

# API responses (don't cache or very short)
Cache-Control: private, no-store
# OR for public APIs
Cache-Control: public, max-age=60

# User-specific content (never CDN cache)
Cache-Control: private, no-store

# Cloudflare purge all
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE/purge_cache" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"purge_everything": true}'

# Cloudflare purge by tag
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE/purge_cache" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"tags":["product-123"]}'

# Deployment checklist:
# [ ] Set up cache headers before going live
# [ ] Test cache behavior with curl -I
# [ ] Implement versioning/hashing for static assets
# [ ] Plan cache purge strategy for deployments
# [ ] Enable origin shield to protect origin
# [ ] Set up monitoring for cache hit rate
# [ ] Configure alerts for origin fetch latency
# [ ] Review security headers (X-Frame-Options, CSP, etc.)

Conclusion

CDNs are essential infrastructure for any public website. They reduce latency, cut costs, and add a layer of protection. But they require careful configuration. Wrong cache headers can leak private data or serve stale content.

Start with sensible defaults: short cache on HTML, long cache on static assets with content hashes. Add edge computing only when you have clear performance gains to show for it.

The CDN is not magic. It’s a cache with good network placement.
