System Design: URL Shortener from Scratch

Deep dive into URL shortener architecture. Learn hash function design, redirect logic, data storage, rate limiting, and high-availability.

published: March 22, 2026 reading time: 35 min read author: GeekWorkBench updated: June 17, 2026

Quick Summary

URL shorteners convert long URLs to compact codes using hash-based generation (URL plus salt run through MD5/SHA) or counter-based key generation (KGS), with base62 encoding giving 56-218 trillion possible codes. Redirect latency is the metric that matters, so Redis caching with write-through on create and differentiated TTLs for hot versus cold links is essential, falling back to PostgreSQL with read replicas on cache misses. Security layers include rate limiting per IP and user tier, malicious URL scanning, and reserved path blocking, while click analytics flow through Kafka asynchronously to keep redirects fast.

System Design: URL Shortener from Scratch

URL shorteners are deceptively simple systems. The core functionality just converts long URLs to short ones and redirects users when they visit the short URL. But building one that handles millions of users with low latency and high availability reveals interesting challenges.

This case study walks through designing a URL shortener like bit.ly or TinyURL.

Introduction

Functional Requirements

Users need to:

Shorten a long URL into a compact link
Access the short URL and get redirected to the original
Optionally set expiration dates
Optionally customize the short code
View statistics on link usage

Non-Functional Requirements

The system must be:

Fast: Redirects under 100ms
Available: Handle service disruptions gracefully
Scalable: Millions of links, billions of redirects
Durable: No lost links

Capacity Estimation

Daily active users: 100 million URL creations Redirect ratio: 100:1 (one creation, one hundred redirects)

Storage needed over 5 years:

100M links/day 365 days 5 years = 182.5 billion links
At 500 bytes per link: ~91 TB

Redirect QPS: 100M * 100 / 86400 = ~115,000 QPS

Core Components

Short Code Generation

The short code is the heart of the system. It needs to be unique, random enough to be unpredictable, short (6-8 characters typical), and URL-safe (alphanumeric). Getting any of these properties wrong has concrete consequences: collisions break redirects, predictability enables enumeration attacks, length wastes characters, special characters require encoding and defeat the purpose of shortening.

What makes short code generation tricky is that the obvious approaches each have a flaw. A database auto-increment ID is unique and sequential, but it exposes your total link count and lets attackers guess other codes by incrementing. A simple hash of the URL is deterministic, so the same URL always produces the same code — which lets someone find every link you’ve ever created by hashing common URLs. A random code generator has to handle the collision case where it generates something already taken.

The three practical approaches each solve different parts of the problem.

The first approach hashes the long URL with a salt. Take the URL, append a secret salt, run it through MD5 or SHA-1, take the first 8 characters, encode in base62. This gives you a fixed-length code from any URL. The catch: same URL plus same salt always produces the same code, so someone who knows your salt can precompute hashes for common URLs and match them against your shortened links.

The second approach adds randomness to break determinism. Hash the URL plus a timestamp plus a random number. Now the same URL can produce different codes on different calls, which stops enumeration but introduces collision risk — you might generate a code that’s already in use. That means a database check on every insert, and that check adds latency to the hot path.

The third approach sidesteps randomness entirely. A Key Generation Service (KGS) holds a counter and hands out pre-generated codes. Each code gets claimed in the database before it goes to the user, so collisions cannot happen. There is no collision check on the critical path, codes are uniformly distributed, and the KGS can pre-warm its available pool for fast response. The tradeoff is that KGS becomes a critical dependency. If it goes down, you cannot create new links. You need a warm standby or a hash-based fallback.

The choice comes down to how much operational complexity you can handle versus how much collision risk you can tolerate. KGS is the right call for anything serving real production traffic. Hash-based with salt works fine for internal tools or low-volume use cases where occasional collision checks are acceptable.

Hash Function Approaches

Three approaches generate short codes:

Approach 1: MD5/SHA hash of long URL + salt

import hashlib
import base62

def generate_short_code(url: str, salt: str = "mysalt") -> str:
    hash_input = f"{url}:{salt}"
    md5_hash = hashlib.md5(hash_input.encode()).hexdigest()
    # Take first 8 characters and encode in base62
    return base62.encode(int(md5_hash[:8], 16))[:8]

# Example
short = generate_short_code("https://example.com/very/long/url/path")
# Result: "xV2bP9qK"

Problem: Same URL always produces same hash, enabling URL enumeration attacks.

Approach 2: Hash + counter for uniqueness

import hashlib
import base62
import time

def generate_short_code(url: str, salt: str = "mysalt") -> str:
    # Combine URL with timestamp and random to ensure uniqueness
    combined = f"{url}:{time.time_ns()}:{random.randint(0, 999999)}"
    md5_hash = hashlib.md5(combined.encode()).hexdigest()
    return base62.encode(int(md5_hash[:8], 16))[:8]

Approach 3: Counter-based (KGS approach)

Use a Key Generation Service that pre-generates short codes:

class KeyGenerationService:
    def __init__(self, batch_size=1000):
        self.batch_size = batch_size
        self.available_keys = []

    def get_next_key(self) -> str:
        if not self.available_keys:
            self._refill_batch()
        return self.available_keys.pop()

    def _refill_batch(self):
        # Generate batch from counter
        start = self._get_current_counter()
        for i in range(start, start + self.batch_size):
            self.available_keys.append(base62.encode(i))
        self._increment_counter(start + self.batch_size)

The counter approach guarantees uniqueness and allows easy key management.

Base62 Encoding

Base62 uses characters 0-9, A-Z, a-z giving 62 characters per position:

Length	Possible Combinations	Equivalent URLs
6	62^6 = 56.8 billion	Enough for all links
7	62^7 = 3.5 trillion	Generous headroom
8	62^8 = 218 trillion	Future-proof

Data Model

Relational Schema

CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(12) NOT NULL UNIQUE,
    original_url TEXT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE,
    is_custom BOOLEAN DEFAULT FALSE,
    creator_id BIGINT,
    click_count BIGINT DEFAULT 0,
    is_active BOOLEAN DEFAULT TRUE,

    CONSTRAINT urls_short_code_idx UNIQUE (short_code)
);

CREATE INDEX idx_urls_short_code ON urls(short_code);
CREATE INDEX idx_urls_creator ON urls(creator_id);
CREATE INDEX idx_urls_expires ON urls(expires_at) WHERE expires_at IS NOT NULL;

NoSQL Alternative (DynamoDB)

{
  "TableName": "urls",
  "KeySchema": [{ "AttributeName": "short_code", "KeyType": "HASH" }],
  "AttributeDefinitions": [
    { "AttributeName": "short_code", "AttributeType": "S" },
    { "AttributeName": "creator_id", "AttributeType": "N" }
  ],
  "GlobalSecondaryIndexes": [
    {
      "IndexName": "creator-index",
      "KeySchema": [{ "AttributeName": "creator_id", "KeyType": "HASH" }],
      "Projection": { "ProjectionType": "ALL" },
      "ProvisionedThroughput": {
        "ReadCapacityUnits": 100,
        "WriteCapacityUnits": 50
      }
    }
  ],
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 1000,
    "WriteCapacityUnits": 500
  }
}

Caching Strategy

Redirect latency is critical. Cache aggressively.

Cache Structure

# Redis cache key pattern
cache_key = f"url:{short_code}"

# Cache value
cache_value = {
    "original_url": "https://example.com/very/long/path",
    "expires_at": "2027-01-01T00:00:00Z",
    "is_active": True
}

Cache TTL Strategy

CACHE_TTL = {
    "frequently_accessed": 3600,    # 1 hour for popular links
    "recently_created": 300,        # 5 minutes for new links
    "custom": 86400,                # 24 hours for custom links
    "expired": 60                   # 1 minute for recently expired
}

Write-Through Cache

async def create_short_url(url: str, custom_code: str = None) -> str:
    short_code = custom_code or generate_short_code(url)

    # Write to database
    await db.urls.create({
        "short_code": short_code,
        "original_url": url,
        "is_custom": custom_code is not None
    })

    # Write to cache
    await cache.set(f"url:{short_code}", {
        "original_url": url,
        "expires_at": None,
        "is_active": True
    }, ttl=CACHE_TTL["recently_created"])

    return short_code

Cache Miss Handling

async def get_original_url(short_code: str) -> Optional[str]:
    # Check cache first
    cached = await cache.get(f"url:{short_code}")
    if cached:
        return cached["original_url"]

    # Cache miss - fetch from database
    url_record = await db.urls.get(short_code=short_code)

    if not url_record:
        return None

    # Populate cache
    await cache.set(f"url:{short_code}", {
        "original_url": url_record.original_url,
        "expires_at": url_record.expires_at,
        "is_active": url_record.is_active
    }, ttl=CACHE_TTL["recently_created"])

    return url_record.original_url

Redirect Logic

HTTP Redirect Types

Status	Use Case	Browser Behavior
301	Permanent move	Caches redirect
302	Temporary redirect	No cache
303	Post -> Get	Converts to GET
307	Temporary	Preserves method
308	Permanent	Preserves method

For URL shorteners, typically use 301 (permanent) for SEO or 302 (temporary) for analytics tracking.

Redirect Handler

from fastapi import FastAPI, HTTPException, status
from fastapi.responses import RedirectResponse

app = FastAPI()

@app.get("/{short_code}")
async def redirect_to_original(short_code: str):
    # Check for special paths
    if short_code in ["health", "metrics", "docs"]:
        raise HTTPException(status_code=404)

    # Validate short code format
    if not is_valid_short_code(short_code):
        raise HTTPException(status_code=400, detail="Invalid short code")

    # Get original URL
    original_url = await get_original_url(short_code)

    if not original_url:
        raise HTTPException(status_code=404, detail="URL not found")

    # Track click asynchronously
    asyncio.create_task(track_click(short_code))

    return RedirectResponse(
        url=original_url,
        status_code=status.HTTP_302_FOUND
    )

Rate Limiting

Prevent abuse with rate limits per IP:

from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.post("/shorten")
@limiter.limit("10/minute")
async def create_short_url(request: Request, url: str = Body(...)):
    # Validate URL
    if not is_valid_url(url):
        raise HTTPException(status_code=400, detail="Invalid URL")

    # Check if already shortened by this user
    existing = await find_existing_mapping(url, request.client.host)
    if existing:
        return {"short_url": f"https://short.ly/{existing.short_code}"}

    short_code = await create_mapping(url)
    return {"short_url": f"https://short.ly/{short_code}"}

High Availability Design

Database High Availability

-- PostgreSQL synchronous replication
ALTER SYSTEM SET synchronous_commit = on;
ALTER SYSTEM SET synchronous_standby_names = '*';

-- Read replicas for redirects
CREATE PUBLICATION url_shares FOR TABLE urls;

-- On replica
CREATE SUBSCRIPTION url_sub CONNECTION 'host=primary port=5432 dbname=urlshort' PUBLICATION url_shares;

Multiple Redis Instances

from rediscluster import RedisCluster

# Redis Cluster configuration
rc = RedisCluster(
    startup_nodes=[
        {"host": "redis-1", "port": 6379},
        {"host": "redis-2", "port": 6379},
        {"host": "redis-3", "port": 6379}
    ],
    decode_responses=True
)

async def get_original_url(short_code: str) -> Optional[str]:
    # Consistent hashing handles failover automatically
    cached = await rc.get(f"url:{short_code}")
    return json.loads(cached)["original_url"] if cached else None

Geographic Distribution

Deploy redirector clusters in multiple regions:

# Route 53 latency routing
- Name: short.ly
  Type: A
  SetIdentifier: us-east-1
  Region: us-east-1
  AliasTarget:
    DNSName: dualstack.api-elb-us-east-1.amazonaws.com
    EvaluateTargetHealth: true

# EU redirector
- Name: short.ly
  Type: A
  SetIdentifier: eu-west-1
  Region: eu-west-1
  AliasTarget:
    DNSName: dualstack.api-elb-eu-west-1.amazonaws.com
    EvaluateTargetHealth: true

Users are routed to the nearest cluster based on latency.

Analytics Pipeline

Track clicks without slowing redirects:

async def track_click(short_code: str):
    # Fire and forget - don't await
    asyncio.ensure_future(
        kafka.send("clicks", {
            "short_code": short_code,
            "timestamp": datetime.utcnow().isoformat(),
            "user_agent": request.headers.get("user-agent"),
            "referer": request.headers.get("referer"),
            "ip_hash": hash_ip(request.client.host)
        })
    )

Click Analytics Consumer

async def process_clicks():
    consumer = KafkaConsumer("clicks", bootstrap_servers=["kafka:9092"])

    for message in consumer:
        event = json.loads(message.value)

        # Update click count in background
        await db.query("""
            UPDATE urls
            SET click_count = click_count + 1
            WHERE short_code = $1
        """, event["short_code"])

        # Update analytics warehouse
        await warehouse.insert("click_events", event)

Complete API Specification

Endpoints

Method	Endpoint	Description
POST	/api/v1/shorten	Create short URL
GET	/{short_code}	Redirect to original
GET	/api/v1/links/{short_code}	Get link info
GET	/api/v1/links/{short_code}/stats	Get click statistics
DELETE	/api/v1/links/{short_code}	Delete a link
PUT	/api/v1/links/{short_code}	Update link settings

Request/Response Examples

# Create short URL
curl -X POST https://short.ly/api/v1/shorten \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/very/long/path/that/needs/shortening"}'

# Response
{
  "short_code": "xV2bP9qK",
  "short_url": "https://short.ly/xV2bP9qK",
  "original_url": "https://example.com/very/long/path/that/needs/shortening",
  "created_at": "2026-03-22T10:30:00Z",
  "expires_at": null
}

Abuse Prevention and Security

Malicious URL Detection

URL shorteners are frequently abused for phishing, malware distribution, and spam. Implement safeguards:

class MaliciousURLDetector:
    """Detect potentially malicious URLs before shortening"""

    def __init__(self, threat_intel_client: ThreatIntelClient):
        self.threat_intel = threat_intel_client
        self.suspicious_tlds = {
            '.tk', '.ml', '.ga', '.cf', '.gq',  # Free tier often abused
            '.xyz', '.top', '.club'  # Often used in spam
        }

    async def check_url(self, url: str) -> ThreatAssessment:
        checks = await asyncio.gather(
            self._check_domain_reputation(url),
            self._check_url_pattern(url),
            self._check_content_scan(url),
            self._check_google_safe_browsing(url)
        )

        if any(check.threat for check in checks):
            return ThreatAssessment(
                threat=True,
                reason="URL flagged by security checks",
                severity="high"
            )

        return ThreatAssessment(threat=False)

    async def _check_url_pattern(self, url: str) -> CheckResult:
        parsed = urlparse(url)

        # Check for suspicious TLDs
        if any(parsed.netloc.endswith(tld) for tld in self.suspicious_tlds):
            return CheckResult(threat=True, reason="Suspicious TLD")

        # Check for IP address instead of domain
        if self._is_ip_address(parsed.netloc):
            return CheckResult(threat=True, reason="IP address used")

        # Check for excessive subdomains
        if parsed.netloc.count('.') > 4:
            return CheckResult(threat=True, reason="Excessive subdomains")

        return CheckResult(threat=False)

Rate Limiting Tiers

RATE_LIMITS = {
    "anonymous": {"shorten": "5/hour", "redirect": "100/hour"},
    "authenticated_free": {"shorten": "100/hour", "redirect": "1000/hour"},
    "authenticated_pro": {"shorten": "10000/hour", "redirect": "100000/hour"},
}

@app.middleware
async def rate_limit_middleware(request: Request, call_next):
    user_tier = get_user_tier(request)

    if user_tier == "anonymous":
        # Rate limit by IP
        client_ip = request.client.host
        if not await rate_limiter.check_limit(f"ip:{client_ip}", RATE_LIMITS["anonymous"]["shorten"]):
            raise HTTPException(status_code=429, detail="Rate limit exceeded")

    elif user_tier == "authenticated":
        user_id = get_user_id(request)
        if not await rate_limiter.check_limit(f"user:{user_id}", RATE_LIMITS[user_tier]["shorten"]):
            raise HTTPException(status_code=429, detail="Rate limit exceeded")

    return await call_next(request)

Spam Link Prevention

async def create_short_url(url: str, user_id: int = None) -> ShortUrl:
    # Validate URL format
    if not is_valid_url(url):
        raise HTTPException(status_code=400, detail="Invalid URL format")

    # Check for known spam domains
    if await spam_database.is_spam_domain(extract_domain(url)):
        raise HTTPException(status_code=403, detail="URL blocked")

    # Require authentication for custom codes
    if custom_code and not user_id:
        raise HTTPException(status_code=401, detail="Authentication required for custom codes")

    # Create short URL
    return await url_service.create(url, custom_code, user_id)

Production Failure Scenarios

Failure Scenario	Impact	Mitigation
Redis cache failure	All redirects hit DB, high latency	Fallback to direct DB reads; circuit breaker on cache
KGS (key gen) failure	Cannot create new short URLs	Use hash-based codes as fallback; KGS recovery priority
Database primary failure	Cannot create or redirect	Promote read replica; use eventual consistency for analytics
DNS resolution failure	short.ly domain unreachable	Multi-cloud DNS; anycast IP; aggressive caching
CDN failure for stats page	Stats load slowly	Static asset caching; local caching

Cache Failure Handling

async def get_original_url(short_code: str) -> Optional[str]:
    try:
        # Try cache first
        cached = await redis.get(f"url:{short_code}")
        if cached:
            return json.loads(cached)["original_url"]
    except RedisConnectionError:
        # Cache unavailable - fall through to DB
        pass

    # Fallback to database
    url_record = await db.urls.get(short_code=short_code)
    if not url_record:
        return None

    # Don't try to repopulate cache if Redis is down
    return url_record.original_url

Real-world Failure Scenarios

Scenario 1: Bitly Outage (2010)

What happened: Bitly suffered a major outage affecting all URL shortening services for several hours. The incident exposed critical infrastructure weaknesses in their failover mechanisms.

Root cause: Inadequate redundancy in DNS configuration combined with a cascading failure when the primary database cluster became unavailable. No automatic failover was configured.

Impact: All shortened links returned errors, affecting millions of redirected URLs across social media and marketing campaigns. Service was completely unavailable for ~4 hours.

Lesson learned: Design for failure at every layer. Implement multi-region failover, regular chaos testing, and ensure DNS failover is automatic and tested.

Scenario 2: AWS S3 Availability Zone Failure

What happened: A major cloud provider experienced a partial Availability Zone failure that knocked out access to S3 bucket data in one AZ. Services relying on single-AZ S3 configurations went dark.

Root cause: Some URL shortening services configured S3 as the primary storage backend without cross-AZ replication. When the AZ failed, all read/write operations failed.

Impact: Services storing shortened link metadata and redirect targets in a single AZ experienced complete data unavailability, even though other AZs remained healthy.

Lesson learned: Always use S3 cross-region replication for critical data. Design storage backends to tolerate AZ failures. Test failure scenarios regularly.

Scenario 3: Database Connection Pool Exhaustion

What happened: A popular URL shortener experienced a spike in traffic during a major sporting event, causing database connection pool exhaustion and complete service degradation.

Root cause: The connection pool size was configured based on normal traffic patterns. During the traffic spike, all available connections were consumed by read operations, blocking write operations needed for creating new shortened URLs.

Impact: Users could not create new shortened links even though existing redirects continued to work. The service appeared functional but was effectively in a degraded state for several hours.

Lesson learned: Implement connection pool monitoring, use read replicas to offload read traffic, and configure automatic scaling for database connection pools based on real-time demand metrics.

Common Pitfalls / Anti-Patterns

Pitfall 1: Using Sequential IDs as Short Codes

Problem: Sequential IDs (1, 2, 3…) allow URL enumeration - attackers can guess other short URLs.

The enumeration risk is concrete. If your short codes are sequential integers encoded in base62, the sequence is obvious: 1, 2, 3. An attacker who creates one short URL and receives code “B” can trivially guess that codes “C”, “D”, “E” and so on exist. Scanning through those codes costs nothing — the attacker just sends GET requests until one returns a 302 instead of 404. At 1,000 guesses per second, an attacker enumerates your entire URL database in days.

Beyond enumeration, sequential IDs expose your business metrics. The ID tells an attacker exactly how many URLs you have created and when. A 10x growth in your URL count between January and March is competitive intelligence that should not be publicly accessible. Sequential IDs also make it trivial to find the oldest or newest links in your system, which has privacy implications for the users who created them.

Solution: Use cryptographically random codes (base62) with minimum 6 characters. Use KGS for guaranteed uniqueness without predictability.

The fix is randomness at sufficient entropy. Six base62 characters give you 56.8 billion possible codes — far too many to guess. Each code is generated independently, so knowing one says nothing about the next. A Key Generation Service (KGS) takes this further by pre-generating codes and claiming them atomically in the database. By the time a code reaches a user, it is already marked as used. An attacker who guesses a valid code finds it already redirects somewhere — not a useful signal for enumeration.

Pitfall 2: Not Handling URL Expiration

Problem: Expired URLs still redirect until cache expires.

The cache is where this problem hides. When a URL expires in your database, you mark it inactive or set expires_at to a past timestamp. But if that URL was recently accessed, it sits in Redis with a TTL of several minutes or more. During that window, any user who clicks the short link gets a 302 to the original URL — even though that original URL may now be a phishing site, a deleted resource, or simply a campaign that ended. The expiration exists in your database but not in your cache.

The second-order problem is that expired URLs can poison caches across CDN edges. If you cache aggressively at the CDN layer, an expired redirect can propagate to edge nodes before your invalidation reaches them. Users in different geographic regions may get inconsistent behavior — some hit a CDN node with the cached redirect, others hit your origin and get a 404.

Solution: Check expiration on every redirect. Set aggressive cache TTL for URLs with near-term expiration.

The fix is to treat expires_at as part of the cache key, not just a database column. When storing a URL in cache, include the expiration timestamp in the cached object. On every redirect, check that expires_at is null or in the future before returning a redirect. For URLs expiring within 24 hours, use a cache TTL of 60 seconds or less — short enough that the stale redirect window is negligible. Background jobs handle the actual database cleanup; the redirect handler checks expiration on every request regardless of cache state.

Pitfall 3: Storing Only Short Code in Cache

Problem: Cache miss requires DB query for every redirect.

A cache miss on a redirect is expensive. The cache returns nothing, so you query the database with the short code. That database lookup takes 1-5ms on a fast PostgreSQL instance with proper indexes, versus sub-millisecond latency from Redis. If your cache hit ratio is 90%, this is tolerable — 10% of redirects are slow. If it drops to 70%, 30% of your redirect traffic is hitting the database on every request. At 115,000 QPS, 30% is 34,500 database queries per second. Each one adds 1-5ms of latency to those requests.

The other issue is what you cache. If you cache only the short code-to-URL mapping, you save a database lookup. But if you need to check expiration, log the click, or update analytics, you still need a second round-trip to fetch that metadata. Caching only the short code trades one database query for one cache hit — it does not eliminate the metadata fetch.

Solution: Cache generously. Use “recently accessed” eviction. Pre-populate cache for trending links.

Cache the full URL object, not just the redirect target. Store the original URL, expiration timestamp, active flag, and click count together. On a cache hit you have everything you need for the redirect and any metadata checks in a single read. On a cache miss you fetch the full object from the database and populate the cache.

Hot links deserve longer TTLs. The top 1% of URLs by traffic account for 99% of redirect volume. Pre-populating cache for those links — either on a schedule or as part of a background job that promotes frequently-accessed URLs — keeps the hot path in memory. For everything else, a TTL of 5-10 minutes is enough to catch most repeated access patterns without holding stale data.

Observability Checklist

Metrics to Capture

url_redirects_total (counter) - By short_code prefix, status code
url_shortens_total (counter) - By user_tier, custom vs auto
redirect_latency_seconds (histogram) - P50, P95, P99
cache_hit_ratio (gauge) - Cache efficiency
malicious_url_attempts_total (counter) - Blocked attempts by type
kgs_available_keys (gauge) - Key generation health

Logs to Emit

{
  "timestamp": "2026-03-22T10:30:00Z",
  "event": "redirect",
  "short_code": "xV2bP9qK",
  "status": 302,
  "latency_ms": 12,
  "cache_hit": true,
  "user_ip_hash": "abc123"
}

Alerts to Configure

Alert	Threshold	Severity
Redirect latency P99 > 200ms	200ms	Warning
Cache hit ratio < 50%	50%	Warning
KGS keys < 1000	1000	Critical
Malicious attempts spike	> 100/min	Warning
DB connection pool > 80%	80%	Warning

Security Checklist

TLS 1.3 for all connections
URL validation and sanitization
Malicious URL scanning (Google Safe Browsing API)
Rate limiting per IP and per user
Custom code length and character validation
Authentication required for custom short codes
Audit logging of all URL creations
GDPR compliance for analytics data
Regular security audits

Trade-off Analysis

Design Decision	Option A	Option B	Recommendation
Short Code Generation	Hash-based (MD5/SHA + salt)	Counter-based KGS	KGS for production; hash-based acceptable for simple cases
Code Length	6 chars (56.8B combos)	7-8 chars (3.5T - 218T)	7-8 chars to avoid birthday paradox collisions
Redirect Status	301 Permanent	302 Temporary	302 to avoid SEO consolidation issues
Cache Strategy	Cache URL only	Cache full URL object with metadata	Cache full object to avoid second round-trip
Custom Codes	Require authentication	Allow anonymous	Require authentication + premium pricing
URL Deduplication	Per-user dedup	Allow duplicates	Per-user dedup for efficiency
Analytics Tracking	Synchronous write	Async fire-and-forget	Always async to keep redirect fast
Database	PostgreSQL	DynamoDB/Cassandra	PostgreSQL for <1B URLs; NoSQL for billions
KGS Recovery	Return orphaned codes after timeout	Fall back to hash-based codes	Both strategies needed

When to Choose Each Option

Hash vs KGS for Code Generation:

Use hash-based when: simplicity is priority, collision handling is acceptable, moderate scale
Use KGS when: predictability required, high availability critical, cannot tolerate collision checking latency

PostgreSQL vs NoSQL:

Use PostgreSQL when: need ACID transactions, moderate scale, team expertise available
Use DynamoDB/Cassandra when: billions of rows, global distribution, multi-region writes required

Interview Questions

1. How does a URL shortener handle the hash collision problem?

The cleanest solution is a Key Generation Service (KGS) — a separate service that pre-generates short codes and hands them out on demand. Since the codes come from a known set, collisions literally cannot happen. The tradeoff is that KGS becomes a critical dependency; if it goes down, you cannot create new URLs.

Other approaches exist. You could salt the URL and hash it, then check whether that hash has been used before storing. Or you could use a counter — each new URL gets the next number in sequence, encoded in base62. The counter approach is simple but predictable, which is its own problem.

2. Why does base62 encoding (not base64) work better for URL shorteners?

Base64 uses characters like '+', '/', and '=' that have special meaning in URLs. A base64 character in a URL needs encoding, which defeats the purpose of a short URL — you'd end up with something like example.com/abc%2Bdef%3D instead of example.com/abcdef.

Base62 sticks to alphanumeric characters only (A-Z, a-z, 0-9). Every character is URL-safe without any encoding. Six characters gives you 56.8 billion combinations — more than enough for any practical use case.

3. How does a URL shortener handle the security risk of predictable short codes?

Predictable codes invite enumeration attacks — if I can guess xV2bP9, I can probably guess xV2bP8 and xV2bP7. From there, it's a short hop to scraping the entire URL database.

The fix is randomness. Six characters from base62 gives you 56.8 billion possible codes — too many to guess. The KGS approach generates codes with a cryptographically secure random function before they're needed. If you're hashing URLs instead, you need a secret salt and a strong hash; without the salt, attackers can precompute popular URL hashes.

Never use sequential IDs. They're predictable and they tell attackers exactly how many URLs you've created.

4. What caching strategy minimizes redirect latency in a URL shortener?

Redirect latency is the whole game here. A cache miss means a database lookup, which can be 10-100x slower.

The practical approach: cache the full URL object, not just the short code. On a cache miss you fetch the whole record; on a hit you return immediately. Write-through on create so new URLs are immediately available. Hot links (the top 1% of URLs by traffic) deserve longer TTLs — sometimes hours instead of minutes. Cold links can expire faster.

One thing teams often get wrong: caching only the redirect URL. If you cache the whole object including metadata, you avoid a second round-trip when you need to check expiration or click counts.

5. How does a URL shortener handle expired URLs?

Expiration is checked on every redirect — the handler verifies that expires_at is either null or in the future. If the link has expired, you return 404.

Cached entries need special handling. A link that expires in 10 minutes should not sit in cache for an hour. One approach: store the TTL alongside the cached object and check expiration at read time. Another: use shorter TTLs for URLs with near-term expiration dates.

Background jobs handle the actual cleanup. You don't need to delete expired entries immediately — they're harmless as long as they don't redirect — but you should clean them up eventually to avoid unbounded table growth.

6. What happens when the Redis cache fails completely?

Everything should keep working, just slower. With a circuit breaker in place, the system detects that Redis is unhealthy and stops attempting cache operations. Redirects fall through to the database directly.

This is why the database schema needs proper indexes — it's the fallback for every request when cache is unavailable. Without those indexes, you'd see timeouts under cache failure, which defeats the purpose of having a fallback at all.

Monitoring is critical here. When cache hit ratio drops suddenly, that's your signal that something is wrong before it becomes a full outage.

7. How does a URL shortener protect against malicious URLs (phishing, malware)?

You've got to check at creation time, not at redirect time. By the time a user clicks a malicious link, damage may already be done.

The layers that matter: pattern validation catches obvious abuse (IP addresses instead of domains, suspicious TLDs). Domain reputation databases catch known bad actors. For thoroughness, scan the URL content when possible — Google Safe Browsing API covers a lot of ground. Rate limiting per IP slows down anyone trying to mass-generate malicious links.

You also need a takedown mechanism for links reported after creation. If someone submits a legitimate URL that later becomes malicious, you need to be able to invalidate it quickly.

8. How does a URL shortener handle custom/vanity short codes?

Custom codes need authentication — otherwise people squat on popular terms. Require login to claim a custom code, validate it meets your format rules (minimum length, allowed characters, not on the reserved list), and charge for it if you're running a business. Premium pricing alone deters most abuse.

The reserved list matters. Things like /admin, /health, /metrics — these should never be available as custom codes regardless of who asks.

9. What database design works best for a URL shortener?

PostgreSQL handles most URL shortener workloads fine. The short_code gets a unique index for fast lookups — this is the only query path that matters for redirects. Everything else (creator_id, expires_at) can live on secondary indexes.

At extreme scale — billions of URLs — PostgreSQL starts showing cracks. Cassandra and DynamoDB both scale horizontally without the manual sharding overhead. The query pattern is simple (look up by short_code), which maps well to partitioned NoSQL designs. You lose some query flexibility but gain almost unbounded horizontal scalability.

10. How does analytics tracking work without slowing down redirects?

The redirect has to be fast. Anything that blocks the response — including analytics writes — adds latency.

The standard pattern: fire-and-forget. The redirect handler returns the 302 immediately and spawns an async task. That task sends the click event to a message queue (Kafka, SQS, whatever you prefer). A separate consumer picks up events in batches and updates the analytics store.

This decouples the hot path from the analytics path entirely. Your p99 redirect latency stays low even when analytics load is high.

11. How does a URL shortener handle the trade-off between code length and collision probability?

Short codes mean shorter URLs, which is the whole point. But each additional character multiplies the available keyspace exponentially.

Six base62 characters give you 56.8 billion combinations. At a million new URLs per day, you'd need 56,800 years to exhaust them — so six characters is plenty for practical purposes. Seven characters bumps you to 3.5 trillion, which is absurdly generous.

The collision risk from shorter codes isn't about exhaustion — it's about birthday paradox math. With 56.8 billion possible codes and 182.5 billion URLs over five years, collisions become statistically likely. This is why most production systems use seven or eight characters despite six being technically sufficient.

12. What are the trade-offs between using a hash-based approach versus a counter-based KGS for short code generation?

Hash-based generation is simple — hash the URL, encode, done. No coordination needed. The problem is predictability and collision handling. Without a salt secret, attackers can precompute hashes for common URLs. With a salt, you still need to handle collisions by appending random data, which complicates the flow.

Counter-based KGS eliminates collisions entirely — codes come from a pre-generated list. It also gives you predictable, uniform codes. The cost is operational complexity: KGS becomes a critical service. If it goes down, you can't create new URLs. You also need to handle KGS recovery after failures — the batch allocation logic must be idempotent.

For high-availability requirements, KGS with a warm standby is worth the complexity. For simpler use cases, hash-based with collision handling is easier to operate.

13. How does a URL shortener handle the case where a user tries to shorten a URL that already exists in the system?

You have two philosophies here. The first: deduplicate and return the existing short URL. If the same user submits the same URL twice, give them the same short code. This is efficient — fewer entries in your database — but it breaks the assumption that each short code belongs to one URL. It also means one user can "claim" a URL that another user wanted.

The second approach: treat each submission as unique, even for identical URLs. This preserves the one-to-one mapping and is simpler to reason about, but wastes keyspace on duplicates.

Most production systems do a hybrid: check if the submitting user already has a short code for that URL (deduplicate per user), but allow different users to create different short codes for the same long URL. The cache and database handling stays simple.

14. What database scaling challenges arise as a URL shortener grows from millions to billions of entries?

PostgreSQL handles millions of rows without complaint if you have the right indexes. The short_code unique index performs well even with hundreds of millions of entries. The problems start when you need to scale writes or distribute geographically.

Read replicas help with redirect read throughput — you can route reads to replicas and writes to primary. But there's a catch: replication lag means occasionally a brand-new URL might not be on a replica yet, so a redirect could 404 even though the URL was just created. Acceptable for most use cases, but something to monitor.

At billions of rows, table partitioning becomes necessary. You can partition by creation date, then route redirects to the right partition based on short code prefix. DynamoDB and Cassandra handle this scale natively with automatic partitioning, but you give up PostgreSQL's query flexibility.

15. How does geographic distribution affect URL shortener consistency guarantees?

Users in different regions expect instant redirects. If your primary database is in us-east-1, a user in Tokyo sees 100ms+ latency just for the database lookup before the redirect even starts.

The usual approach: regional Redis caches that replicate asynchronously. Writes go to the primary region and propagate to regional caches. The trade-off is eventual consistency — a URL created in us-east-1 might not be redirectable from eu-west for a few seconds.

For most URL shorteners, eventual consistency is fine. Users rarely create and immediately test from opposite sides of the world. The p99 latency improvement from regional caching outweighs the brief inconsistency window.

16. What security considerations exist around custom short codes that most designs overlook?

Custom codes bypass the random generation process, which means they can collide with system paths. If someone registers "/health" as a custom code, your redirect handler gets confused — that path is supposed to be your health check endpoint.

The reserved list must be enforced at creation time, not just checked at redirect time. Block /health, /metrics, /docs, /api/*, and any other internal paths. Also block codes that look like valid base62 but decode to ASCII — "admin" encoded in base62 is a real short code someone could claim.

Character validation matters too. A custom code containing SQL injection characters or path traversal sequences ("../../../etc/passwd") needs sanitization even if it passes your base62 filter. Validate strictly: alphanumeric only, minimum 4 characters, maximum 12.

17. How does a URL shortener handle the case where the original URL becomes unavailable or returns an error?

The shortener's job is redirect, not reliability. You redirect to whatever URL the user provided — you're not responsible for that URL being up. But there are legitimate cases where you'd want to detect and handle dead links.

Some systems do proactive validation: when a URL is shortened, scan it to verify it responds without error. This adds creation latency and still doesn't guarantee the URL stays up. Not worth the complexity for most use cases.

For premium tiers, periodic revalidation makes sense. Check every hour whether premium short URLs still respond. If a link goes dead, alert the owner and optionally auto-expire the short code. This is a business feature, not a technical requirement.

18. What happens when KGS (Key Generation Service) fails mid-batch and how do you recover?

KGS allocates codes in batches. Say it allocates codes 1000-2000 to instance A, then crashes. Those codes are "reserved" but not yet assigned to any URL. If you don't handle this, codes 1000-2000 are effectively lost — they exist in the KGS state but no one can claim them.

The standard recovery approach: KGS writes its batch allocation to durable storage (database or distributed lock) before handing out codes. If it crashes, on restart it reads the last known allocation and recovers. Orphaned codes get returned to the available pool after a timeout.

During recovery, new URL creation can fall back to hash-based codes. This keeps the service running, just with less predictable codes. Monitor how often you're falling back — frequent KGS failures indicate it needs more redundancy.

19. How does the choice between 301 and 302 redirects impact SEO and analytics for a URL shortener?

Search engines interpret 301 (permanent) as a signal that the short URL is the "canonical" version. They consolidate page rank from the long URL to the short one. This is good if you want the short URL to rank — but then you're stuck with it. If you ever need to change the mapping, you've lost SEO value.

302 (temporary) tells search engines this is a temporary redirect. They keep both URLs in the index and don't transfer page rank. Better for tracking — you can change the destination without losing the SEO identity of the short URL.

Most URL shorteners use 302 specifically to avoid SEO complications. You don't want your service being treated as the authoritative source for content that lives elsewhere. The trade-off is that search engines don't consolidate ranking signals, which is actually what you want.

20. What capacity planning formulas should every URL shortener designer know?

Redirect QPS estimation: (daily_creates * redirect_ratio) / seconds_per_day. With 100M creates/day and 100:1 redirect ratio, that's 115,000 QPS. Your system needs to handle spikes — 2x average at minimum during viral moments.

Storage estimation: creates_per_day * bytes_per_url * days_retained. At 500 bytes per URL and 5-year retention, 100M creates/day needs 91TB. Account for indexes and overhead — real storage needs are 2-3x the raw calculation.

Cache sizing: hot URL distribution follows zipf's law. The top 1% of URLs get 99% of redirects. Size your cache to hold at least the top 10,000 URLs with their metadata. Cache misses for the long tail are acceptable — those URLs barely get accessed.

Conclusion

A URL shortener looks simple on paper. The complexity shows up at scale. The key decisions:

Short code generation: Counter-based with KGS ensures uniqueness and predictability
Caching: Aggressive caching with Redis handles redirect traffic
Database: PostgreSQL with read replicas for availability
Analytics: Asynchronous click tracking via Kafka

Quick Recap Checklist

Before deploying your URL shortener to production, verify these essentials:

Core Functionality

Short code generation using KGS or hash-with-salt
Base62 encoding with 6-8 character output
URL validation and sanitization
Custom code reservation for system paths
Expiration handling on every redirect

Performance

Redis caching with write-through on create
Cache TTL differentiated by URL type (hot/cold/custom)
Database indexes on short_code for fast lookups
Read replicas for redirect traffic
Async analytics (fire-and-forget click tracking)

Availability

Circuit breaker for cache failures
Database fallback when cache unavailable
Multi-region deployment with latency-based routing
KGS warm standby for high availability
Connection pool monitoring and auto-scaling

Security

Rate limiting per IP and per user tier
Malicious URL scanning (Safe Browsing API)
Authentication required for custom codes
Reserved path blocking at creation time
TLS 1.3 for all connections
Audit logging for URL creations

Observability

Redirect latency histograms (P50, P95, P99)
Cache hit ratio monitoring
KGS available keys gauge
Alert thresholds configured for all critical metrics

System Design: URL Shortener from Scratch

Introduction

Functional Requirements

Non-Functional Requirements

Capacity Estimation

Core Components

Short Code Generation

Hash Function Approaches

Approach 1: MD5/SHA hash of long URL + salt

Approach 2: Hash + counter for uniqueness

Approach 3: Counter-based (KGS approach)

Base62 Encoding

Data Model

Relational Schema

NoSQL Alternative (DynamoDB)

Caching Strategy

Cache Structure

Cache TTL Strategy

Write-Through Cache

Cache Miss Handling

Redirect Logic

HTTP Redirect Types

Redirect Handler

Rate Limiting

High Availability Design

Database High Availability

Multiple Redis Instances

Geographic Distribution

Analytics Pipeline

Click Analytics Consumer

Complete API Specification

Endpoints

Request/Response Examples

Abuse Prevention and Security

Malicious URL Detection

Rate Limiting Tiers

Spam Link Prevention

Production Failure Scenarios

Cache Failure Handling

Real-world Failure Scenarios

Scenario 1: Bitly Outage (2010)

Scenario 2: AWS S3 Availability Zone Failure

Scenario 3: Database Connection Pool Exhaustion

Common Pitfalls / Anti-Patterns

Pitfall 1: Using Sequential IDs as Short Codes

Pitfall 2: Not Handling URL Expiration

Pitfall 3: Storing Only Short Code in Cache

Observability Checklist

Metrics to Capture

Logs to Emit

Alerts to Configure

Security Checklist

Trade-off Analysis

When to Choose Each Option

Interview Questions

Further Reading

Conclusion

Quick Recap Checklist

Core Functionality

Performance

Availability

Security

Observability

Category

Tags

Related Posts

System Design: Netflix Architecture for Global Streaming

System Design: Twitter Feed Architecture and Scalability

Amazon Architecture: Lessons from the Pioneer of Microservices