ELK Stack: Elasticsearch, Logstash, Kibana, and Beats

Complete guide to the ELK Stack for log aggregation and analysis. Learn Elasticsearch indexing, Logstash pipelines, Kibana visualizations, and Beats shippers.


ELK Stack Deep Dive: Elasticsearch, Logstash, Kibana, and Beats

The ELK Stack is a popular open-source solution for centralized logging. It lets you collect logs from multiple sources, transform them into a structured format, store them efficiently, and query them interactively.

This guide covers each component in depth. If you are new to logging concepts, start with our Logging Best Practices guide first.

ELK Stack Architecture

graph LR
    A[Log Sources] -->|Shippers| B[Beats]
    B --> C[Logstash]
    C --> D[Elasticsearch]
    D --> E[Kibana]
    A -->|Direct| D

The ELK Stack has four main components:

  • Beats: Lightweight shippers that collect data from various sources
  • Logstash: Transforms and enriches data during transit
  • Elasticsearch: Stores and indexes data for fast search
  • Kibana: Visualizes and explores data
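For local experimentation, the four components are often wired together with Docker Compose. A minimal sketch, assuming the official Elastic images (the version tag, single-node discovery, and disabled security are illustrative conveniences, not production settings):

```yaml
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.13.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false   # demo only; enable security in production
    ports:
      - "9200:9200"
  logstash:
    image: docker.elastic.co/logstash/logstash:8.13.0
    ports:
      - "5044:5044"
    depends_on:
      - elasticsearch
  kibana:
    image: docker.elastic.co/kibana/kibana:8.13.0
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
```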

Elasticsearch

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It stores documents in JSON format and provides powerful query capabilities.

Core Concepts

Concept  | Description
Index    | Collection of documents, similar to a database
Document | A single JSON record, similar to a row
Shard    | A partition of an index for horizontal scaling
Replica  | A copy of a shard for high availability
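Concretely, indexing a document is a single JSON request; if the target index does not exist, Elasticsearch creates it on first write. An illustrative example (the index name and fields mirror the mapping used later in this guide):

```json
POST logs-2026.03.22/_doc
{
  "@timestamp": "2026-03-22T10:15:00Z",
  "level": "ERROR",
  "service": "api-gateway",
  "trace_id": "abc123",
  "message": "upstream timeout after 5000ms"
}
```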

Index Lifecycle Management

Define policies to manage index data from creation to deletion:

PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "7d",
            "max_primary_shard_size": "50gb"
          },
          "set_priority": { "priority": 100 }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": { "priority": 50 }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "freeze": {},
          "set_priority": { "priority": 0 }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
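To reason about what this policy does, it helps to map an index's age to the phase it lands in. A small illustrative Python helper (the thresholds mirror the policy above; note this is a simplification — ILM measures phase age from rollover, not from index creation):

```python
def ilm_phase(age_days: float) -> str:
    """Return the ILM phase an index of the given age falls into,
    using the min_age thresholds from the logs-policy example:
    hot < 7d, warm 7-30d, cold 30-365d, delete >= 365d."""
    if age_days >= 365:
        return "delete"
    if age_days >= 30:
        return "cold"
    if age_days >= 7:
        return "warm"
    return "hot"

print(ilm_phase(2))    # hot
print(ilm_phase(10))   # warm
print(ilm_phase(90))   # cold
print(ilm_phase(400))  # delete
```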

Mapping and Index Templates

Index templates define mappings and settings for new indices:

PUT _index_template/logs-template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logs-policy"
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "level": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "service": {
          "type": "keyword"
        },
        "trace_id": {
          "type": "keyword"
        },
        "user_id": {
          "type": "keyword"
        },
        "duration_ms": {
          "type": "long"
        },
        "host": {
          "properties": {
            "name": { "type": "keyword" },
            "ip": { "type": "ip" }
          }
        }
      }
    }
  }
}

Querying Elasticsearch

GET logs-2026.03.22/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "service": "api-gateway" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ],
      "filter": [
        { "term": { "level": "ERROR" } }
      ]
    }
  },
  "sort": [
    { "@timestamp": "desc" }
  ],
  "aggs": {
    "error_by_service": {
      "terms": { "field": "service" },
      "aggs": {
        "error_rate": {
          "avg": { "field": "error_count" }
        }
      }
    }
  }
}

Logstash

Logstash processes and transforms data before it reaches Elasticsearch. It handles complex parsing, enrichment, and filtering.

Logstash Pipeline

graph TB
    A[Input] --> B[Filter]
    B --> C[Output]

A pipeline has three sections: input, filter, and output.

Input Plugins

# Receive logs from Beats
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/ssl/certs/logstash.crt"
    ssl_key => "/etc/ssl/private/logstash.key"
  }

  # Alternative: direct HTTP
  http {
    port => 8080
    content_type => "application/json"
  }
}

Filter Plugins

Filters transform and enrich data:

filter {
  # Parse JSON logs
  json {
    source => "message"
    target => "parsed"
  }

  # Parse timestamp
  date {
    match => ["[parsed][timestamp]", "ISO8601"]
    target => "@timestamp"
  }

  # Extract fields from message
  grok {
    match => {
      "[parsed][message]" => "%{DATA:level}\s*%{DATA:logger}\s*%{GREEDYDATA:log_message}"
    }
    overwrite => ["message"]
  }

  # Add computed fields
  mutate {
    add_field => {
      "environment" => "%{[parsed][env]}"
      "[@metadata][index_prefix]" => "logs-%{[parsed][service]}"
    }
  }

  # Enrich with GeoIP
  geoip {
    source => "[parsed][client_ip]"
    target => "[parsed][geoip]"
    database => "/etc/logstash/GeoLite2-City.mmdb"
  }

  # Parse query string
  kv {
    source => "[parsed][request_params]"
    field_split => "&"
    prefix => "param_"
  }
}
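The kv filter above behaves much like standard query-string parsing, so you can prototype what it will emit without running Logstash. A rough Python equivalent (the `param_` naming mirrors the `prefix` option; note that unlike the kv filter, `parse_qsl` also URL-decodes values):

```python
from urllib.parse import parse_qsl

def kv_parse(request_params: str, prefix: str = "param_") -> dict:
    """Split a query string on '&' and '=' the way the Logstash kv
    filter does, prefixing each resulting field name."""
    return {prefix + key: value for key, value in parse_qsl(request_params)}

print(kv_parse("user=42&page=3&sort=desc"))
# {'param_user': '42', 'param_page': '3', 'param_sort': 'desc'}
```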

Output Plugins

output {
  elasticsearch {
    hosts => ["https://elasticsearch:9200"]
    manage_template => false
    index => "%{[@metadata][index_prefix]}-%{+YYYY.MM.dd}"
    ssl => true
    cacert => "/etc/ssl/certs/ca.crt"
    user => "${ELASTICSEARCH_USER}"
    password => "${ELASTICSEARCH_PASSWORD}"
  }

  # Also send to stdout for debugging
  stdout {
    codec => rubydebug
  }
}

Complete Pipeline Example

input {
  beats {
    port => 5044
  }
}

filter {
  if [fields][log_type] == "application" {
    json {
      source => "message"
      target => "parsed"
    }

    date {
      match => ["[parsed][timestamp]", "ISO8601"]
      target => "@timestamp"
    }

    if [parsed][level] {
      mutate {
        add_field => { "level" => "%{[parsed][level]}" }
      }
    }

    if [parsed][exception] {
      mutate {
        add_tag => ["error"]
      }
    }
  }

  if [fields][log_type] == "access" {
    grok {
      match => {
        "message" => '%{IPORHOST:client_ip} %{DATA:ident} %{DATA:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:status:int} %{NUMBER:bytes:int} "%{DATA:referrer}" "%{DATA:user_agent}"'
      }
    }

    date {
      match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }

    geoip {
      source => "client_ip"
      target => "geoip"
    }

    useragent {
      source => "user_agent"
      target => "ua"
    }
  }
}

output {
  if "error" in [tags] {
    elasticsearch {
      hosts => ["https://elasticsearch:9200"]
      index => "logs-error-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["https://elasticsearch:9200"]
      index => "logs-%{[fields][log_type]}-%{+YYYY.MM.dd}"
    }
  }
}
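Grok patterns are easiest to debug offline, before the pipeline is deployed. The access-log pattern above corresponds roughly to this Python regex (a simplified sketch; the real IPORHOST and URIPATHPARAM grok patterns are stricter than the character classes used here):

```python
import re

# Simplified equivalent of the combined-access-log grok pattern above.
ACCESS_LOG = re.compile(
    r'(?P<client_ip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<status>\d+) (?P<bytes>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

line = ('203.0.113.7 - frank [22/Mar/2026:10:15:00 +0000] '
        '"GET /api/users?id=42 HTTP/1.1" 200 512 "-" "curl/8.5.0"')

fields = ACCESS_LOG.match(line).groupdict()
print(fields["client_ip"], fields["method"], fields["status"])
# 203.0.113.7 GET 200
```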

Beats

Beats are lightweight data shippers that send data from servers to Logstash or Elasticsearch.

Filebeat

Filebeat tails log files and ships them:

# filebeat.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/containers/*.log
    json:
      keys_under_root: true
      add_error_key: true
      message_key: log
    fields:
      log_type: container
    processors:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

  - type: log
    enabled: true
    paths:
      - /var/log/nginx/*.log
    fields:
      log_type: nginx
    processors:
      - add_locale: ~

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~

output.logstash:
  hosts: ["logstash:5044"]
  ssl.enabled: true
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  ssl.certificate: "/etc/filebeat/filebeat.crt"
  ssl.key: "/etc/filebeat/filebeat.key"

logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644

Metricbeat

Metricbeat collects system and service metrics:

metricbeat.modules:
  - module: system
    metricsets:
      - cpu
      - memory
      - network
      - process
      - diskio
    period: 10s
    processes: [".*"]

  - module: docker
    metricsets:
      - container
      - cpu
      - diskio
      - healthcheck
      - info
      - memory
      - network
    hosts: ["unix:///var/run/docker.sock"]
    period: 10s

  - module: nginx
    metricsets:
      - stubstatus
    hosts: ["http://nginx:8080/nginx_status"]
    period: 10s

output.elasticsearch:
  hosts: ["https://elasticsearch:9200"]
  ssl.enabled: true
  ssl.certificate_authorities: ["/etc/metricbeat/ca.crt"]

Heartbeat

Heartbeat monitors service availability with synthetic checks:

heartbeat.monitors:
  - type: http
    name: api-health-check
    schedule: "@every 30s"
    urls:
      - https://api.example.com/health
    check.response:
      status: 200
    fields:
      service: api-gateway

  - type: tcp
    name: redis-connectivity
    schedule: "@every 60s"
    hosts: ["redis:6379"]
    timeout: 5s

  - type: icmp
    name: host-ping
    schedule: "@every 5m"
    hosts: ["elasticsearch"]

output.elasticsearch:
  hosts: ["https://elasticsearch:9200"]

Kibana

Kibana provides the visualization and exploration interface for your Elasticsearch data.

Index Pattern Setup

Before exploring data, create an index pattern in Kibana:

  1. Navigate to Management > Stack Management > Index Patterns
  2. Click “Create index pattern”
  3. Enter logs-* as the pattern
  4. Select @timestamp as the time field

Building Visualizations

Error Rate Over Time

{
  "title": "Error Rate",
  "type": "line",
  "params": {
    "type": "line",
    "grid": { "categoryLines": false },
    "categoryAxes": [
      {
        "id": "CategoryAxis-1",
        "type": "category",
        "position": "bottom"
      }
    ],
    "valueAxes": [
      {
        "id": "ValueAxis-1",
        "name": "LeftAxis-1",
        "type": "value",
        "position": "left",
        "scale": {
          "type": "linear",
          "mode": "normal"
        }
      }
    ]
  },
  "aggs": [
    {
      "id": "1",
      "type": "avg",
      "schema": "metric",
      "params": {
        "field": "error_rate"
      }
    },
    {
      "id": "2",
      "type": "date_histogram",
      "schema": "segment",
      "params": {
        "field": "@timestamp",
        "interval": "auto"
      }
    }
  ]
}

Service Error Distribution

{
  "title": "Errors by Service",
  "type": "pie",
  "aggs": [
    {
      "id": "1",
      "type": "count",
      "schema": "metric"
    },
    {
      "id": "2",
      "type": "terms",
      "schema": "segment",
      "params": {
        "field": "service.keyword",
        "size": 10
      }
    }
  ]
}

Kibana Discover

Discover provides ad-hoc search and exploration:

// Sample Discover query
{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "ERROR" } },
        { "range": { "@timestamp": { "gte": "now-24h" } } }
      ]
    }
  },
  "sort": [{ "@timestamp": "desc" }],
  "fields": ["@timestamp", "level", "message", "service", "trace_id"],
  "filter": [
    {
      "meta": {
        "index": "logs-*",
        "negate": false,
        "params": {},
        "type": "phrase"
      },
      "query": {
        "match_phrase": {
          "service": "api-gateway"
        }
      }
    }
  ]
}

Kibana Dashboard Example

A complete dashboard might include:

  • Time series of log volume by level
  • Pie chart of error distribution by service
  • Table of recent errors with context
  • Heat map of errors over time by host
  • Metric visualization of error rate and latency percentiles

Deployment Considerations

Hardware Requirements

Component                | CPU      | RAM    | Disk
Elasticsearch (per node) | 4+ cores | 8GB+   | SSD, 500GB+
Logstash                 | 2+ cores | 4GB+   | Minimal
Kibana                   | 2 cores  | 2GB+   | Minimal
Beats                    | 1 core   | 512MB+ | Minimal

Elasticsearch is I/O intensive. Use SSDs and ensure adequate disk throughput.
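For memory, a widely used rule of thumb is to give the Elasticsearch JVM heap roughly half of a node's RAM, capped just under 32 GB so the JVM keeps compressed object pointers. An illustrative calculator (the 31 GB cap is the rule of thumb, not a hard limit):

```python
def recommended_heap_gb(ram_gb: float) -> float:
    """Half of node RAM, capped at 31 GB (compressed-oops threshold)."""
    return min(ram_gb / 2, 31.0)

print(recommended_heap_gb(8))    # 4.0
print(recommended_heap_gb(128))  # 31.0
```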

Security

# Enable security in elasticsearch.yml
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true

# API key authentication
xpack.security.api.key.enabled: true

# Role-based access control: file-backed roles are defined in
# roles.yml inside the Elasticsearch config directory

Scaling

Scale Elasticsearch horizontally by adding nodes. The cluster automatically rebalances shards.

# Minimum master nodes for cluster stability (Elasticsearch 6.x;
# removed in 7.0, where the voting configuration is managed automatically)
discovery.zen.minimum_master_nodes: 2  # for 3-node cluster

# Adjust shard allocation
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all",
    "cluster.routing.allocation.cluster_concurrent_rebalance": 2
  }
}

When to Use the ELK Stack

Use the ELK Stack when:

  • You need centralized logging from multiple services and environments
  • You need full-text search across log entries and application data
  • You need log analysis and pattern detection with Kibana
  • You need security analytics and threat detection
  • You need compliance audit logging and archival
  • You need infrastructure log aggregation (syslog, nginx, apache)

Don’t use the ELK Stack when:

  • You have simple applications with minimal logging needs
  • You only need metrics and dashboards (use Prometheus + Grafana instead)
  • You have high-volume streaming use cases (Kafka is better suited)
  • You need real-time alerting on log data (use dedicated alerting tools)
  • You need large-scale time-series metrics (Elasticsearch is not optimized for pure metrics)

ELK Stack vs Alternatives

Aspect              | ELK Stack                                | Loki                        | Splunk
Cost                | Open source (self-hosted)                | Open source (self-hosted)   | Commercial (expensive)
Storage efficiency  | Medium (indexed)                         | High (log-structured)       | Medium
Query language      | KQL (Kibana)                             | LogQL (Prometheus-style)    | SPL
Scalability         | Excellent (horizontal)                   | Excellent                   | Excellent
Ease of setup       | Moderate                                 | Easy                        | Easy
Full-text search    | Excellent                                | Limited                     | Excellent
Metrics integration | Via Metricbeat                           | Native Prometheus           | Native
Best for            | Complex log analysis, security analytics | High-volume Kubernetes logs | Enterprise compliance, security

Production Failure Scenarios

Failure                          | Impact                                        | Mitigation
Elasticsearch cluster red/yellow | Logs not indexing; search degraded            | Monitor cluster health; provision more shards; adjust replica settings
Logstash pipeline errors         | Logs stuck in queue; processing backlog       | Monitor pipeline errors; implement dead-letter queues; alert on queue depth
Hot tier disk saturation         | New indices cannot be created; ingestion fails| Monitor disk usage; implement ILM rollover; add nodes
Kibana performance degradation   | Slow searches; dashboards time out            | Optimize queries; use filter context; limit time ranges
Beats shipper failure            | Logs not forwarded; blind spots in coverage   | Monitor Beats health; implement local buffering; alert on forward failures
Index template mismatch          | Fields not indexed correctly; search failures | Version index templates; validate mappings; test before deployment

Observability Checklist

Infrastructure Monitoring

  • Elasticsearch cluster health (green/yellow/red)
  • Primary shard and replica distribution
  • Index count and size per index
  • Node resource utilization (CPU, heap, disk)
  • Search and indexing latency percentiles
  • JVM heap usage and GC frequency
  • Segment count and merge queue depth

Log Pipeline Monitoring

  • Beats shipper metrics (bytes sent, errors, lag)
  • Logstash pipeline throughput and latency
  • Logstash queue depth and worker utilization
  • Dead-letter queue size and age
  • Log parsing error rate

Kibana Monitoring

  • Search response time (p95, p99)
  • Dashboard load time
  • Visualization render time
  • Active users and session count

Data Management

  • Index count within expected bounds
  • Document count growth rate
  • Disk usage trend and forecasting
  • ILM policy execution success/failure
  • Archive tier accessibility

Security Checklist

  • Elasticsearch security enabled (XPack Security)
  • User authentication configured (LDAP, SAML, or built-in)
  • Role-based access control for indices and spaces
  • TLS encryption for all network traffic
  • API keys rotated regularly
  • Kibana spaces isolation (dev/staging/prod separation)
  • Audit logging enabled for security events
  • No sensitive data in index names or field names
  • Snapshot repositories secured and access logged
  • Cross-cluster search secured if used

Common Pitfalls / Anti-Patterns

1. Too Many Indices with Few Documents

Each index carries overhead. Too many small indices overwhelm the cluster:

// Bad: Index per day per service creates thousands of indices
PUT logs-service-a-2026.03.22
PUT logs-service-b-2026.03.22
// ... thousands more

// Good: Use rollover with larger time intervals
PUT logs-service-a
{
  "aliases": {
    "logs-service-a": { "is_write_index": true }
  }
}
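Rollover (as configured in the ILM policy earlier) fires when any one of its conditions is met. An illustrative check of the max_age / max_primary_shard_size conditions used in this guide:

```python
def rollover_due(age_days: float, primary_shard_size_gb: float,
                 max_age_days: float = 7, max_size_gb: float = 50) -> bool:
    """True when either rollover condition from the example
    logs-policy (max_age 7d, max_primary_shard_size 50gb) is met."""
    return age_days >= max_age_days or primary_shard_size_gb >= max_size_gb

print(rollover_due(3, 12))   # False
print(rollover_due(8, 12))   # True  (age threshold hit)
print(rollover_due(1, 55))   # True  (size threshold hit)
```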

2. Dynamic Field Mapping Without Controls

Dynamic mapping can create unexpected field types and blow up cardinality:

// Bad: Unrestricted dynamic mapping
{
  "mappings": {
    "dynamic": "true" // Creates any field
  }
}

// Good: Strict dynamic mapping or disabled
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "@timestamp": { "type": "date" },
      "level": { "type": "keyword" },
      "message": { "type": "text" }
    }
  }
}

3. Not Using Filter Context for Simple Queries

Filter context is faster because it does not score:

// Bad: Query context for term filter
{
  "query": {
    "match": { "level": "ERROR" } // Scores, slower
  }
}

// Good: Filter context for exact match
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "level": "ERROR" } } // No scoring, faster
      ]
    }
  }
}

4. Ignoring Index Lifecycle Management

Without ILM, indices grow unbounded and performance degrades:

// Good: ILM with hot/warm/cold/delete
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": { "rollover": { "max_age": "7d" } }
      },
      "warm": { "min_age": "7d", "actions": { "shrink": { "number_of_shards": 1 }, "forcemerge": { "max_num_segments": 1 } } },
      "cold": { "min_age": "30d", "actions": { "freeze": {} } },
      "delete": { "min_age": "365d", "actions": { "delete": {} } }
    }
  }
}

5. Loading Too Much Data into Memory

Kibana visualizations over large time ranges can exhaust heap memory:

// Bad: Visualize 90 days of minute-level data
{
  "query": { "range": { "@timestamp": { "gte": "now-90d" } } }
}

// Good: Use date histogram with appropriate interval
{
  "aggs": {
    "over_time": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "1h" // Or auto with proper configuration
      }
    }
  }
}

Quick Recap

Key Takeaways:

  • Beats collect, Logstash transforms, Elasticsearch stores, Kibana visualizes
  • Index lifecycle management prevents unbounded growth
  • Use filter context for exact matches; query context only when scoring needed
  • Monitor cluster health and pipeline metrics proactively
  • Implement security early: authentication, TLS, RBAC
  • Design index templates carefully to control field mapping

Copy/Paste Checklist:

# Check cluster health
GET _cluster/health?pretty

# Monitor index size and document count
GET _cat/indices?v&s=store.size:desc

# Check Elasticsearch ingest pipeline stats (Logstash exposes its own
# monitoring API at http://logstash:9600/_node/stats/pipelines)
GET _nodes/stats/ingest?filter_path=nodes.*.ingest

# ILM policy check
GET _ilm/policy/logs-policy?pretty

# Dead-letter queue inspection (applies when DLQ events are re-ingested
# into Elasticsearch with this tag via the dead_letter_queue input)
GET logs-*/_search?q=tags:_dead_letter_queue

# Index template validation
GET _index_template/logs-template?pretty

# Secure your cluster (Elasticsearch)
PUT _security/user/kibana_admin
{
  "password": "${KIBANA_PASSWORD}",
  "roles": ["kibana_admin"]
}

Conclusion

The ELK Stack provides a powerful platform for centralized logging and analysis. Beats collect data efficiently, Logstash transforms it into a structured format, Elasticsearch stores and indexes it, and Kibana makes it explorable.

Start with Filebeat shipping container logs to Elasticsearch, and build from there. Add Logstash for complex parsing, Kibana for visualizations, and ILM policies for efficient data retention.

For monitoring beyond logs, see our Prometheus & Grafana guide for metrics visualization. For distributed tracing, see the Jaeger and Distributed Tracing guides for correlating logs with request traces.


Related Posts

Logging Best Practices: Structured Logs, Levels, Aggregation

Master production logging with structured formats, proper log levels, correlation IDs, and scalable log aggregation. Includes patterns for containerized applications.

#observability #logging #monitoring

Alerting in Production: Building Alerts That Matter

Build alerting systems that catch real problems without fatigue. Learn alert design principles, severity levels, runbooks, and on-call best practices.

#data-engineering #alerting #monitoring

Audit Logging: Tracking Data Changes for Compliance

Implement audit logging for compliance. Learn row-level change capture with triggers and CDC, log aggregation strategies, and retention policies.

#database #audit #compliance