Time-Series Databases: Handling Time-Ordered Data

Learn how time-series databases like InfluxDB and TimescaleDB handle time-ordered data, downsampling, retention policies, and IoT analytics.



Time-series data shows up everywhere: sensor readings, stock prices, application metrics, user events. Any data where timing matters qualifies. General-purpose databases can store it, but they struggle as the volume grows. Time-series databases exist because they are built for this specific access pattern.

I have seen teams try to store IoT sensor data in PostgreSQL, watching their queries slow down as the data grew. The solution is not always more indexes. Sometimes you need a database designed around the temporal nature of your data.

What Makes Data Time-Series

Time-series data has specific characteristics. Each data point includes a timestamp. Data arrives in time order. You typically query by time ranges. Old data is often deleted or downsampled after a period.

The access patterns differ from transactional data. You do not update time-series records. You append new points and query ranges. Aggregations over time windows are common, and high-resolution data is routinely downsampled into lower-resolution summaries.

> INSERT temperature,location=room1 value=22.5 1700000000000000000
> INSERT temperature,location=room1 value=22.7 1700000001000000000
> INSERT temperature,location=room2 value=23.1 1700000002000000000

The measurement is temperature, tags include location, and the value is the reading. The timestamp is nanoseconds since epoch.
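The line-protocol format above can be generated with a small helper. This is a simplified sketch: real line protocol also escapes spaces and commas in tag values and marks integer fields with an "i" suffix.

```python
def to_line_protocol(measurement, tags, fields, ts_ns):
    """Build a simplified InfluxDB line-protocol string:
    measurement,tag=value field=value timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = to_line_protocol("temperature", {"location": "room1"},
                        {"value": 22.5}, 1700000000000000000)
# line == "temperature,location=room1 value=22.5 1700000000000000000"
```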

InfluxDB Architecture

InfluxDB is built for time-series workloads. It uses a custom storage engine called the Time-Structured Merge Tree (TSM). Data is compressed and organized by time ranges. This makes time-range queries extremely fast.

InfluxDB uses the concept of measurements, tags, and fields. Tags are indexed and useful for filtering. Fields are not indexed. The combination of measurement name, tag set, and timestamp uniquely identifies a point.

flowchart TD
    subgraph Ingest["Write Path"]
        W["Write Request"]
        WAL["WAL<br/>(Write-Ahead Log)"]
        Cache["Data Cache<br/>(In-Memory)"]
        TSM["TSM Files<br/>(Compressed, Sorted)"]
        Compaction["Compaction"]
    end

    subgraph Query["Read Path"]
        Q["Query Engine"]
        RS["Read Service"]
    end

    subgraph Chunk["Time Chunking"]
        Chunk1["Chunk 1<br/>t=0 to t=1h"]
        Chunk2["Chunk 2<br/>t=1h to t=2h"]
        Chunk3["Chunk 3<br/>t=2h to t=3h"]
    end

    W --> WAL --> Cache -->|"Flush when full"| TSM
    TSM --> Compaction --> TSM
    Q --> RS --> Chunk1 & Chunk2 & Chunk3

InfluxDB organizes data into time chunks — each chunk covers a specific time range (typically 1 hour). Writes hit the WAL first for durability, then go to the in-memory cache. When the cache fills, it flushes to a TSM file. Compaction runs in the background, merging and compressing older TSM files. Queries read only the chunks that overlap with the requested time range, which is why the time-range filter is the most important optimization in InfluxDB.
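The pruning step is the core idea: given chunks that each cover a half-open time interval, a query only touches the chunks overlapping the requested range. A toy sketch in Python (not InfluxDB's actual implementation):

```python
def chunks_for_range(chunks, q_start, q_end):
    """Return chunks whose half-open [start, end) interval overlaps
    the query range [q_start, q_end). All other chunks are skipped."""
    return [(s, e) for s, e in chunks if s < q_end and e > q_start]

hourly = [(0, 3600), (3600, 7200), (7200, 10800)]  # three 1-hour chunks
overlapping = chunks_for_range(hourly, 1800, 5400)
# overlapping == [(0, 3600), (3600, 7200)]  -- chunk 3 is never read
```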

The write path in sequence:

sequenceDiagram
    participant Client
    participant WAL as Write-Ahead Log
    participant Cache as Data Cache
    participant TSM as TSM File
    participant Compactor

    Client->>WAL: Write point (timestamp, tags, fields)
    WAL->>WAL: Persist to disk immediately
    WAL->>Cache: Pass write to cache
    Cache->>Cache: Index write in memory
    Cache-->>Client: Acknowledge write

    Note over Cache: Cache reaches full threshold
    Cache->>TSM: Flush cache to TSM file
    TSM->>TSM: Compress and sort by time

    Note over Compactor,TSM: Background compaction
    Compactor->>TSM: Merge multiple TSM files
    Compactor->>TSM: Rewrite compacted files

Writes get acknowledged once the WAL persists and the cache indexes them. Not when TSM files are written. That is why InfluxDB is fast on the write side. TSM flushes and compaction are background jobs.

SELECT mean(value) FROM temperature
WHERE time > now() - 7d
GROUP BY time(1h), location

This query aggregates temperature readings into hourly averages, broken down by location. The GROUP BY time syntax is native to InfluxDB and efficient because data is already organized by time.

TimescaleDB: PostgreSQL for Time-Series

TimescaleDB extends PostgreSQL with time-series capabilities. If you know PostgreSQL, you already know TimescaleDB. It adds automatic time-based partitioning, splitting your data into chunks behind a single logical table called a hypertable.

SELECT time_bucket('1 hour', time) AS hour,
       location,
       avg(value) as avg_temp
FROM temperature
WHERE time > now() - interval '7 days'
GROUP BY hour, location
ORDER BY hour;

The time_bucket function groups by hour. Behind the scenes, TimescaleDB manages the partitions, automatically creating new chunks as time progresses and dropping old ones based on retention policies.

TimescaleDB excels when you want PostgreSQL capabilities with time-series performance. You get JSON support, full-text search, and all PostgreSQL features alongside time-series optimization.

Retention Policies

Time-series databases handle data lifecycle automatically. You define retention policies that determine how long data is kept at different resolutions.

CREATE RETENTION POLICY "one_day" ON "sensor_data"
DURATION 1d
REPLICATION 1
SHARD DURATION 1h

CREATE RETENTION POLICY "one_year" ON "sensor_data"
DURATION 365d
REPLICATION 1
SHARD DURATION 1w

Raw sensor data might be kept for one day at full resolution. Older data is downsampled into hourly summaries and kept for a year. After a year, it is deleted.

This automatic data management is valuable for IoT and monitoring use cases. You ingest high-resolution data for real-time analysis, and older data automatically rolls up into summaries.

Downsampling and Data Aggregation

Downsampling reduces data resolution over time. A million raw points become a thousand hourly averages. Queries run faster on the downsampled data.

SELECT mean(value) AS value
INTO temperature_hourly
FROM temperature
WHERE time > now() - 7d
GROUP BY time(1h), location

This query computes hourly averages and writes them to a new measurement. In practice you register it as a continuous query that runs periodically, automatically downsampling incoming data.
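The rollup that the continuous query performs is simple in principle. A plain-Python sketch of the same hourly-mean aggregation, assuming epoch-second timestamps (illustrative only, not how either database implements it):

```python
from collections import defaultdict

def downsample_hourly(points):
    """points: iterable of (epoch_seconds, value).
    Returns {hour_start: mean of values in that hour}."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % 3600].append(value)  # align to hour boundary
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

downsample_hourly([(0, 1.0), (600, 3.0), (3600, 5.0)])
# → {0: 2.0, 3600: 5.0}
```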

TimescaleDB uses a similar approach with continuous aggregates:

CREATE MATERIALIZED VIEW temperature_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS hour,
       location,
       avg(value) as avg_temp
FROM temperature
GROUP BY hour, location;

The materialized view is automatically refreshed based on a refresh policy.

Use Cases

IoT Sensor Data

IoT deployments generate massive amounts of time-series data. Temperature, humidity, pressure, motion. Sensors report readings every few seconds. You need to store this, query it, and visualize it.

Time-series databases handle the write volume and query patterns. You can store millions of points per second. Queries aggregate across time ranges efficiently.
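Hitting that write volume in practice requires batching on the client side. A minimal batching wrapper, where send is a placeholder for whatever ships a batch of points to your database:

```python
class BatchWriter:
    """Buffer points and flush them in batches instead of one write per point."""

    def __init__(self, send, batch_size=5000):
        self.send = send            # callable that ships a list of points
        self.batch_size = batch_size
        self.buffer = []

    def write(self, point):
        self.buffer.append(point)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(list(self.buffer))  # ship a copy, then reset
            self.buffer = []
```

Call flush once more on shutdown so the tail of the buffer is not lost.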

Application Metrics

Application performance monitoring generates time-series data. Request rates, latency percentiles, error rates, resource utilization. Tools like Prometheus use time-series databases under the hood.

Dashboards query metrics by time range, aggregate across services, and correlate across metric types. Time-series databases support the query patterns monitoring tools need.

Financial Data

Stock prices, trade volumes, order book deltas. Financial data is naturally time-series. High-frequency trading systems generate enormous volumes. Analytics queries aggregate across time windows.

TimescaleDB handles financial data well when you need SQL capabilities alongside time-series optimization. InfluxDB excels for pure write throughput and downsample-heavy workloads.

Query Performance Considerations

Time-series queries have common patterns. Filtering by time range first narrows the data significantly. Then aggregation across that subset.

Indexing strategies differ from transactional databases. The timestamp is the primary access pattern, so time-based partitioning is key. Tag indexes help filter within time ranges.

Compression reduces storage significantly. Time-series data often compresses well because adjacent points are similar. InfluxDB and TimescaleDB both handle compression automatically.
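One reason the compression works so well is delta encoding: regular timestamps and slowly changing values turn into long runs of small, highly compressible integers. A sketch of the idea:

```python
def delta_encode(values):
    """Keep the first value, then store successive differences.
    Regularly spaced timestamps become runs of identical small ints."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]

timestamps = [1700000000, 1700000001, 1700000002, 1700000003]
delta_encode(timestamps)
# → [1700000000, 1, 1, 1]
```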

Time-Series Database Comparison

| Feature | InfluxDB | TimescaleDB | QuestDB | ClickHouse |
| --- | --- | --- | --- | --- |
| License | MIT (OSS), commercial cloud | Apache 2 (OSS), Timescale Cloud | Apache 2 | Apache 2 |
| SQL support | InfluxQL + Flux | Full SQL (PostgreSQL) | Full SQL | Full SQL |
| Write throughput | Very high (100k+ pts/s) | High (tied to PostgreSQL) | Very high (100k+ pts/s) | Very high (100k+ pts/s) |
| Query performance | Fast for time-range | Fast with chunking | Very fast columnar | Excellent for analytics |
| Compression | TSM engine (built-in) | PostgreSQL-based | Columnar, highly optimized | Best-in-class columnar |
| Downsampling | Continuous queries | Continuous aggregates | Partitioned by time | Materialized views |
| Retention policies | Native, automatic | Native via chunking | Manual | Manual via TTL |
| Ecosystem | Telegraf, Flux, cloud | PostgreSQL tooling | Kafka, Telegraf, Python | Kafka, Superset, Grafana |
| Operational complexity | Low (single binary) | Medium (PostgreSQL-based) | Low (single binary) | Higher (cluster setup) |
| Best for | Pure TSDB workloads, metrics | Existing PG teams needing TSDB | Ultra-low latency, Kafka ingestion | Analytics-heavy workloads with time-series |

When to Use and When Not to Use Time-Series Databases

When to Use Time-Series Databases:

  • You have high-volume data with timestamps where timing is essential
  • Your queries aggregate data over time windows (hourly averages, daily max/min)
  • You need automatic data retention and downsampling policies
  • You are building IoT sensor monitoring, application performance monitoring, or financial analytics
  • Write throughput is more important than arbitrary querying flexibility
  • You need specialized time-series functions like gap filling, interpolation, or smoothing

When Not to Use Time-Series Databases:

  • Your data has complex relationships requiring multi-table joins
  • You need arbitrary ad-hoc queries across non-time dimensions
  • Your use case is primarily single-record lookups by ID
  • You need strong transactional guarantees across multiple entity types
  • Your data volume is low and a regular relational database handles it fine

Production Failure Scenarios

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Write throughput exceeding capacity | Data loss, dropped points, increased latency | Implement buffering, batch writes, scale ingestion horizontally |
| Retention policy misconfiguration | Accidental data loss or excessive storage usage | Test policies on small datasets first, implement backups before changes |
| Compaction backlog | Query performance degrades over time | Monitor compaction progress, tune compaction settings, scale resources |
| Query timeout on large ranges | Dashboard timeouts, incomplete data views | Implement query timeouts, use downsampled data for long ranges, paginate |
| Timezone handling bugs | Incorrect aggregations, data aligned to wrong periods | Normalize to UTC internally, handle timezone conversion at query time |
| High cardinality tag explosion | Memory exhaustion, slow queries | Limit unique tag values, use coarser-grained tags, implement tag cardinality limits |
| Shard/index corruption | Missing or incorrect query results | Enable checksums, regular integrity checks, replica verification |
| Downsampling continuous query failure | Incomplete aggregated data | Monitor continuous query status, implement alerting on query errors |

Observability Checklist

Metrics to Monitor:

  • Ingest rate (points per second) and compare against capacity
  • Write latency and error rates
  • Query latency by query type and time range
  • Disk usage and storage growth rate
  • Compression ratio (raw vs compressed size)
  • Shard cardinality and partition counts
  • Retention policy execution status
  • Continuous query execution and error rates
  • Query result cache hit ratios (if applicable)

Logs to Capture:

  • Continuous query errors and restarts
  • Shard creation and deletion events
  • Compaction progress and I/O patterns
  • Authentication and authorization failures
  • Subscription status changes
  • Query engine errors and timeouts

Alerts to Set:

  • Ingest rate approaching capacity limits
  • Write latency exceeding SLA thresholds
  • Disk usage > 80% or growth rate anomaly
  • Query latency spike (p99 > threshold)
  • Continuous query failures
  • Replication lag > threshold
  • Compactions falling behind
-- TimescaleDB: Check chunk status
SELECT show_chunks('temperature', older_than => INTERVAL '7 days');
SELECT hypertable_detailed_size('temperature');
SELECT * FROM timescaledb_information.jobs;

-- InfluxDB: Check system stats
SHOW STATS;
SHOW DIAGNOSTICS;

Security Checklist

  • Enable authentication and use strong credentials
  • Implement TLS for all client and inter-node connections
  • Use role-based access control to limit database and measurement access
  • Enable audit logging for sensitive operations
  • Encrypt data at rest (filesystem or volume encryption)
  • Implement network segmentation (not directly internet-accessible)
  • Sanitize user inputs to prevent injection via query parameters
  • Use secure credential storage (not hardcoded in configs)
  • Regularly rotate credentials and review access patterns
  • Test restore procedures and backup integrity

Common Pitfalls and Anti-Patterns

  1. High cardinality tag values: Using user IDs, email addresses, or other high-uniqueness values as tags causes memory issues. Tags should have bounded cardinality (thousands, not millions of unique values).

  2. Not using bulk writes: Writing point-by-point is extremely inefficient. Batch writes into groups of 1000-5000 points for optimal throughput.

  3. Ignoring retention policies: Without proper retention, storage grows unbounded. Define retention upfront and test that it actually deletes data.

  4. Querying too wide time ranges: Queries spanning years cause timeouts. Use downsampled data for long ranges, only query raw data for recent windows.

  5. Storing non-time-series data: Forcing unrelated data into a time-series model creates awkward schemas. Use the right tool for each data type.

  6. Not understanding compaction behavior: Compactions cause I/O spikes and temporary query slowdowns. Schedule intensive compactions during off-peak hours.

  7. Underestimating downsample precision loss: Aggregated data loses granularity. Ensure your downsampling strategy preserves the precision your queries need.

  8. Forgetting about timezone handling: Storing local times without timezone info causes subtle bugs when aggregating across regions.

Capacity Estimation: Retention, Partitioning, and Write Throughput

Sizing a time-series database starts with write throughput. InfluxDB handles roughly 100k-500k points per second per node depending on hardware and compression. If you have 10,000 sensors reporting every 5 seconds, that is 2,000 points per second — well within a single node’s capacity. If you have 1 million meters reporting every second, that is 1 million points per second, which needs a cluster.

The storage formula: points per second × bytes per point × retention in seconds. A sensor sending 10 fields at roughly 50 bytes per point, 1 point per second, kept for 90 days: 50 × 86,400 × 90 ≈ 389MB per sensor. Scale that to 100,000 sensors and you need roughly 39TB of raw storage. With compression (InfluxDB typically achieves 3-10x compression on time-series data), that shrinks to roughly 4-13TB.
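The formula is easy to fold into a reusable helper. The compression ratio is an assumption you should replace with a measured value from your own data:

```python
def storage_bytes(points_per_sec, bytes_per_point, retention_days,
                  sensors=1, compression=1.0):
    """Raw storage = write rate x point size x retention in seconds,
    scaled by sensor count and divided by an assumed compression ratio."""
    raw = points_per_sec * bytes_per_point * retention_days * 86_400 * sensors
    return raw / compression

per_sensor = storage_bytes(1, 50, 90)
# per_sensor == 388_800_000 bytes, roughly 389MB per sensor
```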

TimescaleDB chunk sizing matters for performance. TimescaleDB splits data into chunks by time interval. The chunk interval determines how much data sits in each partition. Too-large chunks defeat the purpose of partitioning — queries scan too much data. Too-small chunks create too many partitions and cause overhead. The rule of thumb: target 1-10 chunks per query for the time ranges you typically query. For 90-day retention querying data in 7-day windows, daily chunks (90 chunks total, 7-8 per query) work well.

For TimescaleDB, estimate memory as chunk_size × active_chunks_in_memory × row_width. If each row is 100 bytes, rows arrive once per second, and you keep 7 days of data hot across 100 concurrent queries, that is roughly 100 bytes × (7 × 86,400) rows × 100 ≈ 6GB. Add 30% for index overhead if you have indexes on tag columns.

InfluxDB TSM file sizing: each TSM file covers a shard group time range. With 1-hour shard duration and 7-day retention, each shard holds 1 hour of data. For 10,000 points per second, that is 36 million points per shard. At 50 bytes per point compressed, each shard is roughly 1.8GB on disk. Monitor the TSM file-store disk-usage metrics to track actual sizes.

Real-World Case Study: Grafana Labs and Prometheus at Scale

Grafana Labs runs one of the largest publicly known Prometheus/metrics deployments in the world. Their use case: storing and querying metrics from millions of active series across thousands of customers, with users querying across time ranges from minutes to months.

Their problem was not ingestion — Prometheus handles that reasonably well. The problem was long-range queries on large datasets. A query spanning 30 days across 10,000 series generates millions of data points. Running this on vanilla Prometheus caused query timeouts and OOM kills on the servers.

Their solution was aggressive downsampling and a tiered storage approach. They stored raw metrics for 2 hours (full resolution), then continuously downsampled to 1-minute resolution for 2 weeks, then to 1-hour resolution for 90 days. Queries for recent data hit the raw store and return in milliseconds. Queries for longer ranges hit the downsampled data and return in seconds instead of timing out.

The operational insight: they built tooling to automatically determine the right downsampling intervals based on query patterns. If a metric is never queried beyond 7 days, they do not keep it at full resolution past 7 days. If a metric is queried at 30-day resolution, they downsample it to 1-hour granularity at ingestion time.

The lesson: time-series databases handle ingestion efficiently, but query performance over long ranges requires upfront planning of your downsampling strategy. The best time-series database cannot save a query pattern that requires scanning too much uncompressed data. Design your retention and downsampling around your actual query patterns, not just your data volume.

Interview Questions

Q: You are designing a time-series database for 50,000 IoT sensors. Each sensor sends 1 reading per minute, with 10 fields per reading. You need 90-day retention. How do you size storage and choose between InfluxDB and TimescaleDB?

Storage math first: 50,000 sensors × 1 reading/minute × 60 minutes × 24 hours × 90 days = 6.48 billion points. At roughly 50 bytes per point uncompressed, that is 324GB raw. With 5x compression (typical for time-series), you are at roughly 65GB on disk. That fits easily on a single node for both InfluxDB and TimescaleDB.

For the technology choice: if your team knows PostgreSQL and needs features like full-text search, JSON support, or complex joins across sensor metadata, TimescaleDB is a natural fit. If you need maximum write throughput and are comfortable with InfluxQL or Flux, InfluxDB handles this write volume easily. The deciding factor is usually team expertise and whether you need PostgreSQL compatibility. At 50,000 sensors and 1 reading per minute, you are well within single-node capacity for either.

Q: Your InfluxDB queries are timing out on 30-day range queries even though you have retention for 90 days. What is happening and how do you fix it?

The most likely cause is that you are querying raw data across too wide a range. If you have 1-minute resolution data kept for 90 days and you query 30 days at a time, you are scanning 43,200 minutes of data per series. If you have thousands of series in a single query, that is tens of millions of points to decompress and aggregate.

The fix is downsampling. Create a continuous query that rolls up your 1-minute data into 1-hour or 1-day aggregates. Query the downsampled data for long ranges and only hit raw data for recent windows. Alternatively, if you cannot downsample, check whether your tag set is causing high cardinality — too many unique tag values force InfluxDB to scan more series than necessary even within a time range.

Q: What is the difference between InfluxDB’s TSM engine and TimescaleDB’s chunking approach?

InfluxDB’s TSM engine organizes data by time into TSM files, each covering a shard group time range. Within each TSM file, data is sorted by measurement, tag set, and field, then compressed. Queries read only the TSM files that overlap with the requested time range. TimescaleDB’s chunking is similar but built on PostgreSQL table partitioning — each chunk is a separate PostgreSQL table, and the TimescaleDB hypertable is a view over all chunks. Queries against a time range hit only the relevant chunks, and the PostgreSQL query planner handles it.

The operational difference: TimescaleDB chunks are PostgreSQL tables, so you can query them with standard SQL, join them to other PostgreSQL tables, and use PostgreSQL tooling. InfluxDB TSM files are proprietary and opaque — you cannot query them directly and must use InfluxQL or Flux.

Q: How do you handle timezone-aware queries in time-series data?

Store all timestamps in UTC internally. This eliminates a whole class of bugs where local time zones cause aggregation misalignment. Most time-series databases store timestamps in UTC by default. At query time, convert to the display timezone in your query language. In InfluxQL: GROUP BY time(1h) TZ('America/New_York'). In TimescaleDB: time_bucket('1 hour', time AT TIME ZONE 'UTC' AT TIME ZONE 'America/New_York'). Never store local times without timezone information; daylight saving transitions create edge cases that are painful to debug.
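The store-in-UTC, convert-at-display rule applies in application code too. A sketch using Python's standard zoneinfo module:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_display_tz(utc_dt, tz_name):
    """Timestamps are stored in UTC; convert to the display zone
    only at query/render time."""
    return utc_dt.astimezone(ZoneInfo(tz_name))

stored = datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc)
local = to_display_tz(stored, "America/New_York")
# local.hour == 10  (EST is UTC-5 in January)
```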

Quick Recap

Key Bullets:

  • Time-series databases are purpose-built for timestamped data with time-range query patterns
  • InfluxDB excels at high write throughput with native downsampling
  • TimescaleDB extends PostgreSQL with time-series capabilities for teams familiar with SQL
  • Retention policies automate data lifecycle management
  • Downsampling preserves query performance as data ages
  • Tag cardinality is the most common scalability bottleneck

Copy/Paste Checklist:

-- TimescaleDB: Create continuous aggregate for downsampling
CREATE MATERIALIZED VIEW temperature_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS hour,
       location,
       avg(value) as avg_temp
FROM temperature
GROUP BY hour, location;

-- InfluxDB: Create retention policy and continuous query
CREATE RETENTION POLICY "one_day" ON "sensor_data"
DURATION 1d REPLICATION 1 SHARD DURATION 1h;

CREATE CONTINUOUS QUERY "downsample_1h" ON "sensor_data"
BEGIN SELECT mean(value) INTO "temperature_hourly"
FROM "temperature" GROUP BY time(1h), location END;

-- InfluxQL: Query with time zone handling
SELECT mean(value) FROM temperature
WHERE time >= '2024-01-01T00:00:00Z'
  AND time < '2024-01-02T00:00:00Z'
  GROUP BY time(1h), location TZ('America/New_York');

Conclusion

Time-series databases solve problems regular databases handle poorly. Purpose-built storage engines, automatic data lifecycle management, and time-aware query optimization make them the right choice for sensor data, metrics, and financial data.

InfluxDB offers high write throughput and native downsampling. TimescaleDB brings these capabilities to PostgreSQL users who want SQL compatibility. Both handle the time-range query patterns that time-series workloads require.

If you are storing IoT data, metrics, or any high-volume time-stamped data in a regular database, consider whether a time-series database would simplify your architecture.

For related reading, see NoSQL Databases to understand other NoSQL options, and Database Scaling to learn about scaling strategies for high-volume data.

If you are building pipelines that feed TSDBs, Apache Kafka handles event streaming, and Apache Flink does stateful stream processing — useful for pre-aggregating data before it lands in your time-series store.
