Lab 09: Time-Series Databases

Time: 50 minutes | Level: Architect | DB: TimescaleDB (PostgreSQL extension)


🎯 Objective

Build a time-series solution with TimescaleDB: hypertables, time_bucket() aggregation, continuous aggregates, compression, and retention policies. Compare performance vs plain PostgreSQL.


📚 Background

Why TimeSeries Databases?

Time-series data: measurements indexed by time (IoT, metrics, logs, financial ticks).

Challenges with plain PostgreSQL:

  • Huge write rates (millions of rows/hour)

  • Queries always filter by time range

  • Recent data is hot; old data is cold

  • Aggregations across time windows are common

TimescaleDB solves these with hypertables (automatic time-based partitioning), columnar compression (95%+ ratio), and continuous aggregates.

TimescaleDB vs InfluxDB

Feature
TimescaleDB
InfluxDB

Query language

SQL (PostgreSQL-compatible)

Flux / InfluxQL

Data model

Relational tables

measurement/tags/fields

Joins

Yes (PostgreSQL)

Limited

Open source

Yes (Apache/Timescale)

Yes (core)

Ecosystem

Full PostgreSQL ecosystem

InfluxDB-specific

Best for

SQL-centric teams, relational + timeseries

Pure metrics/IoT

InfluxDB Data Model


Step 1: Start TimescaleDB

📸 Verified Output:


Step 2: Create Hypertable

📸 Verified Output:


Step 3: Insert 1 Million Sensor Rows

📸 Verified Output:


Step 4: time_bucket() Aggregation

📸 Verified Output:


Step 5: Continuous Aggregates

📸 Verified Output:


Step 6: Compression

📸 Verified Output:

💡 95% compression ratio is typical for TimescaleDB with columnar compression. 1 TB of raw time-series data → ~50 GB compressed.


Step 7: Retention Policy & Performance Benchmark

📸 Verified Output:


Step 8: InfluxDB Concepts & Line Protocol

📸 Verified Output:


Summary

Concept
Key Takeaway

Hypertable

create_hypertable() — automatic time-based partitioning

Chunk

Automatic time partition (e.g., 1 day). Query only scans relevant chunks

time_bucket()

GROUP BY time intervals: time_bucket('1 hour', time)

Continuous aggregate

Auto-refreshing materialized view for time-series aggregations

Compression

compress_chunk() — 95%+ compression; compressed chunks still queryable

Retention policy

Auto-drop old chunks: add_retention_policy(interval)

InfluxDB Line Protocol

measurement,tag=val field=val timestamp

Flux

InfluxDB query language: pipeline of `

Benchmark

TimescaleDB: 5x faster than plain PostgreSQL for time-range queries

💡 Architect's insight: TimescaleDB is the right choice when your team knows SQL. You get time-series performance (automatic partitioning, compression) without learning a new query language. InfluxDB excels for pure DevOps metrics with Grafana.

Last updated