## What Is Time-Series Data?
Time-series data is a sequence of data points indexed by time. Examples include server metrics (CPU, memory), IoT sensor readings, financial tick data, application logs, and user behavior events. Time-series databases are optimized for high write throughput, efficient storage, and fast range queries over time intervals.
## Characteristics of Time-Series Workloads
| Characteristic | Typical Requirement |
|----------------|---------------------|
| Write throughput | Millions of data points per second |
| Data immutability | Appends only, rarely updated |
| Time-ordered queries | "Last 24 hours," "Average per minute this week" |
| Retention policies | Automatic data expiration |
| Downsampling | Roll up old data to lower resolution |
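The downsampling requirement in the table above can be sketched in plain Python: roll raw high-resolution points up to per-bucket averages. This is an illustrative sketch, not any database's implementation; the 60-second bucket width and the mean aggregate are assumptions.

```python
from collections import defaultdict

def downsample(points, bucket_seconds=60):
    """Average (epoch_ts, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate the timestamp to the start of its bucket
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: sum(vs) / len(vs) for start, vs in sorted(buckets.items())}

raw = [(0, 10.0), (30, 20.0), (60, 40.0), (90, 60.0)]
downsample(raw)  # two 60-second buckets: {0: 15.0, 60: 50.0}
```

A real retention policy then simply drops buckets (or raw points) older than some cutoff, keeping only the rolled-up resolution.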
## Time-Series DB Comparison
| Feature | InfluxDB | TimescaleDB | ClickHouse |
|---------|----------|-------------|------------|
| Architecture | Custom TS engine | PostgreSQL extension | Columnar OLAP |
| Query language | Flux / SQL | SQL (full) | SQL (custom dialect) |
| Write throughput | Very high | High | Very high |
| Compression | Excellent | Good (native) | Excellent |
| SQL compatibility | Partial | Full PostgreSQL | ClickHouse SQL |
| Joins | Limited | Full SQL joins | Limited |
| Deployment | Standalone | PostgreSQL-based | Standalone |
## InfluxDB
InfluxDB is a purpose-built time-series database with automatic downsampling and retention policies.
### Data Model
```
// Measurement: cpu
// Tags (indexed): host, region
// Fields (values): usage_user, usage_system, usage_idle
// Timestamp: automatic or explicit (Unix epoch)

cpu,host=server01,region=us-east1 usage_user=45.2,usage_system=12.1 1715385600
cpu,host=server01,region=us-east1 usage_user=47.8,usage_system=13.4 1715385601
cpu,host=server02,region=us-west1 usage_user=32.1,usage_system=8.9 1715385600
```
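The line-protocol text above can be reproduced with a small helper. This is a sketch to show the format's shape only; the official client's `Point` class handles the full escaping and type-suffix rules for you.

```python
def to_line_protocol(measurement, tags, fields, timestamp):
    """Serialize one point as: measurement,tag=... field=... timestamp."""
    # Tags are sorted by key, matching InfluxDB's recommended canonical order
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_str} {field_str} {timestamp}"

to_line_protocol(
    "cpu",
    {"host": "server01", "region": "us-east1"},
    {"usage_user": 45.2, "usage_system": 12.1},
    1715385600,
)
# → "cpu,host=server01,region=us-east1 usage_user=45.2,usage_system=12.1 1715385600"
```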
### Writing Data
```python
from datetime import datetime, timezone

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

# Write a single point
p = (
    Point("cpu")
    .tag("host", "server01")
    .tag("region", "us-east1")
    .field("usage_user", 45.2)
    .field("usage_system", 12.1)
    .time(datetime.now(timezone.utc))
)
write_api.write(bucket="metrics", record=p)

# Batch write for performance
# (metrics_batch: an iterable of your application's metric records)
points = []
for metric in metrics_batch:
    points.append(
        Point("sensor")
        .tag("device_id", metric.device_id)
        .field("temperature", metric.temp)
        .field("humidity", metric.humidity)
        .time(metric.timestamp)
    )
write_api.write(bucket="iot", record=points)
```
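For large backfills it can also help to chunk the record list client-side rather than handing the client one giant list. A minimal sketch; the batch size of 5000 is an assumption, not a library default.

```python
def batches(records, size=5000):
    """Yield successive fixed-size chunks of a record list."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

# Hypothetical usage with the write_api from above:
# for chunk in batches(points, size=5000):
#     write_api.write(bucket="iot", record=chunk)
```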
### Querying
```sql
-- InfluxDB 3 SQL: 1-minute averages over the last hour
SELECT
    date_bin(INTERVAL '1 minute', time) AS minute,
    avg(usage_user) AS avg_cpu
FROM cpu
WHERE
    host = 'server01'
    AND time >= now() - INTERVAL '1 hour'
GROUP BY minute
ORDER BY minute;
```

```sql
-- Downsampling with a continuous query (InfluxQL, InfluxDB 1.x)
CREATE CONTINUOUS QUERY cpu_1h ON mydb
BEGIN
    SELECT mean(usage_user) AS usage_user
    INTO cpu_1h
    FROM cpu
    GROUP BY time(1h), host
END
```
## TimescaleDB
TimescaleDB extends PostgreSQL with hypertables that transparently partition data by time.
### Setup
```sql
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Create a regular table
CREATE TABLE sensor_data (
    time        TIMESTAMPTZ NOT NULL,
    device_id   INTEGER NOT NULL,
    temperature DOUBLE PRECISION,
    humidity    DOUBLE PRECISION,
    pressure    DOUBLE PRECISION
);

-- Convert to hypertable
SELECT create_hypertable('sensor_data', 'time');

-- Add compression policy
ALTER TABLE sensor_data SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id'
);
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');
```
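`compress_segmentby = 'device_id'` groups rows for the same device into one compressed segment, so within a segment a column like `time` becomes a near-regular sequence that delta-encodes very well. A plain-Python sketch of the idea (the grouping and delta encoding are illustrative, not TimescaleDB's actual codec):

```python
from collections import defaultdict

def segment_and_delta_encode(rows):
    """Group (device_id, ts) rows by device, then delta-encode timestamps."""
    segments = defaultdict(list)
    for device_id, ts in rows:
        segments[device_id].append(ts)
    # Store the first timestamp, then only the (small, repetitive) deltas
    return {
        dev: [ts_list[0]] + [b - a for a, b in zip(ts_list, ts_list[1:])]
        for dev, ts_list in segments.items()
    }

rows = [(1, 100), (2, 100), (1, 160), (2, 160), (1, 220)]
segment_and_delta_encode(rows)  # → {1: [100, 60, 60], 2: [100, 60]}
```

Repetitive deltas like `60, 60, 60, …` compress far better than the interleaved raw timestamps would.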
### Querying
```sql
-- Full SQL compatibility: standard aggregates plus time_bucket()
SELECT
    time_bucket('5 minutes', time) AS bucket,
    device_id,
    AVG(temperature) AS avg_temp,
    MAX(temperature) AS max_temp,
    MIN(temperature) AS min_temp
FROM sensor_data
WHERE
    time > NOW() - INTERVAL '24 hours'
    AND device_id = 42
GROUP BY bucket, device_id
ORDER BY bucket DESC;
```
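`time_bucket` truncates each timestamp to the start of its bucket. The same semantics in Python, as a sketch for epoch-second timestamps (ignoring `time_bucket`'s origin/offset options):

```python
def time_bucket(width_seconds, epoch_ts):
    """Truncate an epoch timestamp to the start of its fixed-width bucket."""
    return epoch_ts - epoch_ts % width_seconds

time_bucket(300, 1715385923)  # → 1715385900, the start of its 5-minute bucket
```

Because every row in a bucket maps to the same truncated value, `GROUP BY bucket` collapses each interval to one output row.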
```sql
-- Continuous aggregate (pre-computed rollup)
CREATE MATERIALIZED VIEW sensor_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    device_id,
    AVG(temperature) AS avg_temperature
FROM sensor_data
GROUP BY bucket, device_id;

-- Real-time aggregation (includes recent, not-yet-materialized data)
SELECT * FROM sensor_hourly
WHERE bucket > NOW() - INTERVAL '7 days'
ORDER BY bucket DESC
LIMIT 100;
```
### When to Use TimescaleDB

TimescaleDB is the best fit when your time-series data needs relational context: full SQL joins against business tables, an existing PostgreSQL deployment, or the broader PostgreSQL ecosystem of tooling and extensions.
## ClickHouse
ClickHouse is a columnar OLAP database optimized for real-time analytics on large datasets.
### Schema Design
```sql
CREATE TABLE sensor_data (
    timestamp   DateTime64(3),
    device_id   UInt32,
    temperature Float32,
    humidity    Float32,
    pressure    Float32,
    location    String
) ENGINE = MergeTree()
ORDER BY (device_id, timestamp)
TTL timestamp + INTERVAL 90 DAY DELETE;

-- Distributed version (cluster name and shard key are deployment-specific)
CREATE TABLE sensor_data_distributed AS sensor_data
ENGINE = Distributed('cluster', 'default', 'sensor_data', rand());
```
### Querying
```sql
-- ClickHouse's strength: fast aggregations on massive datasets
SELECT
    toStartOfMinute(timestamp) AS minute,
    device_id,
    avg(temperature) AS avg_temp,
    quantile(0.95)(temperature) AS p95_temp,
    quantile(0.99)(temperature) AS p99_temp
FROM sensor_data
WHERE timestamp > now() - INTERVAL 7 DAY
GROUP BY minute, device_id
ORDER BY minute DESC
LIMIT 100;
```
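`quantile(0.95)` returns a value such that roughly 95% of rows fall at or below it. ClickHouse's default `quantile` uses sampling with interpolation, but the simpler nearest-rank definition conveys the idea (a sketch, not ClickHouse's algorithm):

```python
import math

def quantile_nearest_rank(values, q):
    """Nearest-rank quantile: the smallest value with >= q*n values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

quantile_nearest_rank(range(1, 101), 0.95)  # → 95
```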
```sql
-- Using materialized views for pre-aggregation;
-- AggregatingMergeTree stores partial aggregate states, not final values
CREATE MATERIALIZED VIEW sensor_stats_hourly
ENGINE = AggregatingMergeTree()
ORDER BY (device_id, hour)
AS SELECT
    device_id,
    toStartOfHour(timestamp) AS hour,
    avgState(temperature) AS avg_temp,
    maxState(temperature) AS max_temp
FROM sensor_data
GROUP BY device_id, hour;

-- Read the view with the matching -Merge combinators
SELECT
    device_id,
    hour,
    avgMerge(avg_temp) AS avg_temp,
    maxMerge(max_temp) AS max_temp
FROM sensor_stats_hourly
GROUP BY device_id, hour;
```
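The idea behind `avgState`/`avgMerge` is that an average kept as a `(sum, count)` pair can be merged across data parts, whereas finalized averages cannot. A plain-Python sketch of that mergeable-state pattern:

```python
def avg_state(values):
    """Partial aggregate state for an average: (sum, count)."""
    return (sum(values), len(values))

def avg_merge(*states):
    """Combine partial states from separate parts, then finalize."""
    total = sum(s for s, _ in states)
    count = sum(c for _, c in states)
    return total / count

part1 = avg_state([10.0, 20.0])  # state from one data part
part2 = avg_state([30.0])        # state from another part
avg_merge(part1, part2)          # → 20.0, the true average of all five... of all three values
```

Note that naively averaging the two partial averages (15.0 and 30.0) would give 22.5, which is wrong; keeping the state mergeable is what makes incremental pre-aggregation correct.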
## Performance Comparison (1B rows)

Indicative figures only; real-world results vary widely with hardware, schema design, and tag cardinality.
| Benchmark | InfluxDB | TimescaleDB | ClickHouse |
|-----------|----------|-------------|------------|
| Write speed (rows/s) | 1.2M | 850K | 1.5M |
| Storage (compressed) | 35 GB | 65 GB | 22 GB |
| Last 24h range query | 210ms | 380ms | 120ms |
| Aggregation (1 week) | 1.2s | 2.1s | 0.4s |
| Memory usage | 2 GB | 4 GB | 6 GB |
## Choosing the Right Database
| Use Case | Recommended |
|----------|-------------|
| Infrastructure monitoring (Prometheus compatible) | InfluxDB |
| IoT sensor data with relational context | TimescaleDB |
| Real-time analytics on billions of rows | ClickHouse |
| Simple metrics and events | InfluxDB |
| Full SQL + time-series hybrid | TimescaleDB |
| Log analytics and observability | ClickHouse |
## Summary
Time-series databases optimize for append-heavy, time-ordered data with high compression and fast range queries. Choose InfluxDB for a purpose-built time-series store with minimal operational overhead, TimescaleDB when you need full SQL compatibility and relational joins alongside your time-series data, and ClickHouse for real-time analytics on massive datasets. All three support retention policies and pre-computed rollups (continuous queries, continuous aggregates, and materialized views, respectively) to manage data at scale.