Advanced features¶

Clustering, continuous queries, TLS, tracing, and optional columnar ingest. See Authentication for credential setup.

Clustering¶

HyperbyteDB supports master-master replication where every node in the cluster accepts both reads and writes. Data writes are replicated asynchronously to all peers. Schema mutations (CREATE/DROP DATABASE, DELETE, etc.) use Raft consensus for consistent ordering.

How it works¶

A client writes to any node.
That node persists to its local WAL and returns 204 to the client.
The node fans out the write via HTTP to all peers.
A header (X-Hyperbytedb-Replicated: true) prevents infinite replication loops.

Configuration¶

Each node needs a unique node_id and must list all other nodes as peers:

Node 1:

[cluster]
enabled = true
node_id = 1
cluster_addr = "node1.example.com:8086"
peers = "node2.example.com:8086,node3.example.com:8086"

Node 2:

[cluster]
enabled = true
node_id = 2
cluster_addr = "node2.example.com:8086"
peers = "node1.example.com:8086,node3.example.com:8086"

No initialization step is required. Nodes begin replicating as soon as they start.

Cluster endpoints¶

Endpoint	Method	Description
`/cluster/metrics`	GET	Cluster mode, node address, peer list
`/cluster/nodes`	GET	All nodes with a `self` flag

Important considerations¶

Replication is asynchronous and best-effort. If a peer is down, the write succeeds locally but the peer misses it until anti-entropy or hinted handoff delivers it.
There is no distributed query fan-out. Each node queries its own embedded chDB tables.
Schema mutations (CREATE DATABASE, DROP DATABASE, DELETE, etc.) are replicated via Raft for consistent ordering.
Hinted handoff stores writes for unreachable peers and replays them on reconnection.

Continuous Queries¶

Continuous queries (CQs) automatically downsample data on a schedule, writing aggregated results to a target measurement.

Create a continuous query¶

CREATE CONTINUOUS QUERY "cq_cpu_1h" ON "mydb"
BEGIN
  SELECT mean("usage_idle") AS "mean_usage_idle"
  INTO "cpu_1h"
  FROM "cpu"
  GROUP BY time(1h), *
END

With resample control¶

CREATE CONTINUOUS QUERY "cq_cpu_1h" ON "mydb"
RESAMPLE EVERY 30m FOR 2h
BEGIN
  SELECT mean("usage_idle") INTO "cpu_1h" FROM "cpu" GROUP BY time(1h), *
END

RESAMPLE EVERY 30m — execute every 30 minutes (instead of every resample_every_secs default).
FOR 2h — look back 2 hours on each execution.

Manage continuous queries¶

SHOW CONTINUOUS QUERIES
DROP CONTINUOUS QUERY "cq_cpu_1h" ON "mydb"

CQ definitions are stored in metadata and survive restarts. The background scheduler evaluates CQs every 10 seconds.

Materialized Views¶

Materialized views downsample data incrementally using ClickHouse MATERIALIZED VIEW objects. Each new flush to the source measurement triggers aggregation into the destination measurement — no background scheduler.

Create a materialized view¶

CREATE MATERIALIZED VIEW "mv_cpu_1h" ON "mydb"
AS SELECT mean("usage_idle") INTO "cpu_1h" FROM "cpu" GROUP BY time(1h), *

Requirements (same as SELECT INTO / continuous queries):

INTO destination measurement is required
GROUP BY time(<interval>) is required
Single concrete source measurement (no regex FROM in v1)

On CREATE, HyperbyteDB:

Registers the destination measurement in metadata
Creates destination MergeTree tables in chDB
Installs fact and series ClickHouse materialized views
Backfills historical data from the source

Manage materialized views¶

SHOW MATERIALIZED VIEWS
DROP MATERIALIZED VIEW "mv_cpu_1h" ON "mydb"

DROP MATERIALIZED VIEW removes the ClickHouse MV objects and metadata. The destination measurement and its data are kept.

Continuous queries vs materialized views¶

	Continuous Query	Materialized View
Trigger	10s scheduler	Each flush to source
Latency	Up to resample interval	Near real-time
Backfill	Re-scans window each run	One-time on CREATE; then incremental
Engine	WAL writeback	ClickHouse MV

User authentication¶

Enable [auth] enabled = true to require credentials on /write and /query. How credentials are sent, which HTTP paths stay public, and when admin is required for cluster/internal APIs are all covered in Authentication.

TLS / HTTPS¶

Enable encrypted connections:

[server]
tls_enabled = true
tls_cert_path = "/etc/hyperbytedb/cert.pem"
tls_key_path = "/etc/hyperbytedb/key.pem"

HyperbyteDB uses rustls (no OpenSSL dependency). Both files must be PEM-encoded. When TLS is enabled, the server only accepts HTTPS connections — there is no mixed HTTP+HTTPS mode.

Generate a self-signed certificate (testing only)¶

openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
  -days 365 -nodes -subj '/CN=hyperbytedb'

Columnar MessagePack Ingest¶

Columnar MessagePack ingest is enabled in default builds (columnar-ingest feature). It accepts columnar data encoded as MessagePack, reducing parse overhead compared to line protocol for bulk imports.

Disable with cargo build --no-default-features if you need a slimmer binary.

This adds support for Content-Type: application/x-msgpack on the /write endpoint.

Observability (logs and traces)¶

[logging]
level = "info"
format = "json"
detailed_trace = true
otlp_endpoint = "http://alloy:4318"
otlp_sample_ratio = 0.1

format = "json" — structured logs for Loki and similar collectors.
detailed_trace = true — per-phase spans on write, query, and flush paths.
otlp_endpoint — export traces to Tempo (or any OTLP HTTP collector).

The Docker Compose stack enables these settings by default. See Administration for Grafana workflows.

Statement summary¶

[statement_summary]
enabled = true
max_entries = 1000

curl -sS 'http://localhost:8086/api/v1/statements'

Each entry includes the normalized query text, digest, execution time, and error status.