Core Modules

A module-by-module guide to the HyperbyteDB source code under src/.


Root Files

| File | Purpose |
| --- | --- |
| main.rs | CLI entry point (clap). Subcommands: serve, backup, restore. Initializes tracing, calls build_services, starts the Axum server, spawns background tasks, handles graceful shutdown. |
| lib.rs | Library root. Re-exports: adapters, application, bootstrap, cluster, config, domain, error, influxql, ports. |
| bootstrap.rs | Composition root. build_services() wires WAL, metadata, storage, chDB, auth, cluster, and all services into AppState. Returns BootstrappedApp. |
| config.rs | HyperbytedbConfig and nested structs. Loaded via Figment (TOML + HYPERBYTEDB__ env vars). Sections: server, storage (including S3), flush, compaction, chdb, auth, cardinality, cluster (including cluster.replication and Raft tunables), logging, statement_summary, hinted_handoff, rate_limit. |
| error.rs | HyperbytedbError enum with variants for every error category (domain, WAL, storage, chDB, metadata, auth, cardinality, cluster, replication, internal). From impls for common error types. |
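
Environment overrides follow the double-underscore nesting convention Figment uses. A minimal sketch of that mapping, assuming the HYPERBYTEDB__ prefix described above; split_env_key is an illustrative helper, not a function in the codebase:

```rust
// Illustrative sketch (not HyperbyteDB code): map a HYPERBYTEDB__-prefixed
// environment variable name onto a nested config key path, following the
// double-underscore convention Figment applies to env providers.

/// "HYPERBYTEDB__CLUSTER__REPLICATION__FACTOR" -> ["cluster", "replication", "factor"]
fn split_env_key(var: &str) -> Option<Vec<String>> {
    let rest = var.strip_prefix("HYPERBYTEDB__")?;
    Some(rest.split("__").map(|s| s.to_ascii_lowercase()).collect())
}

fn main() {
    let path = split_env_key("HYPERBYTEDB__CLUSTER__REPLICATION__FACTOR").unwrap();
    println!("{}", path.join("."));
}
```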

domain/ — Core Data Model

Pure types with no I/O dependencies.

| File | Key Types | Description |
| --- | --- | --- |
| point.rs | Point, FieldValue | The core data point: measurement, tags (sorted BTreeMap), fields (BTreeMap), timestamp (nanoseconds). FieldValue has five variants: Float(f64), Integer(i64), UInteger(u64), String, Boolean. |
| series.rs | SeriesKey | Measurement + sorted tags, forming a canonical series identity. |
| database.rs | Database, RetentionPolicy, Precision | Database definition with retention policies. Precision enum for timestamp interpretation. |
| measurement.rs | MeasurementMeta | Per-measurement schema: field names → type discriminants, tag key list. |
| wal.rs | WalEntry | WAL record: database, retention_policy, points, origin_node_id. |
| query_result.rs | QueryResponse, StatementResult, SeriesResult | InfluxDB v1-compatible JSON response structure. |
| storage_layout.rs | StorageLayout | Path helpers for Parquet file layout: {db}/{rp}/{meas}/{YYYY-MM-DD}/{HH}.parquet. Glob pattern generation for chDB. |
| user.rs | StoredUser | Username, password hash (Argon2 PHC format), admin flag. |
| continuous_query.rs | ContinuousQueryDef | CQ metadata: name, database, query text, resample intervals, creation timestamp. |
| parquet_columns.rs | TAG_COL_PREFIX, ParquetColumnMapping | Tag/field name collision handling for Parquet/ClickHouse column names. |
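
Because tags live in a sorted BTreeMap, two points whose tags arrive in different order still resolve to the same series. A minimal sketch of that canonical-identity idea; the canonical() encoding here is illustrative, not the project's actual serialization:

```rust
use std::collections::BTreeMap;

// Illustrative sketch (not HyperbyteDB code): measurement plus sorted tags
// form a canonical series identity. BTreeMap iteration order is sorted by
// key, so insertion order never affects the result.

#[derive(Debug, PartialEq, Eq, Hash, Clone)]
struct SeriesKey {
    measurement: String,
    tags: BTreeMap<String, String>,
}

impl SeriesKey {
    /// Flatten to "measurement,k1=v1,k2=v2" with tag keys in sorted order.
    fn canonical(&self) -> String {
        let mut out = self.measurement.clone();
        for (k, v) in &self.tags {
            out.push(',');
            out.push_str(k);
            out.push('=');
            out.push_str(v);
        }
        out
    }
}

fn main() {
    let mut tags = BTreeMap::new();
    tags.insert("region".to_string(), "eu".to_string());
    tags.insert("host".to_string(), "a1".to_string());
    let key = SeriesKey { measurement: "cpu".to_string(), tags };
    // "host" precedes "region" regardless of insertion order.
    println!("{}", key.canonical());
}
```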

ports/ — Trait Boundaries

Hexagonal architecture interfaces. Each trait is Send + Sync with #[async_trait].

| File | Trait | Key Methods |
| --- | --- | --- |
| ingestion.rs | IngestionPort | ingest(db, rp, precision, body, format) |
| query.rs | QueryPort, QueryService | execute_sql(sql), execute_query(db, query, epoch) |
| storage.rs | StoragePort, ParquetStreamSource | write_parquet, read_parquet, list_parquet_files, delete_parquet, prepare_parquet_stream |
| wal.rs | WalPort | append, read_from, read_range, truncate_before, last_sequence |
| metadata.rs | MetadataPort | CRUD for databases, measurements, tags, fields, the Parquet registry, users, tombstones, and CQs. metadata_generation() for cache invalidation. |
| auth.rs | AuthPort | authenticate(username, password) |
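
The value of these trait boundaries is that application code only ever sees the port, never the adapter behind it. A minimal sketch of the pattern: the real ports are #[async_trait] and Send + Sync, so this synchronous stand-in (with an invented FixedAuth adapter) only shows the shape:

```rust
// Illustrative sketch (not HyperbyteDB code): the hexagonal-port pattern.
// The application layer depends on a trait; adapters implement it.

trait AuthPort {
    fn authenticate(&self, username: &str, password: &str) -> bool;
}

/// A toy adapter holding one fixed credential pair.
struct FixedAuth {
    user: String,
    pass: String,
}

impl AuthPort for FixedAuth {
    fn authenticate(&self, username: &str, password: &str) -> bool {
        username == self.user && password == self.pass
    }
}

/// Application code sees only the trait object, never the concrete adapter.
fn login(auth: &dyn AuthPort, u: &str, p: &str) -> bool {
    auth.authenticate(u, p)
}

fn main() {
    let auth = FixedAuth { user: "admin".into(), pass: "secret".into() };
    println!("{}", login(&auth, "admin", "secret"));
}
```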

application/ — Business Logic

| File | Key Types | Description |
| --- | --- | --- |
| ingestion_service.rs | IngestionServiceImpl | Single-node ingest: parse line protocol → validate metadata → check cardinality → WAL append. |
| peer_ingestion_service.rs | PeerIngestionService | Cluster ingest: wraps the base service, adds PeerClient::replicate_write fan-out with origin_node_id. |
| line_protocol.rs | parse_line_body_to_points, encode_points_to_line_protocol | Line protocol parsing/encoding helpers. |
| msgpack_ingest.rs | parse_msgpack_body_to_points | MessagePack array-of-points ingest. |
| columnar_msgpack.rs | ColumnarMsgpackBatch | Feature-gated columnar MessagePack ingest for high-throughput bulk imports. |
| ingest_metadata.rs | IngestCardinalityLimits, IngestSchemaCache | Hot-path schema/tag caching and cardinality enforcement. |
| query_service.rs | QueryServiceImpl | Full InfluxQL execution: parse → dispatch (SHOW/DDL/SELECT) → translate → chDB execute → format results. Handles regex measurements, subqueries, chunked execution, tombstone injection, SLIMIT/SOFFSET. |
| peer_query_service.rs | PeerQueryService | Cluster query: wraps QueryService, routes schema mutations through Raft or peer broadcast. |
| flush_service.rs | FlushServiceImpl | Background WAL → Parquet flush. Groups by (db, rp, meas), partitions by hour, memory-aware batching, parallel CPU (spawn_blocking) + I/O (tokio::spawn). Cluster-aware WAL truncation. |
| compaction_service.rs | CompactionService | Periodic merge of small Parquet files. Orchestrates time-bucket grouping, delegates to compaction_merge, handles output registration/cleanup, orphan cleanup, compact_all with bounded concurrency. In cluster mode: verified compaction + membership self-repair. |
| compaction_merge.rs | merge_parquet_files | Arrow-only Parquet merge. Default: streaming k-way merge by time (min-heap). Fallback: full concat + sort. Schema unification with type widening. Size-based output splitting. |
| retention_service.rs | RetentionService | Deletes Parquet files past the retention policy duration. Runs every 60 seconds. |
| continuous_query_service.rs | ContinuousQueryService | Background CQ scheduler. Evaluates all CQs every 10 seconds, executes due queries via QueryService. |
| replication_apply.rs | ReplicationApplyQueue | Bounded parallel apply of replicated line protocol to the WAL. Returns 503 when full. |
| statement_summary.rs | StatementSummary | Ring buffer tracking recently executed statements with digest, latency, error status. |
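
The streaming k-way merge in compaction_merge.rs can be sketched with a min-heap: each time-sorted input contributes one candidate row at a time, so memory stays proportional to the number of inputs rather than their total size. A simplified stand-in, assuming plain (timestamp, value) rows instead of Arrow record batches:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Illustrative sketch (not HyperbyteDB code): streaming k-way merge by time.
// BinaryHeap is a max-heap, so Reverse turns it into a min-heap on timestamp.

fn k_way_merge(inputs: Vec<Vec<(i64, &'static str)>>) -> Vec<(i64, &'static str)> {
    let mut heap = BinaryHeap::new();
    let mut cursors: Vec<usize> = vec![0; inputs.len()];
    // Seed the heap with the first row of each input.
    for (i, input) in inputs.iter().enumerate() {
        if let Some(&row) = input.first() {
            heap.push(Reverse((row.0, i, row.1)));
        }
    }
    let mut out = Vec::new();
    // Pop the globally smallest timestamp, then refill from that input.
    while let Some(Reverse((ts, i, v))) = heap.pop() {
        out.push((ts, v));
        cursors[i] += 1;
        if let Some(&row) = inputs[i].get(cursors[i]) {
            heap.push(Reverse((row.0, i, row.1)));
        }
    }
    out
}

fn main() {
    let merged = k_way_merge(vec![
        vec![(1, "a"), (4, "d")],
        vec![(2, "b"), (3, "c")],
    ]);
    println!("{:?}", merged); // rows come out globally sorted by timestamp
}
```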

adapters/ — Infrastructure Implementations

adapters/http/

| File | Description |
| --- | --- |
| router.rs | AppState (all ports + cluster + raft + replication queue), build_router() with all routes. |
| write.rs | POST /write handler. Cluster state check, gzip decompression, content-type detection. |
| query.rs | GET/POST /query handler. Parameter merging, bind param substitution, statement summary tracking, streaming/chunked response. |
| ping.rs | /ping (liveness), /health (readiness). Cluster-aware state reporting. |
| metrics.rs | /metrics Prometheus scrape endpoint. |
| auth_middleware.rs | Credential extraction (query params → Basic → Token). |
| middleware.rs | Version and Request-Id response headers. |
| error.rs | HyperbytedbError → HTTP status code mapping. |
| response.rs | InfluxDB v1-compatible JSON error formatting. |
| statements.rs | GET /api/v1/statements for the statement summary. |
| cluster.rs | /cluster/metrics, /cluster/nodes. |
| peer_handlers.rs | Internal cluster endpoints: /internal/replicate, /internal/replicate-mutation, /internal/membership/*, /internal/sync/*, /internal/bucket-hash, /internal/drain. |
| raft_handlers.rs | Raft HTTP transport: /internal/raft/vote, /internal/raft/append, /internal/raft/snapshot, /internal/raft/membership. |

adapters/chdb/

| File | Description |
| --- | --- |
| query_adapter.rs | ChdbQueryAdapter — single lazy Session wrapped in Arc<Mutex>. Executes SQL via spawn_blocking. Graceful handling of missing-file errors. |
| pool.rs | ChdbPool — round-robin session pool with an optional query semaphore. Each slot has its own data directory. |
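
Round-robin slot selection can be done lock-free with an atomic counter, so concurrent callers spread across sessions without serializing on a mutex. A minimal sketch under that assumption; the slots here are plain strings standing in for chDB sessions, and the Pool/checkout names are illustrative:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative sketch (not HyperbyteDB code): lock-free round-robin selection
// over a fixed set of pool slots.

struct Pool {
    slots: Vec<String>,
    next: AtomicUsize,
}

impl Pool {
    /// Pick the next slot; fetch_add wraps on overflow, and the modulo
    /// keeps the index in range either way.
    fn checkout(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.slots.len();
        &self.slots[i]
    }
}

fn main() {
    let pool = Pool {
        slots: vec!["s0".into(), "s1".into(), "s2".into()],
        next: AtomicUsize::new(0),
    };
    // Four checkouts cycle s0, s1, s2, then back to s0.
    println!("{} {} {} {}", pool.checkout(), pool.checkout(), pool.checkout(), pool.checkout());
}
```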

adapters/storage/

| File | Description |
| --- | --- |
| parquet_writer.rs | build_schema, points_to_record_batch, write_parquet, points_to_parquet. ZSTD compression, 65K row groups, page-level statistics. |
| local.rs | LocalStorage — StoragePort over the local filesystem with tokio::fs. |
| s3.rs | S3Storage — StoragePort over object_store AmazonS3. |
| layout.rs | Re-exports StorageLayout from domain. |
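
The {db}/{rp}/{meas}/{YYYY-MM-DD}/{HH}.parquet layout derives its day and hour partitions from the point's nanosecond timestamp. A sketch of that derivation, assuming UTC; civil_from_days is the standard days-to-civil-date conversion, and parquet_path is an illustrative helper rather than the actual StorageLayout API:

```rust
// Illustrative sketch (not HyperbyteDB code): nanosecond timestamp ->
// day/hour-partitioned Parquet path.

/// Howard Hinnant's algorithm: days since 1970-01-01 -> (year, month, day).
fn civil_from_days(z: i64) -> (i64, u32, u32) {
    let z = z + 719_468;
    let era = if z >= 0 { z } else { z - 146_096 } / 146_097;
    let doe = z - era * 146_097;
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let y = yoe + era * 400;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    let mp = (5 * doy + 2) / 153;
    let d = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let m = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    (if m <= 2 { y + 1 } else { y }, m, d)
}

fn parquet_path(db: &str, rp: &str, meas: &str, ts_ns: i64) -> String {
    let secs = ts_ns.div_euclid(1_000_000_000);
    let days = secs.div_euclid(86_400);
    let hour = secs.rem_euclid(86_400) / 3_600;
    let (y, m, d) = civil_from_days(days);
    format!("{db}/{rp}/{meas}/{y:04}-{m:02}-{d:02}/{hour:02}.parquet")
}

fn main() {
    // 2024-01-02T03:00:00Z expressed in nanoseconds.
    println!("{}", parquet_path("mydb", "autogen", "cpu", 1_704_164_400_000_000_000));
}
```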

adapters/wal/

| File | Description |
| --- | --- |
| rocksdb_wal.rs | RocksDbWal — bincode serialization, atomic sequence counter, big-endian keys, legacy format migration. |
| batching_wal.rs | BatchingWal — channel-based group-commit wrapper over any WalPort. |
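
The big-endian keys matter because RocksDB orders keys lexicographically by byte: u64::to_be_bytes makes byte order coincide with numeric order, so range scans like read_from walk sequence numbers in order. Little-endian encoding would not. A minimal demonstration:

```rust
// Illustrative sketch (not HyperbyteDB code): why WAL sequence keys are
// stored big-endian. Lexicographic byte comparison of a big-endian u64
// matches numeric comparison; little-endian does not.

fn wal_key(seq: u64) -> [u8; 8] {
    seq.to_be_bytes()
}

fn main() {
    // 255 < 256 numerically, and the big-endian keys sort the same way...
    println!("{}", wal_key(255) < wal_key(256));
    // ...while the little-endian encodings sort backwards for this pair.
    println!("{}", 255u64.to_le_bytes() < 256u64.to_le_bytes());
}
```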

adapters/metadata/

| File | Description |
| --- | --- |
| rocksdb_meta.rs | RocksDbMetadata — JSON-serialized values, in-memory caches (Parquet registry with generation counter, users, tombstones, CQ list). Key patterns: db:*, meas:*:*, tag_val:*:*:*:*, pq:*:*:*:*, user:*, tombstone:*:*:*, cq:*:*. |
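
The generation counter behind the Parquet-registry cache works by letting readers snapshot a version number instead of diffing cached contents: any mutation bumps the counter, and a mismatched generation means the snapshot is stale. A minimal sketch of that idea; the Registry type and method names are illustrative:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Illustrative sketch (not HyperbyteDB code): generation-counter cache
// invalidation. Readers cache (data, generation); a cheap integer compare
// later tells them whether the cached copy is still valid.

struct Registry {
    generation: AtomicU64,
}

impl Registry {
    /// Apply a metadata change, then publish a new generation.
    fn mutate(&self) {
        self.generation.fetch_add(1, Ordering::Release);
    }

    fn generation(&self) -> u64 {
        self.generation.load(Ordering::Acquire)
    }
}

fn main() {
    let reg = Registry { generation: AtomicU64::new(0) };
    let cached = reg.generation(); // a reader snapshots the generation
    reg.mutate();                  // any write invalidates all snapshots
    println!("{}", cached == reg.generation()); // stale cache detected
}
```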

adapters/auth.rs

MetadataAuthAdapter — AuthPort via MetadataPort + Argon2 verification with a short TTL cache.


influxql/ — InfluxQL Engine

| File | Description |
| --- | --- |
| ast.rs | Full AST: Statement enum, SelectStatement, Expr (recursive expression tree), FunctionCall, GroupBy, FillOption, Measurement, Duration, etc. |
| parser.rs | Hand-rolled recursive-descent parser with a precedence-climbing expression parser. Dispatches on the first keyword: SELECT, SHOW, CREATE, DROP, DELETE, ALTER, SET, GRANT, REVOKE. |
| to_clickhouse.rs | AST → ClickHouse SQL. Handles aggregate/transform function mapping, the file() table function, time buckets (toStartOfInterval), fill modes (WITH FILL / INTERPOLATE), regex (match()), tombstone injection. |
| digest.rs | stmt_type, fingerprint — normalized query text + SHA-256 for statement summary deduplication. |
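
Precedence climbing, the technique named for the expression parser, gives each binary operator a binding power; a recursive helper consumes only operators at or above the minimum power it was handed, which is what makes 2 + 3 * 4 group as 2 + (3 * 4). A toy version over integers with + and * only (the real parser covers the full InfluxQL Expr tree):

```rust
// Illustrative sketch (not HyperbyteDB code): precedence climbing over a
// token slice, with * binding tighter than +.

#[derive(Clone, Copy)]
enum Tok { Num(i64), Plus, Star }

fn prec(t: Tok) -> Option<u8> {
    match t { Tok::Plus => Some(1), Tok::Star => Some(2), _ => None }
}

fn parse(toks: &[Tok], pos: &mut usize, min_prec: u8) -> i64 {
    // Parse one operand, then fold in operators of sufficient precedence.
    let mut lhs = match toks[*pos] {
        Tok::Num(n) => { *pos += 1; n }
        _ => panic!("expected number"),
    };
    while *pos < toks.len() {
        let op = toks[*pos];
        let p = match prec(op) { Some(p) if p >= min_prec => p, _ => break };
        *pos += 1;
        // The right operand binds tighter: recurse with precedence p + 1.
        let rhs = parse(toks, pos, p + 1);
        lhs = match op { Tok::Plus => lhs + rhs, Tok::Star => lhs * rhs, _ => unreachable!() };
    }
    lhs
}

fn main() {
    // 2 + 3 * 4 groups as 2 + (3 * 4) = 14.
    let toks = [Tok::Num(2), Tok::Plus, Tok::Num(3), Tok::Star, Tok::Num(4)];
    println!("{}", parse(&toks, &mut 0, 0));
}
```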

cluster/ — Clustering Subsystem

| File | Key Types | Description |
| --- | --- | --- |
| bootstrap.rs | ClusterBootstrap | Init replication log, membership, peer client. start_raft opens RaftStore + initializes the cluster. run_startup_sync handles join/reconnect. |
| membership.rs | NodeState, NodeInfo, ClusterMembership, SharedMembership | Node state machine (Joining → Syncing → Active → Disconnected/Draining → Leaving). Thread-safe shared membership via Arc<RwLock>. |
| peer_client.rs | PeerClient, OutboundReplicationBatch | Outbound replication: bounded queue, coalescing, fan-out POST, WAL ack tracking, hinted handoff on failure. |
| replication_log.rs | ReplicationLog | RocksDB-backed per-peer WAL/mutation ack tracking. Safe truncation via min_wal_ack(). |
| replication_wire.rs | Content-type constants, ReplicationHintPayload | Wire format for replication and the hinted-handoff binary envelope. |
| hinted_handoff.rs | HintedHandoff | Per-peer queued hints in RocksDB for unreachable peers. |
| sync.rs | SyncManifest, ParquetFileManifest, VerifyRequest/Response | Cross-node manifest and verify types. origin_node_id for provenance. |
| sync_client.rs | SyncClient | Join + full sync, reconnect sync (gap analysis: 0 / small / large), delta sync, WAL catch-up. |
| heartbeat.rs | run_heartbeat_updater | /ping probes → membership heartbeat updates. |
| merkle.rs | MerkleTreeSummary, MerkleBucket, MerkleCache | Anti-entropy trees over time buckets. SHA-256 hashing. Generation-gated cache. |
| data_hash.rs | BucketHashResponse, compute_bucket_hash | Canonical content hash for verified compaction. |
| drain.rs | DrainService | Graceful drain: flush WAL, wait for acks, Merkle verify, notify peers, set Leaving. |
| types.rs | MutationRequest, MutationReplicateRequest/Response | Schema mutation wire model. |
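
The node state machine can be sketched as an enum plus an allowed-transition predicate. The transition set below is inferred from the arrows in the membership.rs description (plus a reconnect edge back through Syncing) and is illustrative only, not the codebase's actual rules:

```rust
// Illustrative sketch (not HyperbyteDB code): a NodeState machine with an
// explicit allowed-transition table, inferred from
// Joining -> Syncing -> Active -> Disconnected/Draining -> Leaving.

#[derive(Clone, Copy)]
enum NodeState { Joining, Syncing, Active, Disconnected, Draining, Leaving }

fn can_transition(from: NodeState, to: NodeState) -> bool {
    use NodeState::*;
    matches!(
        (from, to),
        (Joining, Syncing)
            | (Syncing, Active)
            | (Active, Disconnected)
            | (Active, Draining)
            | (Disconnected, Syncing) // assumed: reconnect goes back through sync
            | (Draining, Leaving)
    )
}

fn main() {
    println!("{}", can_transition(NodeState::Joining, NodeState::Syncing));
    // A joining node cannot skip the sync phase.
    println!("{}", can_transition(NodeState::Joining, NodeState::Active));
}
```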

cluster/raft/

| File | Description |
| --- | --- |
| mod.rs | TypeConfig for OpenRaft, HyperbytedbRaft type alias. |
| types.rs | ClusterRequest (SetNodeState, SchemaMutation), ClusterResponse. |
| log_store.rs | RaftStore — RocksDB with meta, logs, and state column families. Implements the OpenRaft storage traits. |
| network.rs | Network — HTTP JSON RPC to /internal/raft/*. |
| state_machine.rs | StateMachineData — applies committed ClusterRequest entries to metadata and membership. |
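
The state machine's job is to apply committed ClusterRequest entries deterministically, so every node that replays the same log arrives at the same view. A sketch of that apply loop; the variant names mirror types.rs above, but the fields and the StateMachineData contents here are illustrative assumptions:

```rust
use std::collections::HashMap;

// Illustrative sketch (not HyperbyteDB code): applying committed cluster
// requests to replicated state. Determinism is the point: same log in,
// same state out, on every node.

#[derive(Clone)]
enum ClusterRequest {
    SetNodeState { node: String, state: String },
    SchemaMutation { statement: String },
}

#[derive(Default)]
struct StateMachineData {
    node_states: HashMap<String, String>,
    applied_mutations: Vec<String>,
}

impl StateMachineData {
    fn apply(&mut self, req: ClusterRequest) {
        match req {
            ClusterRequest::SetNodeState { node, state } => {
                self.node_states.insert(node, state);
            }
            ClusterRequest::SchemaMutation { statement } => {
                self.applied_mutations.push(statement);
            }
        }
    }
}

fn main() {
    let mut sm = StateMachineData::default();
    sm.apply(ClusterRequest::SetNodeState { node: "n1".into(), state: "Active".into() });
    sm.apply(ClusterRequest::SchemaMutation { statement: "CREATE DATABASE mydb".into() });
    println!("{} {}", sm.node_states["n1"], sm.applied_mutations.len());
}
```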

bin/ — Additional Binaries

| File | Description |
| --- | --- |
| hyperbytedb_debug.rs | Cluster inspection CLI. Subcommands: status, topology, health, replication, raft, manifest, metrics, diff, compact. Colored table output with replication-lag highlighting. |
| hyperbytedb_backfill.rs | InfluxDB 1.x → HyperbyteDB migration. Chunked SELECT from Influx, optional disk dump, POST /write to HyperbyteDB. Supports --from-dir for line protocol replay. |

See Also