Troubleshooting¶
Common problems and their solutions when running HyperbyteDB.
Startup Failures¶
libchdb.so: cannot open shared object file¶
Cause: libchdb is either not installed, or it is on disk (often under /usr/local/lib/) but the dynamic linker does not search that path. The binary then fails immediately at process start, sometimes right after cargo build / cargo run has linked successfully:
error while loading shared libraries: libchdb.so: cannot open shared object file: No such file or directory
Fix — install and refresh the cache:
Fix — library is installed but the loader still skips /usr/local/lib: if ls /usr/local/lib/libchdb.so succeeds, register that directory with the dynamic linker, then refresh the cache:
Verify: ls /usr/local/lib/libchdb.so and, optionally, ldconfig -p | grep chdb
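To tell "not installed" apart from "installed but not in the loader cache", it can help to first see which directories actually contain the library. The helper below is a minimal sketch; the function name `find_libchdb` and the directory list passed to it are illustrative, not part of HyperbyteDB.

```shell
# Print every candidate directory that contains libchdb.so.
# Usage: find_libchdb DIR...
find_libchdb() {
    for dir in "$@"; do
        if [ -e "$dir/libchdb.so" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

For example, `find_libchdb /usr/local/lib /usr/lib /usr/lib64` lists the directories that hold the file. If a directory is printed but `ldconfig -p | grep chdb` shows nothing, the loader cache does not cover that directory. On glibc-based systems it is usually registered by listing it in a file under /etc/ld.so.conf.d/ and then running `sudo ldconfig`.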
std::bad_function_call crash on startup¶
Symptom: HyperbyteDB aborts immediately with:
terminate called after throwing an instance of 'std::__1::bad_function_call'
what(): std::bad_function_call
Aborted
Cause: Incompatible system-installed libchdb.so. The crash happens during dynamic library loading, before any Rust code runs.
Fix (Option A — Recommended): Use the chdb-rust bundled library:
# Temporarily move the system libchdb
sudo mv /usr/local/lib/libchdb.so /usr/local/lib/libchdb.so.bak
sudo mv /usr/local/include/chdb.h /usr/local/include/chdb.h.bak
# Rebuild so chdb-rust downloads its own
cargo clean -p chdb-rust && cargo build --release
# Run with the bundled library (the ${VAR:+...} form avoids a trailing ':' when LD_LIBRARY_PATH is unset)
LIBCHDB_DIR=$(find target -name "libchdb.so" -path "*/build/chdb-rust-*/out/*" | head -1 | xargs dirname)
LD_LIBRARY_PATH="$LIBCHDB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ./target/release/hyperbytedb serve
# Optionally restore for other tools
sudo mv /usr/local/lib/libchdb.so.bak /usr/local/lib/libchdb.so
Fix (Option B): Reinstall libchdb from the latest release:
failed to open WAL¶
Cause: Corrupted WAL directory (e.g., unclean shutdown, disk error).
Fix: Restore from a backup, or delete the directory configured as wal_dir to start fresh. Any data still in the WAL that has not been flushed will be lost.
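A more cautious variant of "delete the wal_dir" is to move the directory aside so the damaged files remain available for inspection. The sketch below is one way to do that; the function name `quarantine_wal` is hypothetical, and it assumes the server recreates an empty WAL directory on the next start, as the "start fresh" advice implies.

```shell
# Move a suspect WAL directory aside instead of deleting it.
# Prints the new location on success. Stop HyperbyteDB before running this.
quarantine_wal() {
    wal_dir="${1%/}"                       # drop a trailing slash, if any
    if [ ! -d "$wal_dir" ]; then
        echo "not a directory: $wal_dir" >&2
        return 1
    fi
    backup="${wal_dir}.corrupt.$(date +%Y%m%d%H%M%S)"
    mv -- "$wal_dir" "$backup" && printf '%s\n' "$backup"
}
```

Call it with the path your configuration uses for wal_dir. The unflushed data is still lost from the server's point of view, but the quarantined directory can be removed later once it is no longer needed.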
address already in use¶
Cause: Another process is listening on the same port.
Fix: Change the port in config, or stop the conflicting process:
tls_cert_path ... not found¶
Cause: TLS is enabled but certificate files are missing.
Fix: Check the paths in your config, or disable TLS:
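Before changing the configuration, a quick check that the configured files exist and are readable can narrow the problem down. This is a minimal sketch: the function name `check_tls_files` is illustrative, and the paths you pass should be whatever your config sets for tls_cert_path and the matching key setting.

```shell
# Return 0 only if every given file exists, is readable, and is non-empty.
# Problems are reported on stderr, one line per file.
check_tls_files() {
    status=0
    for f in "$@"; do
        if [ ! -r "$f" ] || [ ! -s "$f" ]; then
            echo "missing, unreadable, or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

Run it as the same user that runs HyperbyteDB, since a certificate readable by root may not be readable by the service account.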
Writes Succeed but Queries Return Empty¶
Data must be flushed from the WAL into chDB MergeTree tables before it becomes queryable.
Checklist:
- Wait for flush. The default flush interval is 10 seconds. Wait at least that long after writing.
- Check logs for flush errors:
- Verify chDB data path is writable (configured via [chdb].session_data_path in config.toml):
- Verify the database exists:
- Check the measurement exists:
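For the writable-path item in the checklist above, a permission listing alone can mislead (read-only mounts, full disks), so creating a small file is a more direct probe. The helper below is a sketch under that assumption; the name `can_write_dir` and the probe filename are illustrative.

```shell
# Return 0 if DIR exists and a file can actually be created inside it.
can_write_dir() {
    probe="$1/.write_probe.$$"
    [ -d "$1" ] || return 1
    touch "$probe" 2>/dev/null || return 1
    rm -f "$probe"
}
```

Pass it the directory set in [chdb].session_data_path and run it as the user that runs HyperbyteDB, for example `can_write_dir /path/to/chdb/data && echo writable`.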
Query Timeouts¶
Symptom: Queries return HTTP 408 or take very long.
Fixes:
- Increase the timeout:
- Add a time range to your query. Queries without WHERE time > ... scan all data.
- Cap concurrent queries. chDB is a process-global singleton (one real session); chdb.pool_size is ignored. Tune server.max_concurrent_queries instead so heavy queries don't pile up on the single engine session.
- Narrow the time range in your query to reduce scanned data volume.
Cardinality Limit Errors¶
Symptom: Writes return HTTP 422 with cardinality limit exceeded.
Possible causes:
- High-cardinality values (UUIDs, timestamps, request IDs) used as tag values. These should be fields instead.
- Legitimate growth beyond the configured limits.
Fixes:
- Investigate the data model. Tags are indexed; use fields for high-cardinality values.
- Increase limits if the growth is expected:
Field Type Conflict¶
Symptom: Writes return HTTP 400 with field type conflict.
Cause: A field was previously registered with one type (e.g., float) and a new write sends a different type (e.g., integer) for the same field name.
Fix: Ensure all writes use the same type for each field. If the original type was wrong, you need to drop the measurement and recreate it:
Cluster Replication Issues¶
Writes not appearing on peer nodes¶
- Check connectivity: Ensure all nodes can reach each other on their cluster_addr and port.
- Check logs for replication errors:
- Verify peers configuration: The peers list should NOT include the node's own cluster_addr.
- Check node states:
  Nodes must be in Active state to accept replicated writes.
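The peers rule in the checklist above (a node must not list its own cluster_addr) is easy to get wrong when one config template is copied to every node. The sketch below checks it mechanically; the function name `peers_include_self` and the addresses in the usage example are hypothetical.

```shell
# Return 0 (true) if SELF also appears among the peer addresses.
# Usage: peers_include_self SELF PEER...
peers_include_self() {
    self="$1"
    shift
    for peer in "$@"; do
        [ "$peer" = "$self" ] && return 0
    done
    return 1
}
```

For example, `peers_include_self 10.0.0.1:8088 10.0.0.2:8088 10.0.0.3:8088 && echo "remove self from peers"` prints nothing for a correct configuration. The comparison is a plain string match, so a hostname and an IP address that refer to the same node will not be detected as equal.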
Persistent data gaps between nodes¶
Symptoms: One node shows fewer series or buckets than others for the same time range; /internal/sync/manifest responses differ between peers; metrics may show hyperbytedb_replication_lag_wal_seq increasing.
Checklist:
- Check peer reachability. Sync and replication only contact active members; fix connectivity or heartbeat issues first.
- Compare manifests across nodes:
- Verify all peers run the same HyperbyteDB version and compatible libchdb.so.
Reference: Deep Dive: Clustering.
Split-brain detection¶
Compare membership views across nodes:
curl -s http://node1:8086/cluster/metrics | jq '.membership'
curl -s http://node2:8086/cluster/metrics | jq '.membership'
curl -s http://node3:8086/cluster/metrics | jq '.membership'
If nodes have different membership views, check network partitions and ensure all peers can communicate.
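Comparing three JSON documents by eye is error-prone, so the comparison can be scripted once each view has been saved to a file. This is a minimal sketch: the function name `views_agree` is illustrative, and it assumes that healthy nodes report byte-identical membership output after key sorting, which may not hold if the output includes per-node fields or varies in array order.

```shell
# Return 0 if all given files have identical content, 1 otherwise.
# Usage: views_agree FILE FILE...
views_agree() {
    first="$1"
    shift
    for other in "$@"; do
        cmp -s "$first" "$other" || return 1
    done
    return 0
}
```

One way to produce the files is `curl -s http://node1:8086/cluster/metrics | jq -S '.membership' > node1.json` for each node (`-S` sorts object keys), followed by `views_agree node1.json node2.json node3.json || echo "membership views differ"`.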
High Memory Usage¶
- Reduce flush batch size:
- Tune WAL batching if ingest pressure is high:
See Also¶
- Configuration — All tuning parameters
- Administration — Monitoring and operational procedures