Setting up a ClickHouse DBMS Cluster in Docker with Replication Support

In this article
  • Setting up a ClickHouse DBMS Cluster in Docker with Replication Support
  • Starting Docker Services for the ClickHouse Cluster
  • Securing Inter-Node Communication

Configuring a ClickHouse cluster with replication support addresses the challenges of fault tolerance, scalability, and data integrity in a distributed analytical system. Replication enables data duplication across multiple nodes, ensuring data availability even if individual servers fail. Additionally, cluster mode allows load distribution among nodes, improving overall query performance and system resilience under peak loads.

Starting Docker Services for the ClickHouse Cluster

Important

To ensure data consistency in a ClickHouse cluster using replicated tables, the cluster must contain an odd number of nodes.

Before starting Docker services on each cluster node, perform the same setup steps as for non-clustered ClickHouse deployment.

To configure the ClickHouse cluster, specify the following environment variables when launching Docker services on each node:

  • CLUSTER_NODE_ID — unique sequential node ID
  • CLUSTER_NODES — list of all cluster nodes (must be identical on all nodes), provided as a JSON array:
    [{"id":1,"host":"node1.example.com","replica":"01-01"},{"id":2,"host":"node2.example.com","replica":"01-02"},{"id":3,"host":"node3.example.com","replica":"01-03"}]
    
    Explanation:
    • id — sequential node number in the cluster
    • host — hostname or IP address of the node
    • replica — macro indicating the replica number, in the format 01-<replica_number>
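Because CLUSTER_NODES must be identical on every node, a quick way to catch a mismatch before launching the services is to compare a checksum of the value on each node. A minimal sketch using the example topology above (hostnames are placeholders):

```shell
# CLUSTER_NODES must be byte-identical on all nodes; comparing a
# checksum of the value is a quick pre-launch sanity check.
# Hostnames below are placeholders from the example topology.
CLUSTER_NODES='[{"id":1,"host":"node1.example.com","replica":"01-01"},{"id":2,"host":"node2.example.com","replica":"01-02"},{"id":3,"host":"node3.example.com","replica":"01-03"}]'

# Run on each node; the printed hash must match everywhere.
printf '%s' "$CLUSTER_NODES" | sha256sum | cut -d' ' -f1
```

Any difference in the printed hash, including whitespace differences, means the variable is not identical across nodes.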

It is acceptable to use only two nodes in the cluster for data storage. The third node does not store data but is required to maintain quorum: it acts as an arbiter, allowing the cluster to preserve consistency and avoid split-brain scenarios while a node is temporarily unavailable.

Since replication synchronizes data across all participating nodes, specifying replica on all three nodes would cause redundant storage and slower writes: data would be replicated three times instead of twice. The baseline and recommended configuration is therefore two data nodes and one quorum-only node.

In this case, the replica field in CLUSTER_NODES should be specified only for data-storing nodes; it should be omitted for the arbiter. For example:

[{"id":1,"host":"node1.example.com","replica":"01-01"}, {"id":2,"host":"node2.example.com","replica":"01-02"}, {"id":3,"host":"node3.example.com"}]
Important

The third (arbiter) node can be deployed on any available server; it consumes minimal resources, as it does not store user data and only participates in cluster coordination (for example, monitoring metadata changes and helping restore consistency after transient failures). Do not specify this node in application ClickHouse connection settings; direct queries only to data-storing nodes.

Important

For the cluster to operate correctly, network connectivity between nodes must be ensured on ports 9000, 8123, 2181, and 2180.
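One way to verify this connectivity requirement is a small script that probes each port using bash's built-in /dev/tcp device, so no extra tools are needed on the host. A sketch (the hostname argument is a placeholder):

```shell
# Check that the ports required by the cluster (9000, 8123, 2181,
# 2180) are reachable on a given node. Uses bash's /dev/tcp device,
# so no external tools such as nc are required.
check_node_ports() {
  local host="$1" port
  for port in 9000 8123 2181 2180; do
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      echo "${host}:${port} open"
    else
      echo "${host}:${port} CLOSED"
    fi
  done
}

# Example: check_node_ports node1.example.com
```

Run the check from each node against every other node; all four ports must report open before starting the services.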

Securing Inter-Node Communication

To ensure secure communication between cluster nodes and prevent unauthorized access, it is recommended to configure a dedicated shared secret key.

The secret enables mutual authentication among cluster nodes. Without it, any server could attempt to join the cluster.

Only servers possessing the secret can participate in the cluster; therefore, the same secret key must be configured on all nodes.

To set the ClickHouse cluster secret, create a Docker secret using the following command:

echo -n "Secure_secret_123@" | docker secret create cluster_secret -

where Secure_secret_123@ is the configured secret value.

Note

A strong, complex secret value is recommended.
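For example, a random value can be generated with openssl and then registered as the Docker secret. A sketch; the generated value must be reused verbatim on every node:

```shell
# Generate a 24-byte random secret encoded as base64 (32 characters).
SECRET=$(openssl rand -base64 24)

# Record the value so the identical secret can be created on every node.
echo "$SECRET"

# Register it as the Docker secret on each node (uncomment to run):
# echo -n "$SECRET" | docker secret create cluster_secret -
```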

Example command for launching a Docker service on one cluster node:

docker service create --name operavix-clickhouse-node1 \
--secret operavix_app_user \
--secret operavix_app_user_password_hash \
--secret operavix_external_user \
--secret operavix_external_user_password_hash \
--secret operavix_clickhouse_dhparam.pem \
--secret operavix_clickhouse.crt \
--secret operavix_clickhouse.key \
--secret cluster_secret \
--publish published=8123,target=8123,mode=host \
--publish published=9000,target=9000,mode=host \
--publish published=2181,target=2181,mode=host \
--publish published=2180,target=2180,mode=host \
--mount type=volume,src=operavix-clickhouse,target=/var/lib/clickhouse/ \
--mount type=volume,src=operavix-clickhouse-log,target=/var/log/clickhouse-server \
-e CLUSTER_NODE_ID=1 \
-e CLUSTER_NODES='[{"id":1,"host":"node1.example.com","replica":"01-01"},{"id":2,"host":"node2.example.com","replica":"01-02"},{"id":3,"host":"node3.example.com","replica":"01-03"}]' \
--restart-max-attempts 5 \
--restart-condition "on-failure" \
operavix/operavix-clickhouse:25.3.2.39
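The launch command is identical on every node except for the service name and CLUSTER_NODE_ID, which makes it easy to template. A dry-run sketch that only prints the command; the "..." stands for the secrets, published ports, mounts, and restart flags from the full example above and is not filled in here:

```shell
# Build (but do not run) the service-create command for a given node.
# Only the service name and CLUSTER_NODE_ID change between nodes; the
# "..." placeholder stands for the secrets, ports, mounts, and restart
# flags shown in the full example and must be substituted before use.
make_node_command() {
  local node_id="$1"
  echo "docker service create --name operavix-clickhouse-node${node_id}" \
       "... -e CLUSTER_NODE_ID=${node_id}" \
       "operavix/operavix-clickhouse:25.3.2.39"
}

make_node_command 2   # print the command for node 2, then review and run it
```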

After starting Docker services on all cluster nodes, configure the clustered connection in Operavix. Specify default as the cluster name.

Important

When configuring a connection to clustered ClickHouse on the Data Storage tab, enable cluster mode and list all data nodes in the cluster. Operavix will distribute queries across these nodes automatically. If connecting to clustered ClickHouse via a load balancer (e.g., Nginx), configure the connection in cluster mode but specify only the load balancer’s address as the host.
