1. High-Availability Architecture
ClickHouse cluster:
3 nodes, each running 2 ClickHouse instances; 3 shards with 2 replicas each:
Centos-1: instance 1, ports: tcp 9000, http 8123, interserver 9009; role: shard 1, replica 1
Centos-1: instance 2, ports: tcp 9002, http 8124, interserver 9010; role: shard 2, replica 2 (replica of Centos-2)
Centos-2: instance 1, ports: tcp 9000, http 8123, interserver 9009; role: shard 2, replica 1
Centos-2: instance 2, ports: tcp 9002, http 8124, interserver 9010; role: shard 3, replica 2 (replica of Centos-3)
Centos-3: instance 1, ports: tcp 9000, http 8123, interserver 9009; role: shard 3, replica 1
Centos-3: instance 2, ports: tcp 9002, http 8124, interserver 9010; role: shard 1, replica 2 (replica of Centos-1)
2. Environment Preparation
Software

Name         Version      Notes
OS           CentOS 7.8
zookeeper    3.7.0
clickhouse   21.12.3
jdk          1.8.0_161

Hardware

IP              Host                 Software
192.168.0.27    ecs-clickhouse-001   jdk, zookeeper, clickhouse
192.168.0.221   ecs-clickhouse-002   jdk, zookeeper, clickhouse
192.168.0.74    ecs-clickhouse-003   jdk, zookeeper, clickhouse
3. Deployment
JDK Installation

Install the JDK on each of the 3 bare Linux servers (JDK 8 or later is required).

(1) Download the JDK package from the official site or another source.

(2) Extract the package:
tar -xvf jdk-8u161-linux-x64.tar.gz

(3) Copy the extracted directory to /usr/java/:
mkdir -p /usr/java
cp -r jdk1.8.0_161 /usr/java/

(4) Add environment variables.
Append the following to the end of /etc/profile:
export JAVA_HOME=/usr/java/jdk1.8.0_161
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib:$CLASSPATH
export JAVA_PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin
export PATH=$PATH:${JAVA_PATH}

(5) Reload the profile and verify:
source /etc/profile
java -version
ZooKeeper Cluster Installation

(1) Download the package:
wget https://dlcdn.apache.org/zookeeper/zookeeper-3.7.0/apache-zookeeper-3.7.0-bin.tar.gz

(2) Extract it to /usr/local:
tar -zxvf apache-zookeeper-3.7.0-bin.tar.gz -C /usr/local/
cd /usr/local
mv apache-zookeeper-3.7.0-bin zookeeper

(3) Configuration.
Go to /usr/local/zookeeper/conf/ and copy zoo_sample.cfg to zoo.cfg:
cp zoo_sample.cfg zoo.cfg
Then edit zoo.cfg. The key parameters:
initLimit
Measured in ticks (multiples of tickTime). The time allowed for followers to synchronize with the leader after a leader election completes. With many followers, or a leader holding a very large amount of data, synchronization takes longer and this value should be increased accordingly. It is also the maximum wait time (setSoTimeout) for followers and observers when they begin syncing the leader's data.

syncLimit
Also measured in ticks, and easy to confuse with the value above. It is likewise the maximum wait time for follower/observer interaction with the leader, but it applies after the initial sync has completed, i.e. it is the timeout for normal request forwarding, pings, and other message exchanges.

leaderServes
(Java system property: zookeeper.leaderServes) Unless set to no, the leader also accepts client connections. For higher throughput, setting it to no is generally recommended for ensembles of three or more servers.

cnxTimeout
(Java system property: zookeeper.cnxTimeout) Default 5000 ms. The timeout for opening a connection during leader election; only used by election algorithm 3 (FastLeaderElection).

preAllocSize
(Java system property: zookeeper.preAllocSize) Default 64 MB, specified in KB. Space is preallocated for subsequent transaction log writes; whenever less than 4 KB remains, another 64 MB is allocated, and so on. If snapshots are frequent (i.e. snapCount is small), reduce this value.

snapCount
(Java system property: zookeeper.snapCount) Default 100,000. A snapshot is taken every snapCount/2 + rand.nextInt(snapCount/2) transactions, i.e. by default every 50,000 to 100,000 transaction log entries. The random component prevents all servers from snapshotting at the same moment.

maxClientCnxns
Default 60. The maximum number of concurrent connections a single client, identified by IP address, may make to one server. Setting it to 0 removes the limit. One reason to set it is to mitigate DoS attacks.

maxSessionTimeout
The maximum session timeout, default 20 × tickTime. If a client requests a session timeout larger than this, it is capped at this maximum.
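The parameters above are never assembled into a complete zoo.cfg in this guide. Below is a minimal sketch of one, assuming the stock tickTime of 2000 ms, client port 2181, and the conventional quorum/election ports 2888/3888; the data paths and server IPs come from this guide, while the remaining values are illustrative defaults to tune for your hardware:

```shell
# Write a minimal zoo.cfg; copy it to /usr/local/zookeeper/conf/ afterwards.
cat > zoo.cfg <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/logs
clientPort=2181
maxClientCnxns=60
autopurge.snapRetainCount=3
autopurge.purgeInterval=24
server.1=192.168.0.27:2888:3888
server.2=192.168.0.221:2888:3888
server.3=192.168.0.74:2888:3888
EOF
```

The server.N lines must match the myid files created in step (6): the machine whose myid contains 1 is server.1, and so on.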
(4) Copy the installation to the other ZooKeeper machines.
From the first machine to the second:
scp -r zookeeper root@192.168.0.221:/usr/local/
Check the result on the second machine.
Copy from the first machine to the third in the same way.
(5) Create directories (run on every ZooKeeper machine):
mkdir -p /data/zookeeper/{data,logs}

(6) Create the myid file.
ecs-clickhouse-001:
echo 1 > /data/zookeeper/data/myid
ecs-clickhouse-002:
echo 2 > /data/zookeeper/data/myid
ecs-clickhouse-003:
echo 3 > /data/zookeeper/data/myid
(7) Configure environment variables and reload (run on every ZooKeeper machine).
Append the following to the end of /etc/profile:
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Reload:
source /etc/profile
(8) Configure ZooKeeper logging.
By default ZooKeeper writes its log to a single file and never cleans it up, so after a while the log grows very large.
First create the log directory: /usr/local/zookeeper/logs
Then create a zookeeper-env.sh file under ./conf and make it executable: sudo chmod 755 zookeeper-env.sh
Finally, adjust log4j.properties to change the log output format.
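The contents of zookeeper-env.sh are not shown above; here is a minimal sketch, assuming the standard ZOO_LOG_DIR and ZOO_LOG4J_PROP variables that bin/zkEnv.sh reads, pointing logs at the /usr/local/zookeeper/logs directory created earlier:

```shell
# Write conf/zookeeper-env.sh: route logs to the dedicated directory and
# select the rolling-file appender defined in log4j.properties.
cat > zookeeper-env.sh <<'EOF'
export ZOO_LOG_DIR=/usr/local/zookeeper/logs
export ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
EOF
chmod 755 zookeeper-env.sh
```

With ZOO_LOG4J_PROP set to INFO,ROLLINGFILE, the RollingFileAppender in log4j.properties caps each log file's size and keeps a fixed number of backups, addressing the unbounded-growth problem described above.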
(9) Configure the JVM that runs ZooKeeper.
Create a java.env file under conf and make it executable (sudo chmod 755 java.env); it is mainly used to configure GC logging, heap size, and the like.
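The java.env contents are likewise not shown; a sketch follows under assumed values (a 1 GB heap and a GC log in the ZooKeeper log directory) — size the heap to your machine's actual RAM:

```shell
# Write conf/java.env; bin/zkEnv.sh sources it before starting the JVM.
cat > java.env <<'EOF'
# Heap for the ZooKeeper server process (illustrative size).
export JVMFLAGS="-Xms1g -Xmx1g $JVMFLAGS"
# GC log for the server process.
export SERVER_JVMFLAGS="-Xloggc:/usr/local/zookeeper/logs/zookeeper_gc.log $SERVER_JVMFLAGS"
EOF
chmod 755 java.env
```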
(10) Start the service.
All 3 ZooKeeper servers must be started:
/usr/local/zookeeper/bin/zkServer.sh start

(11) Check that the service is healthy.
Run on each of the 3 ZooKeeper servers; you should see 1 leader and 2 followers:
/usr/local/zookeeper/bin/zkServer.sh status

(12) Verify with the client:
./bin/zkCli.sh -server 192.168.0.27:2181
ClickHouse Cluster Installation

(1) Check that the CPU supports the SSE 4.2 instruction set:
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

(2) Check the Linux version:
cat /proc/version

(3) Download the packages matching the Linux version (el7.x86_64).
Configure the package repository:
yum install yum-utils
rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
Download the packages for offline installation into /root/clickhouse/:
yum install -y clickhouse-server clickhouse-client --downloadonly --downloaddir=/root/clickhouse/

(4) Install (copy the packages to the other machines and run on each).
Copy the packages to the other machines, then enter /root/clickhouse/ and run the install command:
rpm -ivh clickhouse-common-static-21.12.3.32-2.x86_64.rpm clickhouse-server-21.12.3.32-2.noarch.rpm clickhouse-client-21.12.3.32-2.noarch.rpm
During installation you are prompted for a default password; just press Enter to leave it empty and configure it later.
(5) Create directories.
ecs-clickhouse-001:
mkdir -p /data/clickhouse/{node1,node4}/{data,tmp,logs}
ecs-clickhouse-002:
mkdir -p /data/clickhouse/{node2,node5}/{data,tmp,logs}
ecs-clickhouse-003:
mkdir -p /data/clickhouse/{node3,node6}/{data,tmp,logs}

(6) Generate usernames and passwords.
Superuser: default
Password: supersecrect
password_sha256_hex: echo -n "supersecrect" | sha256sum | tr -d ' -'
Regular user: emqx
Password: emqx
password_sha256_hex: echo -n "emqx" | sha256sum | tr -d ' -'
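A quick sanity check of the hash generation: the value pasted into password_sha256_hex must be exactly 64 hex characters with no trailing whitespace, so strip both the space and the "-" filename marker that sha256sum appends (the password shown is the example from this guide):

```shell
# Generate the digest for the example password and strip sha256sum's
# trailing " -" marker along with the surrounding spaces.
hash=$(echo -n "emqx" | sha256sum | tr -d ' -')
echo "$hash"
```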
(7) Create the configuration files.

node1 configuration files:

config.xml:
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node1/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node1/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node info -->
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <interserver_http_host>192.168.0.27</interserver_http_host> <!-- this host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node1/</path>
    <tmp_path>/data/clickhouse/node1/tmp/</tmp_path>
    <users_config>/data/clickhouse/node1/users.xml</users_config>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8001</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node1/metrika.xml</include_from>
</yandex>

metrika.xml:
<?xml version="1.0"?>
<yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <emqx_cluster_all>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.0.27</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>supersecrect</password>
                </replica>
                <!-- Replica of shard 1 -->
                <replica>
                    <host>192.168.0.74</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>supersecrect</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.0.221</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>supersecrect</password>
                </replica>
                <!-- Replica of shard 2 -->
                <replica>
                    <host>192.168.0.27</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>supersecrect</password>
                </replica>
            </shard>
            <!-- Shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.0.74</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>supersecrect</password>
                </replica>
                <!-- Replica of shard 3 -->
                <replica>
                    <host>192.168.0.221</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>supersecrect</password>
                </replica>
            </shard>
        </emqx_cluster_all>
    </clickhouse_remote_servers>
    <!-- ZooKeeper settings -->
    <zookeeper-servers>
        <node index="1">
            <host>192.168.0.27</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>192.168.0.221</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>192.168.0.74</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>01</shard> <!-- shard number -->
        <replica>node1</replica> <!-- replica name of this node -->
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- Compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method> <!-- lz4 compresses faster than zstd but uses more disk -->
        </case>
    </clickhouse_compression>
</yandex>

users.xml:
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <!-- Size these to your machine's actual RAM -->
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <max_threads>8</max_threads>
            <log_queries>1</log_queries>
            <receive_timeout>800</receive_timeout>
            <send_timeout>800</send_timeout>
        </default>
        <readonly>
            <max_threads>8</max_threads>
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <log_queries>1</log_queries>
        </readonly>
    </profiles>
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <default>
            <password>supersecrect</password>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <emqx>
            <password_sha256_hex>fe56917863d5980019f4b6405239aa2d8e8a967b68c30987dc65f60eaf56cec3</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </emqx>
        <read>
            <password_sha256_hex>fe56917863d5980019f4b6405239aa2d8e8a967b68c30987dc65f60eaf56cec3</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </read>
    </users>
</yandex>
node2 configuration files:

config.xml:
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node2/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node2/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node info -->
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <interserver_http_host>192.168.0.221</interserver_http_host> <!-- this host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node2/</path>
    <tmp_path>/data/clickhouse/node2/tmp/</tmp_path>
    <users_config>/data/clickhouse/node2/users.xml</users_config>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8001</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node2/metrika.xml</include_from>
</yandex>