mysql-Synch-clickhouse

Synch

GitHub - long2ice/synch: Sync data from the other DB to ClickHouse(cluster)

Environment:

MySQL 5.7: binlog_format=row
Redis >= 5.0: XREAD
ClickHouse 21.2: default
PostgreSQL: pg_config
Python 3: synch

1: Install ClickHouse

RPM download location:

https://repo.yandex.ru/clickhouse/rpm/stable/x86_6

Install: rpm -ivh ./*.rpm

Configuration:

/etc/clickhouse-server/config.xml
/etc/clickhouse-server/users.xml

Service:

systemctl start clickhouse-server

Client:

clickhouse-client

2: Install Python 3

System default: Python 2.7.5

Install pip:

yum -y install epel-release

yum install python-pip

pip --version

Download and build from source:

wget https://www.python.org/ftp/python/3.7.0/Python-3.7.0.tar.xz
tar -xf Python-3.7.0.tar.xz

cd Python-3.7.0
./configure --prefix=/usr/local/python3 --enable-shared --enable-optimizations
make

make install

Environment variables

Add the following to /etc/profile:

export PYTHON_HOME=/usr/local/python3

export PATH=$PYTHON_HOME/bin:$PATH

Error and fix:

/usr/local/python3/bin/python3.7: error while loading shared libraries: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory

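Add the Python library directory to the dynamic linker configuration: create /etc/ld.so.conf.d/python3.conf containing the single line /usr/local/python3/lib, then run ldconfig to refresh the cache.

vim /etc/ld.so.conf.d/python3.conf
/usr/local/python3/lib
ldconfig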

Upgrade pip:

pip3 install --upgrade pip

3: Install synch

pip3 install synch

Installation error:

PostgreSQL needs to be installed first so that pg_config is available:

yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

yum install -y postgresql13-server

yum install postgresql-devel

psql --version

Configure the environment variable:

pg_config is located at /usr/pgsql-13/bin/pg_config, so add that directory to PATH:

export PATH=$PATH:/usr/pgsql-13/bin
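
Before retrying pip3 install synch, it can help to confirm that the tools from the steps above are actually on the PATH. The following is a minimal Python sketch of such a preflight check; missing_tools is a hypothetical helper name chosen for illustration, and the tool list simply mirrors the commands used in this article.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# Tools this article relies on before installing synch.
missing = missing_tools(["python3", "pip3", "pg_config"])
if missing:
    print("not on PATH:", ", ".join(missing))
else:
    print("all required tools found")
```

If pg_config is reported as missing, re-check the export line above and reload the shell environment.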

Once pg_config is on the PATH, the installation succeeds.

Verify that the synch command is available.

Configure synch.yaml:

core:
  debug: true # when set True, will display sql information.
  insert_num: 1 # how many num to submit,recommend set 20000 when production
  insert_interval: 1 # how many seconds to submit,recommend set 60 when production
  # enable this will auto create database `synch` in ClickHouse and insert monitor data
  monitoring: true

sentry:
  environment: development
  dsn:

redis:
  host: 127.0.0.1
  port: 6379
  db: 0
  password:
  prefix: synch
  queue_max_len: 200000 # stream max len, will delete redundant ones with FIFO

source_dbs:
  - db_type: mysql
    alias: mysql_db # must be unique
    broker_type: redis # current support redis and kafka
    server_id: 3
    host: 127.0.0.1
    port: 3306
    user: root
    password: "123456"
    # optional, auto get from `show master status` when empty
    init_binlog_file:
    # optional, auto get from `show master status` when empty
    init_binlog_pos:
    skip_dmls: alert # dmls to skip
    skip_delete_tables: # tables skip delete, format with schema.table
    skip_update_tables: # tables skip update, format with schema.table
    databases:
      - database: crm
        # optional, default true, auto create database when database in clickhouse not exists
        auto_create: true
        tables:
          - table: user_log
            # optional, default false, if your table has decimal column with nullable, there is a bug with full data etl will, see https://github.com/ClickHouse/ClickHouse/issues/7690.
            skip_decimal: false # set it true will replace decimal with string type.
            # optional, default true
            auto_full_etl: true # auto do full etl at first when table not exists
            # optional, default ReplacingMergeTree
            clickhouse_engine: ReplacingMergeTree # current support MergeTree, CollapsingMergeTree, VersionedCollapsingMergeTree, ReplacingMergeTree
            # optional
            partition_by: # Table create partitioning by, like toYYYYMM(created_at).
            # optional
            settings: # Table create settings, like index_granularity=8192
            # optional
            sign_column: sign # need when clickhouse_engine=CollapsingMergeTree and VersionedCollapsingMergeTree, no need real in source db, will auto generate in clickhouse
            # optional
            version_column: # need when clickhouse_engine=VersionedCollapsingMergeTree and ReplacingMergeTree(optional), need real in source db, usually is `updated_at` with auto update.
          - table: deptinfo
          - table: user

clickhouse:
  hosts:
    - 127.0.0.1:9000
  user: default
  password: ''
  cluster_name: #perftest_3shards_1replicas
  distributed_suffix: ###_all # distributed tables suffix, available in cluster

#kafka:
#  servers:
#    - kafka:9092
#  topic_prefix: synch

# enable this to send error report, comment or delete these if not.
mail:
  mailhost: smtp.gmail.com
  fromaddr: long2ice@gmail.com
  toaddrs:
    - long2ice@gmail.com
  user: long2ice@gmail.com
  password: "123456"
  subject: "[synch] Error logging report"
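
The comments in synch.yaml state several constraints: alias must be unique, broker_type is redis or kafka, four table engines are supported, and some engines need sign_column or version_column. The Python sketch below checks those constraints on an already-loaded config. It is an illustration only, not part of synch: check_source_dbs is a hypothetical helper, and the config is written as a plain dict so that no YAML library is assumed.

```python
SUPPORTED_ENGINES = {
    "MergeTree",
    "CollapsingMergeTree",
    "VersionedCollapsingMergeTree",
    "ReplacingMergeTree",
}
SUPPORTED_BROKERS = {"redis", "kafka"}


def check_source_dbs(config):
    """Return a list of human-readable problems found in config['source_dbs']."""
    problems = []
    seen_aliases = set()
    for db in config.get("source_dbs") or []:
        alias = db.get("alias")
        if alias in seen_aliases:
            problems.append(f"alias {alias!r} is not unique")
        seen_aliases.add(alias)
        if db.get("broker_type") not in SUPPORTED_BROKERS:
            problems.append(f"{alias}: unsupported broker_type {db.get('broker_type')!r}")
        for database in db.get("databases") or []:
            for table in database.get("tables") or []:
                name = f"{database['database']}.{table['table']}"
                # The config comments give ReplacingMergeTree as the default engine.
                engine = table.get("clickhouse_engine") or "ReplacingMergeTree"
                if engine not in SUPPORTED_ENGINES:
                    problems.append(f"{name}: unsupported engine {engine!r}")
                # Both collapsing engines need a sign column.
                if engine.endswith("CollapsingMergeTree") and not table.get("sign_column"):
                    problems.append(f"{name}: {engine} needs sign_column")
                if engine == "VersionedCollapsingMergeTree" and not table.get("version_column"):
                    problems.append(f"{name}: {engine} needs version_column")
    return problems


# The source_dbs part of the config from this article, reduced to the checked keys.
example = {
    "source_dbs": [
        {
            "alias": "mysql_db",
            "broker_type": "redis",
            "databases": [
                {
                    "database": "crm",
                    "tables": [
                        {
                            "table": "user_log",
                            "clickhouse_engine": "ReplacingMergeTree",
                            "sign_column": "sign",
                        },
                        {"table": "deptinfo"},
                        {"table": "user"},
                    ],
                }
            ],
        }
    ]
}
print(check_source_dbs(example))  # prints []
```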

4: Test

1: Full ETL (create ... if not exists)

synch -c /etc/synch.yaml --alias mysql_db etl --schema crm --table user

2: Produce

Listen to the source database and write changed data to the message queue.

synch --alias mysql_db produce
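
The queue_max_len comment in the config says the stream is capped and that surplus entries are dropped in FIFO order. The Python sketch below imitates that trimming with an in-memory deque instead of Redis. BoundedStream and its methods are hypothetical names for illustration, not synch or Redis APIs.

```python
from collections import deque


class BoundedStream:
    """Append-only stream that keeps only the newest max_len entries."""

    def __init__(self, max_len):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._entries = deque(maxlen=max_len)
        self._next_id = 1

    def add(self, event):
        """Store an event and return its monotonically increasing id."""
        entry_id = self._next_id
        self._next_id += 1
        self._entries.append((entry_id, event))
        return entry_id

    def read(self, last_id=0, count=None):
        """Return entries with an id greater than last_id, oldest first."""
        newer = [entry for entry in self._entries if entry[0] > last_id]
        return newer if count is None else newer[:count]


stream = BoundedStream(max_len=3)
for row in range(5):
    stream.add({"row": row})

# Only the three newest entries survive; ids 1 and 2 were trimmed.
print([entry_id for entry_id, _ in stream.read()])  # prints [3, 4, 5]
```

A consumer that falls more than max_len entries behind loses the trimmed events, which is one reason to size queue_max_len generously.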

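3: Consume

Consume data from the message queue and insert it into ClickHouse; use --skip-error to skip messages that fail. When auto_full_etl = True is configured, a full copy is attempted first.

Consume database crm and insert it into ClickHouse: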

synch --alias mysql_db consume --schema crm
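
The config's insert_num and insert_interval comments describe submitting a batch after a number of rows or a number of seconds. One way to picture that rule is the Python sketch below, which flushes on whichever limit is reached first. This is an assumption about the behaviour, not synch's actual implementation; BatchBuffer, flush and clock are hypothetical names.

```python
import time


class BatchBuffer:
    """Collect rows and flush on size or elapsed time, whichever comes first."""

    def __init__(self, insert_num, insert_interval, flush, clock=time.monotonic):
        self.insert_num = insert_num            # rows per batch
        self.insert_interval = insert_interval  # seconds between forced flushes
        self.flush = flush                      # callable that receives a list of rows
        self.clock = clock                      # injectable for testing
        self.rows = []
        self.last_flush = clock()

    def add(self, row):
        self.rows.append(row)
        self.maybe_flush()

    def maybe_flush(self):
        due_by_size = len(self.rows) >= self.insert_num
        due_by_time = self.clock() - self.last_flush >= self.insert_interval
        if self.rows and (due_by_size or due_by_time):
            self.flush(self.rows)
            self.rows = []  # start a fresh list; the flushed one is left untouched
            self.last_flush = self.clock()


batches = []
buffer = BatchBuffer(insert_num=3, insert_interval=60, flush=batches.append)
for row in range(7):
    buffer.add(row)

print(batches)      # prints [[0, 1, 2], [3, 4, 5]]
print(buffer.rows)  # prints [6]
```

With insert_num: 1, as in the test config above, every row is flushed immediately, which is convenient for debugging but slow for production loads.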

5: Install supervisord to keep the processes running

yum install supervisor

Configuration

[program:mysql-to-ck-produce]
process_name=%(program_name)s
command=/usr/local/python3/bin/synch -c /etc/synch.yaml --alias mysql_db produce
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stdout_logfile_maxbytes=2048MB
stdout_logfile_backups=20
stopwaitsecs=3600

[program:mysql-to-ck-consume-crm]
process_name=%(program_name)s
command=/usr/local/python3/bin/synch -c /etc/synch.yaml --alias mysql_db consume --schema crm
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stdout_logfile_maxbytes=2048MB
stdout_logfile_backups=20
stopwaitsecs=3600
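
The two program sections differ only in their name and in the synch arguments, so with more schemas they could be generated rather than copied. A minimal Python sketch, assuming the same paths and settings as above; render_program is a hypothetical helper, not a supervisor or synch tool.

```python
def render_program(name, synch_args,
                   synch_bin="/usr/local/python3/bin/synch",
                   config="/etc/synch.yaml"):
    """Return one supervisor [program:...] section as text."""
    lines = [
        f"[program:{name}]",
        # %(program_name)s is expanded by supervisor, not by Python.
        "process_name=%(program_name)s",
        f"command={synch_bin} -c {config} {synch_args}",
        "autostart=true",
        "autorestart=true",
        "redirect_stderr=true",
        "stdout_logfile=/var/log/supervisor/%(program_name)s.log",
        "stdout_logfile_maxbytes=2048MB",
        "stdout_logfile_backups=20",
        "stopwaitsecs=3600",
    ]
    return "\n".join(lines) + "\n"


# One producer per source alias, one consumer per schema.
sections = [render_program("mysql-to-ck-produce", "--alias mysql_db produce")]
for schema in ["crm"]:
    sections.append(
        render_program(f"mysql-to-ck-consume-{schema}",
                       f"--alias mysql_db consume --schema {schema}")
    )
print("\n".join(sections))
```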

Service:

systemctl restart supervisord

Logs:

/var/log/supervisor

mysql-to-ck-consume-crm.log mysql-to-ck-produce.log supervisord.log
