DataX offline synchronization of Oracle tables to ClickHouse: practice, part 2
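Date: 2024.01
Contents
1. Install and start the oracle19c container
2. Install clickhouse from rpm packages
3. Install datax
4. Synchronize with datax
Create the target tables for the tables to be synchronized, following clickhouse table conventions
Write the json file
Write the incremental sync shell script and add it to crond as a scheduled task
Continued from the previous post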
tar -zxvf datax_ck.tar.gz -C /root/
1. Synchronize the historical data up to and including 20240201 into clickhouse in one pass
cd /root/datax/bin
mkdir -p tables/test01
cd tables/test01
vim test.json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "oraclereader",
          "parameter": {
            "connection": [
              {
                "jdbcUrl": ["jdbc:oracle:thin:@192.168.15.6:11521:ORCLPDB1"],
                "querySql": ["select * from TDBA_TEST01 WHERE to_char(create_date,'yyyymmdd')<='20240201'"]
              }
            ],
            "username": "bigdata",
            "password": "bigdata"
          }
        },
        "writer": {
          "name": "clickhousewriter",
          "parameter": {
            "username": "default",
            "password": "bigdata",
            "column": ["*"],
            "connection": [
              {
                "jdbcUrl": "jdbc:clickhouse://192.168.15.7:8123/default",
                "table": ["TEST01"]
              }
            ]
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
Run the synchronization manually
cd /root/datax/bin
./datax.py tables/test01/test.json
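A failure is easy to miss in the long DataX output when the job is started by hand. The following is a minimal sketch, not part of the original setup, of a wrapper that appends both stdout and stderr of any command to a log file and passes the exit status on. The function name run_job and the log file name are hypothetical, and the sketch assumes that datax.py exits with a non-zero status when a job fails, which is worth confirming on your installation.

```shell
#!/bin/bash
# run_job LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appends its stdout and stderr to LOGFILE,
# and returns the exit status of COMMAND.
run_job() {
    local log_file="$1"
    shift
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] start: $*" >>"$log_file"
    "$@" >>"$log_file" 2>&1
    local rc=$?
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] exit status: $rc" >>"$log_file"
    return "$rc"
}

# Example (job path from the steps above; the log file name is an assumption):
# cd /root/datax/bin && run_job /root/datax/bin/test.log ./datax.py tables/test01/test.json
```

The exit status can then drive a simple check such as `run_job ... || echo "sync failed"`.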
2. Synchronize incremental data
cd /root/datax/bin/tables/test01/
vim test01.json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "oraclereader",
          "parameter": {
            "connection": [
              {
                "jdbcUrl": ["jdbc:oracle:thin:@192.168.15.6:11521:ORCLPDB1"],
                "querySql": ["select * from TDBA_TEST01 WHERE to_char(create_date,'yyyymmdd')='20240202'"]
              }
            ],
            "username": "bigdata",
            "password": "bigdata"
          }
        },
        "writer": {
          "name": "clickhousewriter",
          "parameter": {
            "username": "default",
            "password": "bigdata",
            "column": ["*"],
            "connection": [
              {
                "jdbcUrl": "jdbc:clickhouse://192.168.15.7:8123/default",
                "table": ["TEST01"]
              }
            ]
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
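Note: compared with test.json, only the condition in the sql was changed in this json file; everything else is identical.
Write the shell script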
test01.sh
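#!/bin/bash
# Print PATH before and after adding the JDK, to help debug runs started by cron
echo $PATH
PATH=/etl/jdk1.8.0_201/bin:$PATH
echo $PATH
# Date to synchronize: three days before today, formatted as YYYYMMDD
etl_date=$(date -d "-3 day" +%Y%m%d)
# Replace the sample date in test01.json (created in tables/test01/) with etl_date
sed "s/20240202/$etl_date/" /root/datax/bin/tables/test01/test01.json >/root/datax/bin/tables/test01/test01_final.json
/root/datax/bin/datax.py /root/datax/bin/tables/test01/test01_final.json >>/root/datax/bin/test01_final.log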
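Replacing the literal sample date works as long as nobody edits it in the json file. As a possible variation, here is a minimal sketch that uses an explicit placeholder, accepts an optional date for reruns, and fails loudly when the placeholder is missing. The token __ETL_DATE__ and the function name render_job are hypothetical and do not appear in the original files; the default date relies on GNU date, as the script above does.

```shell
#!/bin/bash
# render_job TEMPLATE OUTPUT [YYYYMMDD]
# Copies TEMPLATE to OUTPUT, replacing the hypothetical token __ETL_DATE__
# with the given date (default: three days before today).
render_job() {
    local template="$1"
    local output="$2"
    local etl_date="${3:-$(date -d "-3 day" +%Y%m%d)}"

    # Reject anything that is not exactly eight digits.
    case "$etl_date" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "invalid date: $etl_date" >&2; return 1 ;;
    esac

    # Fail if the template contains no placeholder to replace.
    if ! grep -q '__ETL_DATE__' "$template"; then
        echo "no __ETL_DATE__ placeholder in $template" >&2
        return 1
    fi

    sed "s/__ETL_DATE__/$etl_date/g" "$template" >"$output"
}
```

With such a template the condition in the querySql would read ='__ETL_DATE__', and a rerun for one specific day becomes `render_job test01.json test01_final.json 20240210`.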
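Add the script to the scheduled tasks (the fields are minute, hour, day of month, month, day of week) so that it runs every day at 6:00. The script must be executable for this entry to work, for example via chmod +x test01.sh.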
[root@docker bin]# crontab -e
0 6 * * * /root/datax/bin/tables/test01/test01.sh > ~/crontab.log
Note: the $PATH values echoed by the script are redirected to crontab.log, which makes debugging easier.
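cron starts jobs with a much smaller environment than an interactive shell, which is why the script extends PATH with the JDK directory. A small check like the following sketch can make a missing command visible right away; the function name require_cmd is hypothetical, and which commands to check (java here) is an assumption about what DataX needs on your machine.

```shell
#!/bin/bash
# require_cmd NAME...
# Returns non-zero and prints the current PATH if any command is missing.
require_cmd() {
    local name
    for name in "$@"; do
        if ! command -v "$name" >/dev/null 2>&1; then
            echo "command not found: $name (PATH=$PATH)" >&2
            return 1
        fi
    done
}

# Example for test01.sh, placed after PATH has been extended:
# require_cmd java || exit 1
```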