本文主要是介绍04_matrixdb通过pxf读写hive,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!
matrixdb视频教程 - 04matrixdb通过pxf读写hive
作者
shidb
日期
2021-05-11
标签
matrixdb视频教程 - 04matrixdb通过pxf读写hive
Matrixdb 通过数据联邦pxf访问hive
环境介绍
sdw5 namenode hive
sdw3 matrixdb集群
hive建表及加载数据
数据样例
vim mppinfo.csv
1,matrixdb,北京
2,teradata,米国
3,hana,米国
创建表
DROP TABLE IF EXISTS mpp_info;
CREATE TABLE IF NOT EXISTS mpp_info(
id int,
name string,
area string
)
COMMENT 'mpp details'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' STORED AS TEXTFILE;
数据加载
load data local inpath '/data/dataload/mppinfo.csv' into table mpp_info;hive> load data local inpath '/data/dataload/mppinfo.csv' into table mpp_info;
Loading data to table default.mpp_info
OK
Time taken: 2.542 seconds
hive数据查看
hive> select * from mpp_info;
OK
1 matrixdb 北京
2 teradata 米国
3 hana 米国
Time taken: 0.318 seconds, Fetched: 3 row(s)
matrixdb pxf配置
配置pxf
将hdfs下的core-site.xml、hdfs-site.xml拷贝到
[root@sdw5 ~]# cp /etc/hadoop/conf.cloudera.hdfs/core-site.xml .
[root@sdw5 ~]# cp /etc/hadoop/conf.cloudera.hdfs/hdfs-site.xml .拷贝到所有matrixdb pxf节点的如下路径
cp core-site.xml /usr/local/pxf-matrixdb3/conf/servers/single_hdfs/
cp hdfs-site.xml /usr/local/pxf-matrixdb3/conf/servers/single_hdfs/
配置pxf
如果之前已经配置过,需要重置
停止pxf
[shidb@sdw3 ~]$ pxf cluster stop
Stopping PXF on 1 segment host…
PXF stopped successfully on 1 out of 1 host
重置pxf配置
[shidb@sdw3 ~]$ pxf cluster reset
Ensure your PXF cluster is stopped before continuing. This is a destructive action. Press y to continue:
y
Resetting PXF on master host and 0 segment hosts...
PXF has been reset on 1 out of 1 host
初始化pxf
[shidb@sdw3 ~]$ pxf cluster init
Initializing PXF on master host and 0 segment hosts...
PXF initialized successfully on 1 out of 1 host
pxf同步
[shidb@sdw3 ~]$ pxf cluster sync
Syncing PXF configuration files from master host to 1 segment host...
PXF configs synced successfully on 1 out of 1 host
启动PXF
[shidb@sdw3 ~]$ pxf cluster start
Starting PXF on 1 segment host...
PXF started successfully on 1 out of 1 host
查看pxf状态
[shidb@sdw3 ~]$ pxf cluster status
Checking status of PXF servers on 1 segment host...
PXF is running on 1 out of 1 host
master上单独也需要启动
[shidb@sdw3 ~]$ pxf start
Using CATALINA_BASE: /usr/local/pxf-matrixdb3/pxf-service
Using CATALINA_HOME: /usr/local/pxf-matrixdb3/pxf-service
Using CATALINA_TMPDIR: /usr/local/pxf-matrixdb3/pxf-service/temp
Using JRE_HOME: /usr/local/jdk1805/jre
Using CLASSPATH: /usr/local/pxf-matrixdb3/pxf-service/bin/bootstrap.jar:/usr/local/pxf-matrixdb3/pxf-service/bin/tomcat-juli.jar
Using CATALINA_PID: /usr/local/pxf-matrixdb3/run/catalina.pid
Existing PID file found during start.
Tomcat appears to still be running with PID 763115. Start aborted.
If the following process is not a Tomcat process, remove the PID file and try again:
UID PID PPID C STIME TTY TIME CMD
shidb 763115 1 99 12:27 pts/178 00:00:28 /usr/local/jdk1805/jre/bin/java -Djava.util.logging.config.file=/usr/local/pxf-matrixdb3/pxf-service/conf/logging.properties -Djava.util.logg
[shidb@sdw3 ~]$
matrixdb配置
创建FDW Server
CREATE SERVER hdfs_svr FOREIGN DATA WRAPPER hdfs_pxf_fdw OPTIONS ( config ‘single_hdfs’ );
创建用户名admin的FDW User mapping
CREATE USER MAPPING FOR shidb SERVER hdfs_svr;
CREATE USER MAPPING FOR hive SERVER hdfs_svr;
matrixdb中执行创建外部表语法
DROP FOREIGN TABLE mpp_info;
CREATE FOREIGN TABLE mpp_info(
id int,
name text,
area text
)
SERVER hdfs_svr OPTIONS ( resource '/user/hive/warehouse/mpp_info', format 'text',delimiter ',');
查询
postgres=# select * from mpp_info;id | name | area
----+----------+------1 | matrixdb | 北京2 | teradata | 米国3 | hana | 米国
(3 rows)
插入数据
postgres=# insert into mpp_info select * from mpp_info;
INSERT 0 3
postgres=# select * from mpp_info;id | name | area
----+----------+------1 | matrixdb | 北京2 | teradata | 米国3 | hana | 米国1 | matrixdb | 北京2 | teradata | 米国3 | hana | 米国
(6 rows)
如需入群沟通交流,请扫码添加好友
这篇关于04_matrixdb通过pxf读写hive的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!