This article walks through using Sqoop to move data back and forth between MySQL and HDFS / HBase. Hopefully it is a useful reference for developers tackling the same task; follow along and try it yourself!
1. In MySQL, create a user named sqoop with password sqoop, and grant it privileges:
grant all privileges on *.* to 'sqoop'@'%' identified by 'sqoop' with grant option;
To check that the user was created, try logging in from the Linux command line (not the MySQL prompt):
mysql -usqoop -p
then type the password sqoop when prompted.
Logged in as the sqoop user, create a database named sqoop, and inside it a table named employee:
create database sqoop;
use sqoop;
create table employee(employee_id int not null primary key, employee_name varchar(30));
insert into employee values(101,'zhangsan');
insert into employee values(102,'lisi');
insert into employee values(103,'wangwu');
The employee table in the sqoop database now holds three records, which you can verify with a select query.
2. Test whether Sqoop can connect to MySQL
First, check MySQL's port number. At the MySQL prompt, enter the following; the port (3306 by default) is printed:
mysql> show variables like 'port';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| port          | 3306  |
+---------------+-------+
1 row in set (0.00 sec)
Then leave MySQL with exit;
Next, at the Linux command line, enter:
sqoop list-tables --connect jdbc:mysql://localhost:3306/sqoop --username sqoop --password sqoop
Because my MySQL instance is installed on the local machine, the host is localhost. The output lists the employee table:
... ...
14/06/03 15:02:11 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
employee
... ...
3. Import the employee table from the MySQL sqoop database into HDFS
First, start Hadoop. In the Hadoop installation directory, run:
bin/start-all.sh
Use jps to check whether the processes started correctly; output like the following means everything is up:
3733 SecondaryNameNode
3196 NameNode
4089 TaskTracker
3833 JobTracker
22914 Jps
3471 DataNode
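The jps check above can be automated. A minimal Python sketch (the daemon names match the Hadoop 1.x setup used in this walkthrough; adjust them for your cluster) that flags any required daemon missing from jps output:

```python
# Hadoop 1.x daemons expected on this pseudo-distributed node.
REQUIRED = {"NameNode", "DataNode", "SecondaryNameNode",
            "JobTracker", "TaskTracker"}

def missing_daemons(jps_output):
    # Each jps line is "<pid> <ProcessName>"; collect the names
    # and return the required daemons that are not running.
    running = {parts[1] for line in jps_output.splitlines()
               if len(parts := line.split()) == 2}
    return REQUIRED - running

out = ("3733 SecondaryNameNode\n3196 NameNode\n4089 TaskTracker\n"
       "3833 JobTracker\n22914 Jps\n3471 DataNode")
print(missing_daemons(out))  # set()
```

In practice you would feed it the real output, e.g. `missing_daemons(subprocess.check_output(["jps"], text=True))`.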
Then, at the Linux command line, enter:
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --username sqoop --password sqoop --table employee -m 1
If the run ends with output like the following, the import succeeded:
14/06/03 15:13:35 INFO mapreduce.ImportJobBase: Transferred 33 bytes in 24.6435 seconds (1.3391 bytes/sec)
14/06/03 15:13:35 INFO mapreduce.ImportJobBase: Retrieved 3 records.
To view the result, enter at the Linux command line:
hadoop dfs -ls /user/cwjy1202/employee
The listing looks like this; part-m-00000 holds the table data:
-rw-r--r--   1 cwjy1202 supergroup  0 2014-06-03 15:13 /user/cwjy1202/employee/_SUCCESS
drwxr-xr-x   - cwjy1202 supergroup  0 2014-06-03 15:13 /user/cwjy1202/employee/_logs
-rw-r--r--   1 cwjy1202 supergroup 33 2014-06-03 15:13 /user/cwjy1202/employee/part-m-00000
View its contents with:
hadoop dfs -cat /user/cwjy1202/employee/part-m-00000
Warning: $HADOOP_HOME is deprecated.
101,zhangsan
102,lisi
103,wangwu
Import into HDFS succeeded!
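The part-m-00000 file above uses Sqoop's default text format: one record per line, fields separated by commas. A small Python sketch (column names are the ones from this walkthrough) that parses such a file into dictionaries:

```python
def parse_sqoop_text(lines, columns):
    # Parse Sqoop's default delimited text output: one record per
    # line, fields separated by commas, in table column order.
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        records.append(dict(zip(columns, line.split(","))))
    return records

rows = parse_sqoop_text(
    ["101,zhangsan", "102,lisi", "103,wangwu"],
    ["employee_id", "employee_name"],
)
print(rows[0])  # {'employee_id': '101', 'employee_name': 'zhangsan'}
```

With a real import you would pass it the streamed lines of `hadoop dfs -cat .../part-m-00000`. Note that non-default `--fields-terminated-by` settings would need a different delimiter.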
4. Export the data from HDFS back into MySQL
First, delete the data from the employee table in the sqoop database (logged in as the sqoop MySQL user):
mysql> use sqoop
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> delete from employee;
Query OK, 3 rows affected (0.04 sec)
mysql> select * from employee;
Empty set (0.00 sec)
Next, enter at the Linux command line:
sqoop export --connect jdbc:mysql://localhost:3306/sqoop --username sqoop --password sqoop --table employee --export-dir hdfs://localhost:9002/user/cwjy1202/employee/part-m-00000
Output like the following at the end of the run means it succeeded:
14/06/03 15:29:57 INFO mapreduce.ExportJobBase: Transferred 730 bytes in 33.2669 seconds (21.9437 bytes/sec)
14/06/03 15:29:57 INFO mapreduce.ExportJobBase: Exported 3 records.
Note:
The sqoop at the end of jdbc:mysql://localhost:3306/sqoop is the database name.
In --export-dir hdfs://localhost:9002/..., localhost is the master node's address; my cluster is pseudo-distributed, so the master node lives on localhost.
Pay close attention to the port number 9002: for most people it is 9000, but my 9000 was taken by another process, so I changed it to 9002. Check whatever port you configured in your core-site.xml file.
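To confirm which port belongs in the hdfs:// URI, you can read the fs.default.name property out of core-site.xml. A minimal Python sketch (the XML below is a made-up example of such a file; in practice read it from your Hadoop conf directory):

```python
import xml.etree.ElementTree as ET

# Example core-site.xml content; replace with the real file, e.g.
# open("$HADOOP_HOME/conf/core-site.xml").read().
core_site = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9002</value>
  </property>
</configuration>"""

root = ET.fromstring(core_site)
port = None
for prop in root.findall("property"):
    if prop.findtext("name") == "fs.default.name":
        # Split "hdfs://localhost:9002" on its last colon.
        port = prop.findtext("value").rsplit(":", 1)[1]
print(port)  # 9002
```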
Now check whether the employee table in the sqoop database has data again:
mysql> select * from employee;
+-------------+---------------+
| employee_id | employee_name |
+-------------+---------------+
|         101 | zhangsan      |
|         102 | lisi          |
|         103 | wangwu        |
+-------------+---------------+
3 rows in set (0.00 sec)
Export into MySQL succeeded!
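Conceptually, sqoop export reads each line of the HDFS file and turns it into a row insert against the target table. A rough Python illustration of that mapping (this is not what Sqoop runs internally; it generates batched, parameterized JDBC inserts):

```python
def line_to_insert(line, table, columns):
    # Split one Sqoop text record into fields and build the
    # equivalent INSERT statement. All values are quoted as
    # strings here for simplicity.
    values = line.strip().split(",")
    cols = ", ".join(columns)
    vals = ", ".join("'%s'" % v for v in values)
    return "INSERT INTO %s (%s) VALUES (%s);" % (table, cols, vals)

print(line_to_insert("101,zhangsan", "employee",
                     ["employee_id", "employee_name"]))
# INSERT INTO employee (employee_id, employee_name) VALUES ('101', 'zhangsan');
```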
5. Import MySQL data into HBase
First, enter at the Linux command line:
sqoop import --connect jdbc:mysql://localhost/sqoop --username sqoop --password sqoop --table employee --hbase-create-table --hbase-table employee --column-family emplinfo --hbase-row-key employee_id
The end of the run shows:
14/06/03 15:50:45 INFO mapred.JobClient: SPLIT_RAW_BYTES=361
14/06/03 15:50:45 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 48.8232 seconds (0 bytes/sec)
14/06/03 15:50:45 INFO mapreduce.ImportJobBase: Retrieved 3 records.
Then, at the Linux command line, enter:
hbase shell
At the HBase prompt, scan the employee table:
hbase(main):001:0> scan 'employee'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hbase-0.98.1-hadoop/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
ROW    COLUMN+CELL
101    column=emplinfo:employee_name, timestamp=1401781827088, value=zhangsan
102    column=emplinfo:employee_name, timestamp=1401781827136, value=lisi
103    column=emplinfo:employee_name, timestamp=1401781839401, value=wangwu
3 row(s) in 1.9230 seconds
Import into HBase succeeded!
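The scan output shows how Sqoop laid the rows out in HBase: --hbase-row-key employee_id makes the primary key the row key, and every remaining column becomes a cell under the emplinfo column family. A sketch of that row-to-cell mapping in plain Python:

```python
def rows_to_hbase(rows, row_key, family):
    # Map relational rows to the HBase layout shown above:
    # row key = value of the row_key column; every other column
    # becomes a cell named "<family>:<column>".
    table = {}
    for row in rows:
        key = str(row[row_key])
        table[key] = {
            "%s:%s" % (family, col): str(val)
            for col, val in row.items() if col != row_key
        }
    return table

emp = [{"employee_id": 101, "employee_name": "zhangsan"}]
print(rows_to_hbase(emp, "employee_id", "emplinfo"))
# {'101': {'emplinfo:employee_name': 'zhangsan'}}
```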
That wraps up this walkthrough of moving data between MySQL and HDFS / HBase with Sqoop. I hope it proves helpful!