This article covers parts 10 to 12 of the Hive functions notes in the 摸鱼大数据 series: heap memory errors, JSON data processing, and the explode function.
10. Heap memory error
Error message:
Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Java heap space
Solution: the changes only need to be made on node1.
Option 1: open /export/server/hive/conf/hive-env.sh and add the following line
export HADOOP_HEAPSIZE=2048
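Option 2: open hive-site.xml and add the following property
<!-- Hive heap size -->
<property>
    <name>hive.heapsize</name>
    <value>2048</value>
</property>
After making the change, stop all Hadoop and Hive processes. Then start Hadoop first and Hive second.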
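11. JSON data processing
get_json_object: parses JSON content
    Advantage: can parse nested JSON
    Disadvantage: extracts only one field per call
json_tuple:
    Advantage: extracts several fields in a single call
    Disadvantage: cannot parse nested JSON directly; nested objects have to be parsed one level at a time
Example: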
create database if not exists day09;
use day09;

/*
get_json_object: parses JSON content
    Advantage: can parse nested JSON
    Disadvantage: extracts only one field per call
*/
select
    get_json_object('{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}','$.name') as name,
    get_json_object('{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}','$.addr.province') as province;

/*
json_tuple:
    Advantage: extracts several fields in a single call
    Disadvantage: cannot parse nested JSON directly; nested objects have to be parsed one level at a time
*/
-- Without aliases, Hive assigns default column names
select json_tuple('{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}','name','age','addr');

-- With aliases for the three output columns
select json_tuple('{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}','name','age','addr') as (name,age,addr);

-- Nested parsing, level by level: json_tuple extracts addr, then get_json_object reads province
with tmp_1 as (
    select json_tuple('{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}', 'addr') as addr
)
select get_json_object(addr,'$.province') from tmp_1;

-- Same idea, using json_tuple for the second level as well
with tmp_1 as (
    select json_tuple('{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}', 'addr') as addr
)
select json_tuple(addr,'province') from tmp_1;
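As a small supplementary sketch (the shortened JSON literal and the missing key nickname are made up for illustration), the query below shows what get_json_object returns when the requested path does not exist. Hive returns NULL rather than raising an error, which is worth keeping in mind when a parsed column unexpectedly comes back empty.

```sql
-- Hypothetical input: the key "nickname" is deliberately absent from the JSON string
select
    get_json_object('{"name":"zhangshan","age":18}', '$.name')     as name,      -- key exists
    get_json_object('{"name":"zhangshan","age":18}', '$.nickname') as nickname;  -- key missing, evaluates to NULL
```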
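12. Explode function
Splits the elements of a container into separate rows: explode(column_name)
The explode function is typically combined with a lateral view, as follows.
Syntax:
    select column, lateral_view_column from source_table
    lateral view UDTF_name(column_of_source_table) lateral_view_alias as lateral_view_column1, lateral_view_column2 ...
Notes:
    1- The lateral view alias and the lateral view column names are chosen by you
    2- Do not put as in front of the lateral view alias
    3- Inside the lateral view, only column names are defined; do not specify data types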
Simple example:
use day09;

-- Basic usage
select array(1,2,3,4,5);
select explode(array(1,2,3,4,5));

select map('a',1,'b',2,'c',3);
select explode(map('a',1,'b',2,'c',3));
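A possible follow-up to the basic usage above: when explode is applied to a map it produces two columns, one for the keys and one for the values, so two aliases can be given in parentheses. The alias names k and v are arbitrary choices for this sketch.

```sql
-- explode on a map yields one row per entry, with a key column and a value column
select explode(map('a',1,'b',2,'c',3)) as (k, v);
```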
Practice:
-- NBA example
create table nba(
    team_name string,
    year_str array<string>
)
row format delimited
fields terminated by ','
collection items terminated by '|';

-- Load the data (without the local keyword, the path refers to HDFS)
load data inpath '/dir/The_NBA_Championship.txt' into table nba;

-- Verify the data
select * from nba;

-- Explode the array into one row per year
select explode(year_str) as `year` from nba;

-- UDTFs are usually used together with a lateral view
/*
Lateral view syntax:
    select column, lateral_view_column from source_table
    lateral view UDTF_name(column_of_source_table) lateral_view_alias as lateral_view_column1, lateral_view_column2 ...
*/
select team_name,year from nba lateral view explode(year_str) years as year;

-- Lateral view combined with json_tuple
-- Without a lateral view, only the UDTF output can be selected
with tmp_1 as (
    select json_tuple('{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}', 'addr') as addr
)
select json_tuple(addr,'province') from tmp_1;

-- With a lateral view, ordinary columns (id) can be selected next to the parsed fields
with tmp_1 as (
    select 1 as id,
           '{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}' as info
)
select id,name,age from tmp_1 lateral view json_tuple(info,'name','age') info_view as name,age;
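Since json_tuple can only parse one level at a time, one way to reach nested fields in a single query is to chain two lateral views, with the second one parsing a column produced by the first. This is a minimal sketch using the same inline sample row as above; the view aliases info_view and addr_view and the output column names are illustrative choices, not names from any real table.

```sql
-- Illustrative only: tmp_1 is an inline one-row sample, not a real table.
-- First lateral view: extract name and the nested addr object (returned as a JSON string).
-- Second lateral view: parse that addr string into province and city.
with tmp_1 as (
    select 1 as id,
           '{"name":"zhangshan","age":18,"addr":{"province":"广东省","city":"广州市"}}' as info
)
select id, name, province, city
from tmp_1
    lateral view json_tuple(info, 'name', 'addr') info_view as name, addr
    lateral view json_tuple(addr, 'province', 'city') addr_view as province, city;
```

Compared with the nested CTE approach, the chained form keeps the id column available alongside every parsed field.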