Writing a time-converter custom function (UDF) for Hive, and two ways to create a Hive custom function


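In the log table from the previous article, timestamps look like "31/Aug/2015:00:04:37 +0800", which is not very readable. To tidy this up, we write a custom time-formatting UDF. Hive's built-in date functions (for example unix_timestamp and from_unixtime) can also handle this kind of conversion.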

Custom function

Code
As before, the custom function extends the UDF class.

package com.madman.hive.function;

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * The UDF follows the usual pattern. From the Hive UDF class documentation:
 *
 * A User-defined function (UDF) for the use with Hive.
 *
 * New UDF classes need to inherit from this UDF class.
 *
 * Required for all UDF classes: 1. Implement one or more methods named
 * "evaluate" which will be called by Hive. The following are some examples:
 * public int evaluate(); public int evaluate(int a); public double evaluate(int
 * a, double b); public String evaluate(String a, int b, String c);
 *
 * "evaluate" should never be a void method. However it can return "null" if
 * needed.
 */
public class HiveDateFunction extends UDF {

    public Text evaluate(Text time) {
        if (time == null) {
            return null;
        }
        if (StringUtils.isBlank(time.toString())) {
            return null;
        }
        // Strip any double quotes around the raw log value.
        String parser = time.toString().replaceAll("\"", "");
        // parse() reads from the start of the string, so the trailing " +0800" offset is ignored.
        SimpleDateFormat inputSimple = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss", Locale.ENGLISH);
        SimpleDateFormat outputSimple = new SimpleDateFormat("yyyyMMddHHmmss");
        String format = "";
        try {
            Date parse = inputSimple.parse(parser);
            format = outputSimple.format(parse);
        } catch (Exception e) {
            // Unparseable input yields NULL instead of failing the query.
            e.printStackTrace();
            return null;
        }
        return new Text(format);
    }

    // Quick local check before packaging the jar.
    public static void main(String[] args) {
        String text = "31/Aug/2015:00:04:37 +0800";
        System.out.println(new HiveDateFunction().evaluate(new Text(text)));
    }
}
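For a quick local check that does not need the Hive or Hadoop jars on the classpath, the conversion logic can be copied into a plain static method and exercised with a few inputs. The sketch below is only an illustration under that assumption: the class LocalUdfCheck and its convert and check helpers are hypothetical names, not part of the article's project.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class LocalUdfCheck {

    // Hypothetical helper that mirrors the UDF's conversion using String instead of Text.
    public static String convert(String time) {
        if (time == null || time.trim().isEmpty()) {
            return null;
        }
        String cleaned = time.replace("\"", "");
        SimpleDateFormat input = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss", Locale.ENGLISH);
        SimpleDateFormat output = new SimpleDateFormat("yyyyMMddHHmmss");
        try {
            return output.format(input.parse(cleaned));
        } catch (ParseException e) {
            return null; // same contract as the UDF: bad input becomes null
        }
    }

    // Prints PASS or FAIL for one input/expected pair.
    static void check(String input, String expected) {
        String actual = convert(input);
        boolean ok = (expected == null) ? actual == null : expected.equals(actual);
        System.out.println((ok ? "PASS " : "FAIL ") + input + " -> " + actual);
    }

    public static void main(String[] args) {
        check("31/Aug/2015:00:04:37 +0800", "20150831000437");
        check("\"31/Aug/2015:00:04:53 +0800\"", "20150831000453");
        check("", null);
        check(null, null);
        check("not a date", null);
    }
}
```

Covering the null, blank, and malformed cases here is cheaper than discovering them from a failed MapReduce job.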

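Once the code is written, test it locally first. If it works, package it as a jar, upload the jar to the Hive environment, and add it to the Hive session.
Reference command: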

hive (default)> add  jar /opt/cdhmoduels/data/hiveDateFunction.jar;

Then create the function. Reference command:

create temporary function hiveDateFunction as 'com.madman.hive.function.HiveDateFunction';
Note: the fully qualified class name must be specified here.

Command to call the function:

hive (default)> select hiveDateFunction(time_local) from bf_log limit 10;
Result:
Total MapReduce CPU Time Spent: 1 seconds 750 msec
OK
_c0
20150831000437
20150831000437
20150831000453
20150831000453
20150831000453
20150831000453
20150831000453
20150831000453
20150831000453
20150831000453
Time taken: 20.954 seconds, Fetched: 10 row(s)
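
The values above keep the wall-clock time from the log line but not its +0800 offset, because the UDF's input pattern stops at the seconds field. If the offset matters (for example, when logs come from servers in different time zones), one possible variant parses it with java.time. This is a minimal sketch rather than the article's UDF: the class name OffsetAwareLogTime and its methods are hypothetical, and it is written as a plain class without the Hive UDF wrapper so that it runs without Hive jars.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

public class OffsetAwareLogTime {

    // "Z" matches a numeric offset such as +0800.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);
    private static final DateTimeFormatter OUTPUT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Returns the parsed timestamp, or null for null, blank, or malformed input.
    private static OffsetDateTime parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        String cleaned = raw.replace("\"", "").trim();
        try {
            return OffsetDateTime.parse(cleaned, INPUT);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    // Keeps the wall-clock time as written in the log line.
    public static String convert(String raw) {
        OffsetDateTime parsed = parse(raw);
        return parsed == null ? null : parsed.format(OUTPUT);
    }

    // Shifts the same instant to UTC before formatting.
    public static String convertToUtc(String raw) {
        OffsetDateTime parsed = parse(raw);
        return parsed == null
                ? null
                : parsed.withOffsetSameInstant(ZoneOffset.UTC).format(OUTPUT);
    }

    public static void main(String[] args) {
        String text = "31/Aug/2015:00:04:37 +0800";
        System.out.println(convert(text));      // prints 20150831000437
        System.out.println(convertToUtc(text)); // prints 20150830160437
    }
}
```

Unlike the SimpleDateFormat version, this sketch rejects input that lacks the offset, so whether that stricter behavior is wanted depends on how clean the log data is.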

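Two ways to create a Hive custom function

Method 1

First add the jar to the Hive session, then define the function and give the fully qualified class name.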

hive (default)> add  jar /opt/cdhmoduels/data/hiveDateFunction.jar;  
create  temporary function hiveDateFunction as  'com.madman.hive.function.HiveDateFunction';
Test SQL:
hive (default)> select hiveDateFunction(time_local) from bf_log limit 10;

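Method 2

When creating the function, specify both the fully qualified class name and the path of the jar that contains the class. Here the jar is stored on HDFS, so the statement points directly at the HDFS path.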

create temporary function parseDate as 'com.madman.hive.function.HiveDateFunction' using jar 'hdfs://hadoop.madman.com:8020/jar/hiveDateFunction.jar';
Test SQL:
hive (default)> select parseDate(time_local) from bf_log limit 10;
