flink Transformation算子(更新中)

2024-06-03 02:28

本文主要是介绍flink Transformation算子(更新中),希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

flink Transformation算子部分

Transformation算子

map

该方法是将一个DataStream调用map方法返回一个新的DataStream。本质是将该DataStream中对应的每一条数据依次迭代出来,应用map方法传入的计算逻辑,返回一个新的DataStream。原来的DataStream中对应的每一条数据,与新生成的DataStream中数据是一一对应的,也可以说是存在着映射关系的。
package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class MapDemo  {public static void main(String[] args) throws  Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();DataStreamSource<String> dataSource = env.socketTextStream("192.168.25.62", 8899);dataSource.map(new QueryCategoryNameFromMySQLFunction()).print();env.execute();}
}
 package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple4;
import org.apache.flink.configuration.Configuration;import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;public class QueryCategoryNameFromMySQLFunction extends RichMapFunction<String, Tuple4<String, String, String, Double>> {private Connection connection;private PreparedStatement preparedStatement;@Overridepublic void open(Configuration parameters) throws Exception {connection=DriverManager.getConnection("jdbc:mysql://localhost:3306/dy_flink?characterEncoding=utf-8","root","");preparedStatement=connection.prepareStatement("select name from t_category where id = ?");super.open(parameters);}@Overridepublic Tuple4<String, String, String, Double> map(String s) throws Exception {String[] fields = s.split(",");String cid = fields[0];preparedStatement.setInt(1,Integer.parseInt(cid));ResultSet resultSet = preparedStatement.executeQuery();String name = "未知";while (resultSet.next()){name = resultSet.getString(1);}resultSet.close();return  Tuple4.of(fields[0], fields[1], name, Double.parseDouble(fields[3]));}@Overridepublic void close() throws Exception {if(connection!=null){connection.close();}if(preparedStatement!=null){preparedStatement.close();}}
}

flatMap扁平化映射(DataStream → DataStream)

- 该方法是将一个DataStream调用flatMap方法返回一个新的DataStream,本质上是将该DataStream中的对应的每一条数据依次迭代出来,应用flatMap方法传入的计算逻辑,返回一个新的DataStream。原来的DataStream中输入的一条数据经过flatMap方法传入的计算逻辑后,会返回零到多条数据。所谓的扁平化即将原来的数据压平,返回多条数据。
package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;public class FlatMapDemo1 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8889);SingleOutputStreamOperator<Tuple2<String, Integer>> flatMapData = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {@Overridepublic void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {String[] strings = value.split(" ");for (String string : strings) {out.collect(Tuple2.of(string, 1));}}});flatMapData.print();env.execute("pxj");}
}
package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;public class FlatMapDemo2 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8887);SingleOutputStreamOperator<Tuple2<String, Integer>> myflatMap = source.transform("MyflatMap", TypeInformation.of(new TypeHint<Tuple2<String, Integer>>() {}), new MyflatMap());myflatMap.print();env.execute("pxj");}
}
package com.lyj.sx.flink.day03;import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;public  class MyflatMap extends AbstractStreamOperator<Tuple2<String, Integer>> implements OneInputStreamOperator<String, Tuple2<String, Integer>> {@Overridepublic void processElement(StreamRecord<String> element) throws Exception {String[] split = element.getValue().split(",");for (String s : split) {output.collect(element.replace(Tuple2.of(s,1)));}}@Overridepublic void setKeyContextElement(StreamRecord<String> record) throws Exception {System.out.println("StreamRecord..............");OneInputStreamOperator.super.setKeyContextElement(record);}
}

keyBy按key分区(DataStream → KeyedStream)

package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class KeyByDemo1 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8888);MapFunction<String, Tuple2<String, Integer>> mapFunction = new MapFunction<String, Tuple2<String, Integer>>() {Tuple2<String, Integer> t;@Overridepublic Tuple2<String, Integer> map(String s) throws Exception {String[] strings = s.split(" ");for (String string : strings) {t = Tuple2.of(string, 1);}return t;}};source.map(mapFunction).print();env.execute("pxj");}
}
 package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class KeyByDemo2 {public static void main(String[] args) throws  Exception {StreamExecutionEnvironment env= StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);SingleOutputStreamOperator<Tuple3<String, String, Double>> tpStream = source.map(new MapFunction<String, Tuple3<String, String, Double>>() {@Overridepublic Tuple3<String, String, Double> map(String s) throws Exception {String[] fields = s.split(",");return Tuple3.of(fields[0], fields[1], Double.parseDouble(fields[2]));}});KeyedStream<Tuple3<String, String, Double>, Tuple> tuple3TupleKeyedStream = tpStream.keyBy("f0", "f1");tuple3TupleKeyedStream.print();env.execute("pxj");}
}
package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class KeyByDemo3 {public static void main(String[] args)throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.map(new MapFunction<String, Tuple2<String, Integer>>() {Tuple2<String,Integer> t;@Overridepublic Tuple2<String, Integer> map(String s) throws Exception {for (String string : s.split(",")) {t=Tuple2.of(string,1);}return t;}});map.print();KeyedStream<Tuple2<String, Integer>, String> keyedStream = map.keyBy(new KeySelector<Tuple2<String, Integer>, String>() {@Overridepublic String getKey(Tuple2<String, Integer> value) throws Exception {return value.f0;}});keyedStream.print();env.execute("pxj");}
}
package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class KeyByDemo4 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);SingleOutputStreamOperator<Tuple3<String, String, Double>> mapped = source.map(new MapFunction<String, Tuple3<String, String, Double>>() {@Overridepublic Tuple3<String, String, Double> map(String s) throws Exception {String[] fields = s.split(",");return Tuple3.of(fields[0], fields[1], Double.parseDouble(fields[2]));}});mapped.keyBy(new KeySelector<Tuple3<String, String, Double>, String>() {@Overridepublic String getKey(Tuple3<String, String, Double> key) throws Exception {return key.f0+key.f1;}}).print();env.execute("pxj");}
}
 package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class KeyByDemo5 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);//山东省,济南市,5000SingleOutputStreamOperator<Tuple3<String, String, Double>> map = source.map(new MapFunction<String, Tuple3<String, String, Double>>() {@Overridepublic Tuple3<String, String, Double> map(String s) throws Exception {String[] fields = s.split(",");return Tuple3.of(fields[0], fields[1], Double.parseDouble(fields[2]));}});KeyedStream<Tuple3<String, String, Double>, Tuple2<String, String>> tuple3Tuple2KeyedStream = map.keyBy(t -> Tuple2.of(t.f0, t.f1), TypeInformation.of(new TypeHint<Tuple2<String, String>>() {}));tuple3Tuple2KeyedStream.print();env.execute("pxj");}
}
 package com.lyj.sx.flink.day03;import com.lyj.sx.flink.bean.DataBean;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;import java.beans.beancontext.BeanContext;public class KeyByDemo6 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);SingleOutputStreamOperator<DataBean> beanStream = source.map(new MapFunction<String, DataBean>() {@Overridepublic DataBean map(String s) throws Exception {String[] fields = s.split(",");return DataBean.of(fields[0], fields[1], Double.parseDouble(fields[2]));}});KeyedStream<DataBean, DataBean> keyedStream = beanStream.keyBy(t -> t);keyedStream.print();env.execute("pxj");}
}

filter过滤(DataStream → DataStream)

- 该方法是将一个DataStream调用filter方法返回一个新的DataStream,本质上是将该DataStream中的对应的每一条输入数据依次迭代出来,应用filter方法传入的过滤逻辑,返回一个新的DataStream。原来的DataStream中输入的一条数据经过fliter方法传入的过滤逻辑后,返回true就会保留这条数据,返回false就会过滤掉该数据。
 package com.lyj.sx.flink.day03;import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class FilterDemo1 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);SingleOutputStreamOperator<String> filter = source.filter(new FilterFunction<String>() {@Overridepublic boolean filter(String s) throws Exception {return s.startsWith("h");}}).setParallelism(2);filter.print();env.execute("pxj");}
}
 package com.lyj.sx.flink.day03;import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.operators.StreamFilter;public class FilterDemo2 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env= StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);SingleOutputStreamOperator<String> transform = source.transform("myfilter", TypeInformation.of(String.class), new StreamFilter<>(w -> w.startsWith("h")));transform.print();env.execute("pxj");}
}
package com.lyj.sx.flink.day03;import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class FilterDemo3 {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env= StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());DataStreamSource<String> source = env.socketTextStream("192.168.25.62", 8886);SingleOutputStreamOperator<String> transform = source.transform("MyFilterFunction", TypeInformation.of(String.class), new MyFilterFunction());transform.print();env.execute("pxj");}
}
 package com.lyj.sx.flink.day03;import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;public class MyFilterFunction extends AbstractStreamOperator<String> implements OneInputStreamOperator<String, String> {@Overridepublic void processElement(StreamRecord<String> element) throws Exception {String elementValue = element.getValue();if(elementValue.startsWith("h")){output.collect(element);}}@Overridepublic void setKeyContextElement(StreamRecord<String> record) throws Exception {OneInputStreamOperator.super.setKeyContextElement(record);System.out.println("setKeyContextElement.........");}
}

整理人:pxj_sx(潘陈)
日 期:2024-06-02 16:06:42

这篇关于flink Transformation算子(更新中)的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/1025777

相关文章

poj3468(线段树成段更新模板题)

题意:包括两个操作:1、将[a.b]上的数字加上v;2、查询区间[a,b]上的和 下面的介绍是下解题思路: 首先介绍  lazy-tag思想:用一个变量记录每一个线段树节点的变化值,当这部分线段的一致性被破坏我们就将这个变化值传递给子区间,大大增加了线段树的效率。 比如现在需要对[a,b]区间值进行加c操作,那么就从根节点[1,n]开始调用update函数进行操作,如果刚好执行到一个子节点,

hdu1394(线段树点更新的应用)

题意:求一个序列经过一定的操作得到的序列的最小逆序数 这题会用到逆序数的一个性质,在0到n-1这些数字组成的乱序排列,将第一个数字A移到最后一位,得到的逆序数为res-a+(n-a-1) 知道上面的知识点后,可以用暴力来解 代码如下: #include<iostream>#include<algorithm>#include<cstring>#include<stack>#in

hdu1689(线段树成段更新)

两种操作:1、set区间[a,b]上数字为v;2、查询[ 1 , n ]上的sum 代码如下: #include<iostream>#include<algorithm>#include<cstring>#include<stack>#include<queue>#include<set>#include<map>#include<stdio.h>#include<stdl

hdu 1754 I Hate It(线段树,单点更新,区间最值)

题意是求一个线段中的最大数。 线段树的模板题,试用了一下交大的模板。效率有点略低。 代码: #include <stdio.h>#include <string.h>#define TREE_SIZE (1 << (20))//const int TREE_SIZE = 200000 + 10;int max(int a, int b){return a > b ? a :

AI行业应用(不定期更新)

ChatPDF 可以让你上传一个 PDF 文件,然后针对这个 PDF 进行小结和提问。你可以把各种各样你要研究的分析报告交给它,快速获取到想要知道的信息。https://www.chatpdf.com/

GIS图形库更新2024.8.4-9.9

更多精彩内容请访问 dt.sim3d.cn ,关注公众号【sky的数孪技术】,技术交流、源码下载请添加微信:digital_twin123 Cesium 本期发布了1.121 版本。重大新闻,Cesium被Bentley收购。 ✨ 功能和改进 默认启用 MSAA,采样 4 次。若要关闭 MSAA,则可以设置scene.msaaSamples = 1。但是通过比较,发现并没有多大改善。

JavaFX应用更新检测功能(在线自动更新方案)

JavaFX开发的桌面应用属于C端,一般来说需要版本检测和自动更新功能,这里记录一下一种版本检测和自动更新的方法。 1. 整体方案 JavaFX.应用版本检测、自动更新主要涉及一下步骤: 读取本地应用版本拉取远程版本并比较两个版本如果需要升级,那么拉取更新历史弹出升级控制窗口用户选择升级时,拉取升级包解压,重启应用用户选择忽略时,本地版本标志为忽略版本用户选择取消时,隐藏升级控制窗口 2.

记录每次更新到仓库 —— Git 学习笔记 10

记录每次更新到仓库 文章目录 文件的状态三个区域检查当前文件状态跟踪新文件取消跟踪(un-tracking)文件重新跟踪(re-tracking)文件暂存已修改文件忽略某些文件查看已暂存和未暂存的修改提交更新跳过暂存区删除文件移动文件参考资料 咱们接着很多天以前的 取得Git仓库 这篇文章继续说。 文件的状态 不管是通过哪种方法,现在我们已经有了一个仓库,并从这个仓

消除安卓SDK更新时的“https://dl-ssl.google.com refused”异常的方法

消除安卓SDK更新时的“https://dl-ssl.google.com refused”异常的方法   消除安卓SDK更新时的“https://dl-ssl.google.com refused”异常的方法 [转载]原地址:http://blog.csdn.net/x605940745/article/details/17911115 消除SDK更新时的“

Flink任务重启策略

概述 Flink支持不同的重启策略,以在故障发生时控制作业如何重启集群在启动时会伴随一个默认的重启策略,在没有定义具体重启策略时会使用该默认策略。如果在工作提交时指定了一个重启策略,该策略会覆盖集群的默认策略默认的重启策略可以通过 Flink 的配置文件 flink-conf.yaml 指定。配置参数 restart-strategy 定义了哪个策略被使用。常用的重启策略: 固定间隔 (Fixe