Kafka-Spark Streaming Exception: dead for group td_topic_advert_impress_blacklist

2024-05-03 06:18

This article walks through the Kafka-Spark Streaming exception "dead for group td_topic_advert_impress_blacklist": how it shows up, how it was diagnosed, and how to fix it. Hopefully it is a useful reference for developers who hit the same problem.

 

While writing a Spark Streaming job recently, I ran into a rather puzzling problem. It showed up as follows:

 

Running the job locally against the Kafka cluster, the log kept repeating:

18/10/31 17:42:58 INFO AbstractCoordinator: Discovered coordinator kafka1:9092 (id: 2147483574 rack: null) for group td_topic_advert_impress_blacklist.
18/10/31 17:43:00 INFO AbstractCoordinator: Marking the coordinator kafka1:9092 (id: 2147483574 rack: null) dead for group td_topic_advert_impress_blacklist


 

From these log lines alone it is very hard to pinpoint the cause, and a lot of searching (Baidu, GitHub, ...) did not turn up an answer.

Switching the Spark Streaming log level from INFO to DEBUG finally exposed the root cause.

 

The log level is set with the following code:

        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
        javaSparkContext.setLogLevel("DEBUG");
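
For context, here is a minimal sketch of what such a spark-streaming-kafka-0-10 job might look like with the DEBUG log level enabled. The bootstrap IPs, group id and topic name are taken from the log output below; the class name, master URL, batch interval and offset settings are illustrative assumptions, not the original job's exact code.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class BlacklistStreamingJob {                        // hypothetical class name
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf()
                .setMaster("local[2]")                      // running locally, as in this article
                .setAppName("td_topic_advert_impress_blacklist");
        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
        javaSparkContext.setLogLevel("DEBUG");              // raise verbosity while troubleshooting
        JavaStreamingContext jssc =
                new JavaStreamingContext(javaSparkContext, Durations.seconds(10)); // assumed batch interval

        // Consumer settings reconstructed from the DEBUG log: brokers by IP, group id, topic.
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "10.170.0.6:9092,10.170.0.7:9092,10.170.0.8:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "td_topic_advert_impress_blacklist");
        kafkaParams.put("auto.offset.reset", "latest");     // assumption
        kafkaParams.put("enable.auto.commit", false);       // assumption

        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("topic_advert_impress"), kafkaParams));

        // Placeholder action; the real job builds the blacklist from these records.
        stream.foreachRDD(rdd -> System.out.println("records in batch: " + rdd.count()));

        jssc.start();
        jssc.awaitTermination();
    }
}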

 

With DEBUG logging enabled, the problem revealed itself:

18/10/31 17:40:08 DEBUG KafkaConsumer: Starting the Kafka consumer
18/10/31 17:40:08 DEBUG Metadata: Updated cluster metadata version 1 to Cluster(id = null, nodes = [10.170.0.8:9092 (id: -3 rack: null), 10.170.0.7:9092 (id: -2 rack: null), 10.170.0.6:9092 (id: -1 rack: null)], partitions = [])
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name fetch-throttle-time
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name connections-closed:
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name connections-created:
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name bytes-sent-received:
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name bytes-sent:
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name bytes-received:
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name select-time:
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name io-time:
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name heartbeat-latency
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name join-latency
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name sync-latency
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name commit-latency
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name bytes-fetched
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name records-fetched
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name fetch-latency
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name records-lag
18/10/31 17:40:08 INFO AppInfoParser: Kafka version : 0.11.0.0
18/10/31 17:40:08 INFO AppInfoParser: Kafka commitId : cb8625948210849f
18/10/31 17:40:08 DEBUG KafkaConsumer: Kafka consumer created
18/10/31 17:40:08 DEBUG KafkaConsumer: Subscribed to topic(s): topic_advert_impress
18/10/31 17:40:08 DEBUG AbstractCoordinator: Sending GroupCoordinator request for group td_topic_advert_impress_blacklist to broker 10.170.0.6:9092 (id: -1 rack: null)
18/10/31 17:40:08 DEBUG NetworkClient: Initiating connection to node -1 at 10.170.0.6:9092.
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name node--1.bytes-sent
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name node--1.bytes-received
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name node--1.latency
18/10/31 17:40:08 DEBUG Selector: Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node -1
18/10/31 17:40:08 DEBUG NetworkClient: Completed connection to node -1.  Fetching API versions.
18/10/31 17:40:08 DEBUG NetworkClient: Initiating API versions fetch from node -1.
18/10/31 17:40:08 DEBUG NetworkClient: Initialize connection to node -3 for sending metadata request
18/10/31 17:40:08 DEBUG NetworkClient: Initiating connection to node -3 at 10.170.0.8:9092.
18/10/31 17:40:08 DEBUG NetworkClient: Recorded API versions for node -1: (Produce(0): 0 to 3 [usable: 3], Fetch(1): 0 to 5 [usable: 5], Offsets(2): 0 to 2 [usable: 2], Metadata(3): 0 to 4 [usable: 4], LeaderAndIsr(4): 0 [usable: 0], StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 3 [usable: 3], ControlledShutdown(7): 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable: 3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1 [usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1 [usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1 [usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1 [usable: 1], SaslHandshake(17): 0 [usable: 0], ApiVersions(18): 0 to 1 [usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1 [usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0 [usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0], AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0], EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0], TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0], CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0], DescribeConfigs(32): 0 [usable: 0], AlterConfigs(33): 0 [usable: 0])
18/10/31 17:40:08 DEBUG NetworkClient: Sending metadata request (type=MetadataRequest, topics=topic_advert_impress) to node -1
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name node--3.bytes-sent
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name node--3.bytes-received
18/10/31 17:40:08 DEBUG Metrics: Added sensor with name node--3.latency
18/10/31 17:40:08 DEBUG Selector: Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node -3
18/10/31 17:40:08 DEBUG NetworkClient: Completed connection to node -3.  Fetching API versions.
18/10/31 17:40:08 DEBUG NetworkClient: Initiating API versions fetch from node -3.
18/10/31 17:40:08 DEBUG Metadata: Updated cluster metadata version 2 to Cluster(id = -dsyi_nCTMGfFcaeYx-9Wg, nodes = [kafka3:9092 (id: 72 rack: null), kafka2:9092 (id: 71 rack: null), kafka1:9092 (id: 73 rack: null)], partitions = [Partition(topic = topic_advert_impress, partition = 3, leader = 72, replicas = [72,71,73], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 2, leader = 71, replicas = [71,72,73], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 5, leader = 71, replicas = [71,73,72], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 4, leader = 73, replicas = [73,72,71], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 1, leader = 73, replicas = [73,71,72], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 0, leader = 72, replicas = [72,73,71], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 7, leader = 73, replicas = [73,71,72], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 6, leader = 72, replicas = [72,73,71], isr = [71,72,73]), Partition(topic = topic_advert_impress, partition = 8, leader = 71, replicas = [71,72,73], isr = [71,72,73])])
18/10/31 17:40:08 DEBUG NetworkClient: Recorded API versions for node -3: (Produce(0): 0 to 3 [usable: 3], Fetch(1): 0 to 5 [usable: 5], Offsets(2): 0 to 2 [usable: 2], Metadata(3): 0 to 4 [usable: 4], LeaderAndIsr(4): 0 [usable: 0], StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 3 [usable: 3], ControlledShutdown(7): 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable: 3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1 [usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1 [usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1 [usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1 [usable: 1], SaslHandshake(17): 0 [usable: 0], ApiVersions(18): 0 to 1 [usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1 [usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0 [usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0], AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0], EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0], TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0], CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0], DescribeConfigs(32): 0 [usable: 0], AlterConfigs(33): 0 [usable: 0])
18/10/31 17:40:08 DEBUG AbstractCoordinator: Received GroupCoordinator response ClientResponse(receivedTimeMs=1540978808909, latencyMs=125, disconnected=false, requestHeader={api_key=10,api_version=1,correlation_id=0,client_id=consumer-1}, responseBody=FindCoordinatorResponse(throttleTimeMs=0, errorMessage='null', error=NONE, node=kafka1:9092 (id: 73 rack: null))) for group td_topic_advert_impress_blacklist
18/10/31 17:40:08 INFO AbstractCoordinator: Discovered coordinator kafka1:9092 (id: 2147483574 rack: null) for group td_topic_advert_impress_blacklist.
18/10/31 17:40:08 DEBUG NetworkClient: Initiating connection to node 2147483574 at kafka1:9092.
18/10/31 17:40:20 DEBUG NetworkClient: Error connecting to node 2147483574 at kafka1:9092:
java.io.IOException: Can't resolve address: kafka1:9092
    at org.apache.kafka.common.network.Selector.connect(Selector.java:195)
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:762)
    at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:224)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.tryConnect(ConsumerNetworkClient.java:462)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$GroupCoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:598)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$GroupCoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:579)
    at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
    at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
    at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:488)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:348)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:184)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:214)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:200)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
    at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.paranoidPoll(DirectKafkaInputDStream.scala:165)
    at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.start(DirectKafkaInputDStream.scala:243)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$start$7.apply(DStreamGraph.scala:54)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$start$7.apply(DStreamGraph.scala:54)
    at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach_quick(ParArray.scala:143)
    at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach(ParArray.scala:136)
    at scala.collection.parallel.ParIterableLike$Foreach.leaf(ParIterableLike.scala:972)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:49)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:51)
    at scala.collection.parallel.ParIterableLike$Foreach.tryLeaf(ParIterableLike.scala:969)
    at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:152)
    at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:443)
    at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.nio.channels.UnresolvedAddressException
    at sun.nio.ch.Net.checkAddress(Net.java:123)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
    at org.apache.kafka.common.network.Selector.connect(Selector.java:192)
    ... 37 more
18/10/31 17:40:20 INFO AbstractCoordinator: Marking the coordinator kafka1:9092 (id: 2147483574 rack: null) dead for group td_topic_advert_impress_blacklist
18/10/31 17:40:20 DEBUG AbstractCoordinator: Sending GroupCoordinator request for group td_topic_advert_impress_blacklist to broker kafka3:9092 (id: 72 rack: null)

 

That pins down the real problem:

java.io.IOException: Can't resolve address: kafka1:9092
    at org.apache.kafka.common.network.Selector.connect(Selector.java:195)
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:762)
    at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:224)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.tryConnect(Co...

 

As the stack trace shows, this is a hostname resolution problem: the FindCoordinator response names the group coordinator as kafka1:9092, a hostname the local development machine cannot resolve, so the client immediately marks the coordinator dead and keeps retrying.

The fix is simply to map the broker hostnames to their IPs in the hosts file on the local Windows machine.
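
Assuming the three brokers seen in the metadata log (kafka1, kafka2, kafka3 at 10.170.0.6-10.170.0.8), the entries would look roughly like the sketch below. The IP-to-hostname pairing shown here is only an illustration; take the real mapping from your own cluster (for example from the brokers' advertised listeners).

# C:\Windows\System32\drivers\etc\hosts
# Illustrative mapping only -- confirm which IP belongs to which broker on your cluster.
10.170.0.6    kafka1
10.170.0.7    kafka2
10.170.0.8    kafka3

The same entries are needed on any machine that runs the driver or executors and cannot already resolve the broker hostnames; on Linux or macOS the equivalent file is /etc/hosts.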

 

That wraps up this look at the Kafka-Spark Streaming exception "dead for group td_topic_advert_impress_blacklist"; hopefully it saves someone else the same round of debugging.



