A Look at PowerJob's Per-Machine Thread Concurrency

2024-03-12 13:36

This article looks at PowerJob's per-machine thread concurrency setting (threadConcurrency).

threadConcurrency

powerjob-worker/src/main/java/tech/powerjob/worker/pojo/model/InstanceInfo.java

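@Data
public class InstanceInfo implements Serializable {

    /**
     * Basic information
     */
    private Long jobId;
    private Long instanceId;
    private Long wfInstanceId;

    /**
     * Task processor information
     */
    // Execution type: standalone, broadcast, MR
    private String executeType;
    // Processor type (JavaBean, Jar, script, etc.)
    private String processorType;
    // Processor info
    private String processorInfo;
    // Time expression type
    private int timeExpressionType;

    /**
     * Timeout
     */
    // Overall timeout for the whole job instance
    private long instanceTimeoutMS;

    /**
     * Job runtime parameters
     */
    // Job-level parameters, analogous to a class's static variables
    private String jobParams;
    // Instance-level parameters, analogous to a class's instance variables
    private String instanceParams;
    // Upper limit on processing threads per machine
    private int threadConcurrency;
    // Retry count for subtasks (retries of the job itself are controlled by the server)
    private int taskRetryNum;

    private String logConfig;
}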

InstanceInfo defines threadConcurrency, the upper limit on processing threads per machine.

maxDispatchNum

powerjob-worker/src/main/java/tech/powerjob/worker/core/tracker/task/heavy/HeavyTaskTracker.java

    /**
     * Periodically scans the tasks in the database (at most 100 per query, to limit memory usage)
     * and dispatches the ones that need to run
     */
    protected class Dispatcher implements Runnable {

        // Database query limit: maximum number of tasks fetched per query
        private static final int DB_QUERY_LIMIT = 100;

        @Override
        public void run() {

            if (finished.get()) {
                return;
            }

            Stopwatch stopwatch = Stopwatch.createStarted();

            // 1. Get the ProcessorTrackers that tasks can be dispatched to
            List<String> availablePtIps = ptStatusHolder.getAvailableProcessorTrackers();

            // 2. No ProcessorTracker available, skip this round
            if (availablePtIps.isEmpty()) {
                log.debug("[TaskTracker-{}] no available ProcessorTracker now.", instanceId);
                return;
            }

            // 3. Dispatch in batches to avoid large queries
            long currentDispatchNum = 0;
            long maxDispatchNum = availablePtIps.size() * instanceInfo.getThreadConcurrency() * 2L;
            AtomicInteger index = new AtomicInteger(0);

            // 4. Query the database in a loop for tasks that need dispatching
            while (maxDispatchNum > currentDispatchNum) {

                int dbQueryLimit = Math.min(DB_QUERY_LIMIT, (int) maxDispatchNum);
                List<TaskDO> needDispatchTasks = taskPersistenceService.getTaskByStatus(instanceId, TaskStatus.WAITING_DISPATCH, dbQueryLimit);
                currentDispatchNum += needDispatchTasks.size();

                needDispatchTasks.forEach(task -> {
                    // Get the ProcessorTracker address; if the task carries its own address, use that one
                    String ptAddress = task.getAddress();
                    if (StringUtils.isEmpty(ptAddress) || RemoteConstant.EMPTY_ADDRESS.equals(ptAddress)) {
                        ptAddress = availablePtIps.get(index.getAndIncrement() % availablePtIps.size());
                    }
                    dispatchTask(task, ptAddress);
                });

                // Stop looping when fewer tasks than the limit came back, or the query failed
                if (needDispatchTasks.size() < dbQueryLimit) {
                    break;
                }
            }

            log.debug("[TaskTracker-{}] dispatched {} tasks,using time {}.", instanceId, currentDispatchNum, stopwatch.stop());
        }
    }

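The Dispatcher computes maxDispatchNum as availablePtIps.size() * instanceInfo.getThreadConcurrency() * 2L, that is, twice the total thread concurrency across all available ProcessorTrackers. Tasks that carry no address of their own are then dispatched round-robin via availablePtIps.get(index.getAndIncrement() % availablePtIps.size()).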
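To make the arithmetic concrete, here is a minimal sketch of the budget formula and the round-robin pick. The class name DispatchBudgetSketch, its method names, and the sample addresses are invented for illustration and are not part of PowerJob; only the formula and the index.getAndIncrement() % size selection come from the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: mirrors the budget formula and round-robin pick quoted above. */
final class DispatchBudgetSketch {

    /** Per-round dispatch budget: trackers * threadConcurrency * 2. */
    static long maxDispatchNum(int availableTrackers, int threadConcurrency) {
        return availableTrackers * (long) threadConcurrency * 2L;
    }

    /** Assigns taskCount address-less tasks to trackers in round-robin order. */
    static List<String> roundRobin(List<String> trackers, int taskCount) {
        AtomicInteger index = new AtomicInteger(0);
        List<String> assigned = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            assigned.add(trackers.get(index.getAndIncrement() % trackers.size()));
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Hypothetical tracker addresses, for demonstration only.
        List<String> trackers = Arrays.asList("10.0.0.1:27777", "10.0.0.2:27777", "10.0.0.3:27777");
        System.out.println(maxDispatchNum(trackers.size(), 5)); // 3 * 5 * 2 = 30
        System.out.println(roundRobin(trackers, 4));            // fourth task wraps back to the first tracker
    }
}
```

With 3 available trackers and threadConcurrency = 5, one dispatch round has a budget of 30 tasks, and since 30 is below DB_QUERY_LIMIT, a single query of at most 30 rows covers it.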

ProcessorTracker

powerjob-worker/src/main/java/tech/powerjob/worker/core/tracker/processor/ProcessorTracker.java

calThreadPoolSize

    private int calThreadPoolSize() {
        ExecuteType executeType = ExecuteType.valueOf(instanceInfo.getExecuteType());
        ProcessorType processorType = ProcessorType.valueOf(instanceInfo.getProcessorType());

        // Script processors bring their own thread pool, but to keep the branching simple one thread is still allocated symbolically
        if (processorType == ProcessorType.PYTHON || processorType == ProcessorType.SHELL) {
            return 1;
        }

        if (executeType == ExecuteType.MAP_REDUCE || executeType == ExecuteType.MAP) {
            return instanceInfo.getThreadConcurrency();
        }
        if (TimeExpressionType.FREQUENT_TYPES.contains(instanceInfo.getTimeExpressionType())) {
            return instanceInfo.getThreadConcurrency();
        }
        return 2;
    }

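ProcessorTracker's calThreadPoolSize method picks the thread pool size from the ProcessorType, ExecuteType, and TimeExpressionType. ProcessorType.PYTHON and ProcessorType.SHELL return 1; ExecuteType.MAP_REDUCE, ExecuteType.MAP, and any type in TimeExpressionType.FREQUENT_TYPES return instanceInfo.getThreadConcurrency(); every other case returns 2. So threadConcurrency only takes effect for Map/MapReduce jobs and frequent-type schedules.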
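The branch order matters: the script check comes first, so a script processor gets one thread even when the job is a Map job. The sketch below restates that decision table in a self-contained form. The enums ProcessorKind and ExecuteKind and the boolean frequentType are simplified stand-ins invented here, not PowerJob's real types.

```java
/** Illustrative only: a self-contained restatement of the pool-size decision shown above. */
final class PoolSizeSketch {

    // Simplified stand-ins for PowerJob's ProcessorType and ExecuteType (assumed names).
    enum ProcessorKind { BUILT_IN, PYTHON, SHELL }
    enum ExecuteKind { STANDALONE, BROADCAST, MAP, MAP_REDUCE }

    /**
     * @param frequentType stands in for TimeExpressionType.FREQUENT_TYPES.contains(...)
     */
    static int poolSize(ProcessorKind processor, ExecuteKind execute, boolean frequentType, int threadConcurrency) {
        // Script processors are checked first, so they always get exactly one thread.
        if (processor == ProcessorKind.PYTHON || processor == ProcessorKind.SHELL) {
            return 1;
        }
        // Map / MapReduce jobs and frequent-type schedules honor threadConcurrency.
        if (execute == ExecuteKind.MAP_REDUCE || execute == ExecuteKind.MAP || frequentType) {
            return threadConcurrency;
        }
        // Everything else gets a fixed pool of two threads.
        return 2;
    }

    public static void main(String[] args) {
        System.out.println(poolSize(ProcessorKind.SHELL, ExecuteKind.MAP, true, 8));          // script wins: 1
        System.out.println(poolSize(ProcessorKind.BUILT_IN, ExecuteKind.MAP_REDUCE, false, 8)); // 8
        System.out.println(poolSize(ProcessorKind.BUILT_IN, ExecuteKind.STANDALONE, false, 8)); // 2
    }
}
```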

initThreadPool

    private static final int THREAD_POOL_QUEUE_MAX_SIZE = 128;

    private void initThreadPool() {

        int poolSize = calThreadPoolSize();
        // Pending-task queue; kept small so the in-memory queue does not put much pressure on memory
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(THREAD_POOL_QUEUE_MAX_SIZE);
        // Custom thread names in the pool (PowerJob Processor Pool -> PPP)
        ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("PPP-%d").build();
        // Rejection policy: throw an exception directly
        RejectedExecutionHandler rejectionHandler = new ThreadPoolExecutor.AbortPolicy();

        threadPool = new ThreadPoolExecutor(poolSize, poolSize, 60L, TimeUnit.SECONDS, queue, threadFactory, rejectionHandler);

        // Allow core threads to be reclaimed when idle (the pool may eventually hold zero live threads)
        threadPool.allowCoreThreadTimeOut(true);
    }

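initThreadPool creates an ArrayBlockingQueue with a capacity of 128 and uses AbortPolicy as the RejectedExecutionHandler, so a submission made while all pool threads are busy and the queue is full throws RejectedExecutionException. Core and maximum pool size are both set to the value returned by calThreadPoolSize.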
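A consequence of this configuration is that one ProcessorTracker can hold at most poolSize + 128 unfinished tasks at a time. The following sketch, using only java.util.concurrent, builds a pool with the same shape and counts how many blocking tasks it accepts before AbortPolicy kicks in. BoundedPoolSketch and countAccepted are names made up for this demonstration, and the default thread factory is used instead of the named one above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative only: same pool shape as initThreadPool, used to observe its capacity. */
final class BoundedPoolSketch {

    /** Submits `attempts` tasks that block until released and returns how many were accepted. */
    static int countAccepted(int poolSize, int queueCapacity, int attempts) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
        pool.allowCoreThreadTimeOut(true);

        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker thread busy until the experiment ends
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        int accepted = 0;
        try {
            for (int i = 0; i < attempts; i++) {
                try {
                    pool.submit(blocker);
                    accepted++;
                } catch (RejectedExecutionException e) {
                    // All threads are busy and the queue is full: AbortPolicy rejected the task.
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // 2 threads + 128 queue slots: 130 of 200 submissions are accepted.
        System.out.println(countAccepted(2, 128, 200));
    }
}
```

Because core and maximum size are equal, the executor never grows beyond poolSize threads; once those are occupied, new tasks go to the queue, and once the queue is full the next submission is rejected.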

submitTask

    public void submitTask(TaskDO newTask) {

        // Once the ProcessorTracker hits a fatal error, every task submitted here fails immediately, to prevent a deadlock
        // Deadlock analysis: the TT creates a PT, PT creation fails and it cannot report heartbeats; the TT receives no heartbeat
        // for a long time and considers the PT down (it really is), so it cannot pick an available PT to dispatch again, and a deadlock forms. GG T_T
        if (lethal) {
            ProcessorReportTaskStatusReq report = new ProcessorReportTaskStatusReq()
                    .setInstanceId(instanceId)
                    .setSubInstanceId(newTask.getSubInstanceId())
                    .setTaskId(newTask.getTaskId())
                    .setStatus(TaskStatus.WORKER_PROCESS_FAILED.getValue())
                    .setResult(lethalReason)
                    .setReportTime(System.currentTimeMillis());

            TransportUtils.ptReportTask(report, taskTrackerAddress, workerRuntime);
            return;
        }

        boolean success = false;
        // 1. Set fields and submit for execution
        newTask.setInstanceId(instanceInfo.getInstanceId());
        newTask.setAddress(taskTrackerAddress);

        HeavyProcessorRunnable heavyProcessorRunnable = new HeavyProcessorRunnable(instanceInfo, taskTrackerAddress, newTask, processorBean, omsLogger, statusReportRetryQueue, workerRuntime);
        try {
            threadPool.submit(heavyProcessorRunnable);
            success = true;
        } catch (RejectedExecutionException ignore) {
            log.warn("[ProcessorTracker-{}] submit task(taskId={},taskName={}) to ThreadPool failed due to ThreadPool has too much task waiting to process, this task will dispatch to other ProcessorTracker.",
                    instanceId, newTask.getTaskId(), newTask.getTaskName());
        } catch (Exception e) {
            log.error("[ProcessorTracker-{}] submit task(taskId={},taskName={}) to ThreadPool failed.", instanceId, newTask.getTaskId(), newTask.getTaskName(), e);
        }

        // 2. Reply that the task was received
        if (success) {
            ProcessorReportTaskStatusReq reportReq = new ProcessorReportTaskStatusReq();
            reportReq.setInstanceId(instanceId);
            reportReq.setSubInstanceId(newTask.getSubInstanceId());
            reportReq.setTaskId(newTask.getTaskId());
            reportReq.setStatus(TaskStatus.WORKER_RECEIVED.getValue());
            reportReq.setReportTime(System.currentTimeMillis());

            TransportUtils.ptReportTask(reportReq, taskTrackerAddress, workerRuntime);

            log.debug("[ProcessorTracker-{}] submit task(taskId={}, taskName={}) success, current queue size: {}.",
                    instanceId, newTask.getTaskId(), newTask.getTaskName(), threadPool.getQueue().size());
        }
    }

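submitTask builds a HeavyProcessorRunnable from the TaskDO and submits it to threadPool. If the submission throws, success stays false; only on success does it create a ProcessorReportTaskStatusReq with status WORKER_RECEIVED to report that the task was received. On RejectedExecutionException it logs the warning "[ProcessorTracker-{}] submit task(taskId={},taskName={}) to ThreadPool failed due to ThreadPool has too much task waiting to process, this task will dispatch to other ProcessorTracker." If the ProcessorTracker is already in the lethal state, it reports WORKER_PROCESS_FAILED right away and returns without submitting anything.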

onReceiveProcessorReportTaskStatusReq

powerjob-worker/src/main/java/tech/powerjob/worker/actors/TaskTrackerActor.java

    @Handler(path = WTT_HANDLER_REPORT_TASK_STATUS)
    public AskResponse onReceiveProcessorReportTaskStatusReq(ProcessorReportTaskStatusReq req) {

        int taskStatus = req.getStatus();
        // Only heavyweight tasks use the two-level task status reporting mechanism
        HeavyTaskTracker taskTracker = HeavyTaskTrackerManager.getTaskTracker(req.getInstanceId());

        // This happens when the TaskTracker has been stopped manually
        if (taskTracker == null) {
            log.warn("[TaskTrackerActor] receive ProcessorReportTaskStatusReq({}) but system can't find TaskTracker.", req);
            return null;
        }

        if (ProcessorReportTaskStatusReq.BROADCAST.equals(req.getCmd())) {
            taskTracker.broadcast(taskStatus == TaskStatus.WORKER_PROCESS_SUCCESS.getValue(), req.getSubInstanceId(), req.getTaskId(), req.getResult());
        }

        taskTracker.updateTaskStatus(req.getSubInstanceId(), req.getTaskId(), taskStatus, req.getReportTime(), req.getResult());

        // Update the workflow context
        taskTracker.updateAppendedWfContext(req.getAppendedWfContext());

        // Finished statuses require an acknowledgement reply
        if (TaskStatus.FINISHED_STATUS.contains(taskStatus)) {
            return AskResponse.succeed(null);
        }

        return null;
    }

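When TaskTrackerActor receives a ProcessorReportTaskStatusReq, it updates the task status via updateTaskStatus. If the status is one of FINISHED_STATUS, it acknowledges receipt with AskResponse.succeed(null); otherwise it returns null.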

TaskStatus

powerjob-worker/src/main/java/tech/powerjob/worker/common/constants/TaskStatus.java

@Getter
@AllArgsConstructor
public enum TaskStatus {

    WAITING_DISPATCH(1, "等待调度器调度"),                               // waiting to be dispatched
    DISPATCH_SUCCESS_WORKER_UNCHECK(2, "调度成功(但不保证worker收到)"),   // dispatched, receipt by the worker not guaranteed
    WORKER_RECEIVED(3, "worker接收成功,但未开始执行"),                    // received by the worker, not yet started
    WORKER_PROCESSING(4, "worker正在执行"),                              // worker is executing
    WORKER_PROCESS_FAILED(5, "worker执行失败"),                          // worker execution failed
    WORKER_PROCESS_SUCCESS(6, "worker执行成功");                         // worker execution succeeded

    public static final Set<Integer> FINISHED_STATUS = Sets.newHashSet(WORKER_PROCESS_FAILED.value, WORKER_PROCESS_SUCCESS.value);

    private final int value;
    private final String des;

    public static TaskStatus of(int v) {
        for (TaskStatus taskStatus : values()) {
            if (v == taskStatus.value) {
                return taskStatus;
            }
        }
        throw new IllegalArgumentException("no TaskStatus match the value of " + v);
    }
}

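The status column of the task_info table takes one of six values: WAITING_DISPATCH (waiting to be dispatched), DISPATCH_SUCCESS_WORKER_UNCHECK (dispatched, but receipt by the worker is not guaranteed), WORKER_RECEIVED (received by the worker but not yet started), WORKER_PROCESSING (the worker is executing), WORKER_PROCESS_FAILED, and WORKER_PROCESS_SUCCESS. The last two are the finished states.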

HeavyProcessorRunnable

powerjob-worker/src/main/java/tech/powerjob/worker/core/processor/runnable/HeavyProcessorRunnable.java

    public void run() {
        // Switch the thread context class loader (otherwise the Worker class loader is used, which has no container classes, and serialization/deserialization throws ClassNotFoundException)
        Thread.currentThread().setContextClassLoader(processorBean.getClassLoader());
        try {
            innerRun();
        } catch (InterruptedException ignore) {
            // ignore
        } catch (Throwable e) {
            reportStatus(TaskStatus.WORKER_PROCESS_FAILED, e.toString(), null, null);
            log.error("[ProcessorRunnable-{}] execute failed, please contact the author(@KFCFans) to fix the bug!", task.getInstanceId(), e);
        } finally {
            ThreadLocalStore.clear();
        }
    }

    public void innerRun() throws InterruptedException {

        final BasicProcessor processor = processorBean.getProcessor();

        String taskId = task.getTaskId();
        Long instanceId = task.getInstanceId();

        log.debug("[ProcessorRunnable-{}] start to run task(taskId={}&taskName={})", instanceId, taskId, task.getTaskName());
        ThreadLocalStore.setTask(task);
        ThreadLocalStore.setRuntimeMeta(workerRuntime);

        // 0. Build the task context
        WorkflowContext workflowContext = constructWorkflowContext();
        TaskContext taskContext = constructTaskContext();
        taskContext.setWorkflowContext(workflowContext);

        // 1. Report that execution has started
        reportStatus(TaskStatus.WORKER_PROCESSING, null, null, null);

        ProcessResult processResult;
        ExecuteType executeType = ExecuteType.valueOf(instanceInfo.getExecuteType());

        // 2. Special handling for the root task in broadcast mode
        if (TaskConstant.ROOT_TASK_NAME.equals(task.getTaskName()) && executeType == ExecuteType.BROADCAST) {
            // Broadcast: run preProcess on the local machine first; afterwards the TaskTracker generates subtasks for all Workers
            handleBroadcastRootTask(instanceId, taskContext);
            return;
        }

        // 3. Special handling for the last task (always on the same machine as the TaskTracker)
        if (TaskConstant.LAST_TASK_NAME.equals(task.getTaskName())) {
            handleLastTask(taskId, instanceId, taskContext, executeType);
            return;
        }

        // 4. Run the processor
        try {
            processResult = processor.process(taskContext);
            if (processResult == null) {
                processResult = new ProcessResult(false, "ProcessResult can't be null");
            }
        } catch (Throwable e) {
            log.warn("[ProcessorRunnable-{}] task(id={},name={}) process failed.", instanceId, taskContext.getTaskId(), taskContext.getTaskName(), e);
            processResult = new ProcessResult(false, e.toString());
        }

        reportStatus(processResult.isSuccess() ? TaskStatus.WORKER_PROCESS_SUCCESS : TaskStatus.WORKER_PROCESS_FAILED, suit(processResult.getMsg()), null, workflowContext.getAppendedContextData());
    }

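HeavyProcessorRunnable's run method delegates to innerRun. It ignores InterruptedException and reports any other Throwable as WORKER_PROCESS_FAILED. innerRun first reports WORKER_PROCESSING, then calls processor.process; it reports WORKER_PROCESS_SUCCESS if the result is successful and WORKER_PROCESS_FAILED otherwise, including when the processor throws or returns null.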
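The reporting order for an ordinary task (neither the broadcast root task nor the last task) can be sketched as below. This is a simplified model under stated assumptions: RunReportSketch, its Status enum, and the Supplier<Boolean> standing in for processor.process(taskContext) are all invented for illustration, and the list of statuses stands in for the real reportStatus calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Illustrative only: models the status sequence reported for an ordinary task. */
final class RunReportSketch {

    enum Status { WORKER_PROCESSING, WORKER_PROCESS_SUCCESS, WORKER_PROCESS_FAILED }

    /**
     * @param processor stand-in for processor.process(taskContext); true means success,
     *                  null models a null ProcessResult
     * @return the statuses in the order they would be reported
     */
    static List<Status> run(Supplier<Boolean> processor) {
        List<Status> reported = new ArrayList<>();

        // 1. Report that processing has started.
        reported.add(Status.WORKER_PROCESSING);

        // 2. Run the processor; a thrown exception or a null result both count as failure.
        boolean success;
        try {
            Boolean result = processor.get();
            success = result != null && result;
        } catch (Throwable e) {
            success = false;
        }

        // 3. Report the final status exactly once.
        reported.add(success ? Status.WORKER_PROCESS_SUCCESS : Status.WORKER_PROCESS_FAILED);
        return reported;
    }

    public static void main(String[] args) {
        System.out.println(run(() -> true));
        System.out.println(run(() -> { throw new IllegalStateException("boom"); }));
    }
}
```

In both cases the task ends in one of the two finished statuses, which is what lets the TaskTracker side acknowledge the report and stop waiting on it.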

Summary

PowerJob's InstanceInfo defines threadConcurrency, the upper limit on processing threads per machine.

  • HeavyTaskTracker computes maxDispatchNum as availablePtIps.size() * instanceInfo.getThreadConcurrency() * 2L, then dispatches tasks round-robin via availablePtIps.get(index.getAndIncrement() % availablePtIps.size()).
  • ProcessorTracker's calThreadPoolSize method picks the thread pool size from the ProcessorType, ExecuteType, and TimeExpressionType: ProcessorType.PYTHON and ProcessorType.SHELL return 1; ExecuteType.MAP_REDUCE, ExecuteType.MAP, and TimeExpressionType.FREQUENT_TYPES return instanceInfo.getThreadConcurrency(); everything else returns 2. initThreadPool creates an ArrayBlockingQueue of size 128 with AbortPolicy as the RejectedExecutionHandler, which throws RejectedExecutionException. submitTask builds a HeavyProcessorRunnable from the TaskDO and submits it to threadPool; if an exception occurs success stays false, and only on success does it create a ProcessorReportTaskStatusReq to report that the task was received.
  • When TaskTrackerActor receives a ProcessorReportTaskStatusReq, it updates the status via updateTaskStatus; for a status in FINISHED_STATUS it acknowledges with AskResponse.succeed(null).
  • HeavyProcessorRunnable's run method delegates to innerRun and reports any Throwable other than InterruptedException as WORKER_PROCESS_FAILED. innerRun first reports WORKER_PROCESSING, then calls processor.process, and reports WORKER_PROCESS_SUCCESS on success or WORKER_PROCESS_FAILED otherwise.




