Aeron: Aeron Tooling

2024-06-17 20:44
Tags: tooling, aeron


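1. Aeron Stat

Aeron Stat prints the key counters from Aeron, along with the positions and key counters of all active and recently active streams.

To use Aeron Stat, you must supply the Media Driver directory to inspect. For example, if the Media Driver context is configured as: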

final MediaDriver.Context mediaDriverCtx = new MediaDriver.Context().aeronDirectoryName("/dev/shm/md");

then the path is passed to AeronStat as follows:

java -cp aeron-all-*.jar -Daeron.dir=/dev/shm/md io.aeron.samples.AeronStat

Output (observing a running Archive Replication Client):

17:03:52 - Aeron Stat (CnC v0.2.0), pid 2771, heartbeat age 451ms
======================================================================
0:               60,704 - Bytes sent
1:              122,848 - Bytes received
2:                    0 - Failed offers to ReceiverProxy
3:                    0 - Failed offers to SenderProxy
4:                    0 - Failed offers to DriverConductorProxy
5:                    0 - NAKs sent
6:                    0 - NAKs received
7:                1,875 - Status Messages sent
8:                  941 - Status Messages received
9:                1,865 - Heartbeats sent
10:                3,610 - Heartbeats received
11:                    0 - Retransmits sent
12:                    0 - Flow control under runs
13:                    0 - Flow control over runs
14:                    0 - Invalid packets
15:                    0 - Errors
16:                    0 - Short sends
17:                    0 - Failed attempts to free log buffers
18:                    0 - Sender flow control limits, i.e. back-pressure events
19:                    0 - Unblocked Publications
20:                    0 - Unblocked Control Commands
21:                    0 - Possible TTL Asymmetry
22:                    0 - ControllableIdleStrategy status
23:                    0 - Loss gap fills
24:                    0 - Client liveness timeouts
25:                    0 - Resolution changes: driverName=null hostname=archive-client
26:          150,858,350 - Conductor max cycle time doing its work in ns: SHARED
27:                    0 - Conductor work cycle exceeded threshold count: threshold=1000000000ns SHARED
28:          149,104,126 - Sender max cycle time doing its work in ns: SHARED
29:                    0 - Sender work cycle exceeded threshold count: threshold=1000000000ns SHARED
30:          149,144,918 - Receiver max cycle time doing its work in ns: SHARED
31:                    0 - Receiver work cycle exceeded threshold count: threshold=1000000000ns SHARED
32:            1,838,850 - NameResolver max time in ns
33:                    0 - NameResolver exceeded threshold count
36:    1,692,637,432,558 - client-heartbeat: 1
52:                    1 - rcv-channel: aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=10.1.0.4:0 10.1.0.4:45494
53:                    1 - rcv-local-sockaddr: 52 10.1.0.4:45494
54:                    1 - snd-channel: aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=archive-backup:17000 10.1.0.4:33378
55:                    1 - snd-local-sockaddr: 54 10.1.0.4:33378
56:                  448 - pub-pos (sampled): 15 -1436025328 10 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=archive-backup:17000
57:               33,216 - pub-lmt: 15 -1436025328 10 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=archive-backup:17000
58:                  448 - snd-pos: 15 -1436025328 10 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=archive-backup:17000
59:               32,768 - snd-lmt: 15 -1436025328 10 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=archive-backup:17000
60:                    0 - snd-bpe: 15 -1436025328 10 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=archive-backup:17000
61:                  608 - sub-pos: 14 1817141198 20 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=10.1.0.4:0 @0
62:                  608 - rcv-hwm: 17 1817141198 20 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=10.1.0.4:0
63:                  608 - rcv-pos: 17 1817141198 20 aeron:udp?term-length=65536|sparse=true|mtu=1408|endpoint=10.1.0.4:0
64:                    1 - rcv-channel: aeron:udp?endpoint=10.1.0.4:0 10.1.0.4:33933
65:                    1 - rcv-local-sockaddr: 64 10.1.0.4:33933
66:                6,016 - sub-pos: 19 1817141199 200 aeron:udp?endpoint=10.1.0.4:0 @1280
67:                6,016 - rcv-hwm: 21 1817141199 200 aeron:udp?endpoint=10.1.0.4:0
68:                6,016 - rcv-pos: 21 1817141199 200 aeron:udp?endpoint=10.1.0.4:0
--
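The output above is plain text, so a monitoring script can pick out the heartbeat age from the top line and the id and value of each counter line. The following is a minimal sketch using only the JDK. The class and method names are hypothetical and not part of Aeron, and the regular expressions assume the line layout and the comma thousands separator shown in the sample output. The 1000 ms staleness threshold follows the guidance given for the top line in the Core Counters table.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aeron: parses lines of AeronStat text output.
final class AeronStatLines
{
    // Top line, e.g. "17:03:52 - Aeron Stat (CnC v0.2.0), pid 2771, heartbeat age 451ms"
    private static final Pattern HEARTBEAT_AGE = Pattern.compile("heartbeat age (\\d+)ms");

    // Counter line, e.g. "7:                1,875 - Status Messages sent"
    private static final Pattern COUNTER = Pattern.compile("^\\s*(\\d+):\\s+(-?[\\d,]+) - (.*)$");

    /** Returns the heartbeat age in milliseconds, or -1 if the line carries none. */
    static long heartbeatAgeMs(final String headerLine)
    {
        final Matcher m = HEARTBEAT_AGE.matcher(headerLine);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    /** Returns {counterId, value} for a counter line, or null for any other line. */
    static long[] counterIdAndValue(final String line)
    {
        final Matcher m = COUNTER.matcher(line);
        if (!m.matches())
        {
            return null;
        }
        final long id = Long.parseLong(m.group(1));
        // Assumes ',' is the grouping separator, as in the sample output.
        final long value = Long.parseLong(m.group(2).replace(",", ""));
        return new long[] {id, value};
    }

    public static void main(final String[] args)
    {
        final long ageMs = heartbeatAgeMs(
            "17:03:52 - Aeron Stat (CnC v0.2.0), pid 2771, heartbeat age 451ms");
        System.out.println("heartbeat age ms = " + ageMs + ", stale = " + (ageMs > 1000));

        final long[] errors = counterIdAndValue("15:                    0 - Errors");
        System.out.println("counter " + errors[0] + " = " + errors[1]);
    }
}
```

For a one-shot capture, AeronStat can be run with watch=false (see the options table further down) and its output fed line by line into helpers like these.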

Core Counters

Row | Description
Top line | The most important value here is the heartbeat age: the time elapsed since the last Media Driver heartbeat was written to cnc.dat. If this number is large (over 1000 ms), check that the Media Driver is still running.
0 | Total bytes sent over UDP by this Media Driver, excluding IP headers. If this is not increasing at the rate the application is expected to send, something is wrong.
1 | Total bytes received over UDP by this Media Driver, excluding IP headers. If this is not increasing at the rate the application is expected to receive, something is wrong.
2 | Failed offers to the Media Driver's Receiver Proxy. This indicates back pressure.
3 | Failed offers to the Media Driver's Sender Proxy. This indicates back pressure.
4 | Failed offers to the Media Driver's Conductor Proxy. This indicates back pressure.
5 | NAKs sent. The number of NAKs this Media Driver has sent to request lost packets.
6 | NAKs received. The number of NAKs this Media Driver has received asking it to resend lost packets to a remote Media Driver.
7 | Status Messages sent. A running count of the Status Messages this Media Driver has sent for flow control. This count should increase over time.
8 | Status Messages received. A running count of the Status Messages this Media Driver has received for flow control. This count should increase over time.
9 | Heartbeats sent. The number of heartbeats this Media Driver has sent to show liveness to another Media Driver when it had no data to send. This count should increase over time.
10 | Heartbeats received. The number of heartbeats this Media Driver has received from another Media Driver that had no data to send. This count should increase over time.
11 | Retransmits sent. The number of packet retransmits this Media Driver has sent as a result of a NAK message. This will typically stay zero or very low in a healthy network (and with well-behaved processes).
12 | Flow control under runs. The count of packets which under-run the current flow control window for Images.
13 | Flow control over runs. The count of packets which over-run the current flow control window for Images.
14 | Count of invalid packets received by this Media Driver.
15 | Count of errors observed by this Media Driver. ErrorStat (see below) will provide details.
16 | Short send count. A short send happens when the Media Driver's Sender agent expects to send a given buffer of data over the network, but the socket did not take all the data from the buffer. Typically, Aeron will recover from this. When this increases beyond a low number, it can be a complex problem to solve, with causes ranging from incorrect buffer sizing to network equipment failure. The first place to look is typically the settings for the network buffer sizes: aeron.socket.so_rcvbuf and aeron.socket.so_sndbuf. aeron.rcv.initial.window.length must be less than or equal to aeron.socket.so_rcvbuf. Correct sizing can be an art, and can be especially challenging in a network with a large RTT variance. See also Bandwidth Delay Product. Note: you may need to update the maximum socket buffer sizes in your operating system.
17 | The number of times the Media Driver could not free a log buffer.
18 | Total number of back-pressure events over all streams. See also Back pressure.
19 | Count of times a publication has been unblocked after a client failed to commit() or abort() a tryClaim within the timeout (see Publication TryClaim and Log Buffer Unblocking).
20 | Count of times a command has been unblocked after a client failed to complete an offer within a timeout.
21 | The number of times a channel endpoint detected a possible TTL asymmetry between its config and a connection.
23 | The number of times a loss gap has been filled when NAKs have been disabled.
24 | The number of Aeron clients that have timed out without a graceful close (as in Aeron clients of this Media Driver).
25 | The number of times the endpoints have been re-resolved (i.e. name resolution) resulting in a change.
26 | The maximum time taken in a conductor duty cycle, in nanoseconds. Found in Aeron 1.33.0+.
27 | The number of times the time spent in a conductor duty cycle exceeded a configurable threshold (1s default). Found in Aeron 1.33.0+.
28 | The maximum time taken in a sender duty cycle, in nanoseconds.
29 | The number of times the time spent in a sender duty cycle exceeded a configurable threshold (1s default).
30 | The maximum time taken in a receiver duty cycle, in nanoseconds.
31 | The number of times the time spent in a receiver duty cycle exceeded a configurable threshold (1s default).
32 | The maximum time taken for Name Resolution, in nanoseconds. Found in Aeron 1.42.0+.
33 | The number of times the time spent in Name Resolution exceeded a configurable threshold. Found in Aeron 1.42.0+.
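The short send row (16) points at socket buffer sizing and the Bandwidth Delay Product. As a rough illustration, the sketch below computes a bandwidth-delay product and checks the stated constraint that aeron.rcv.initial.window.length must not exceed aeron.socket.so_rcvbuf. The class name, method names, link speed, round-trip time and buffer sizes are all assumptions chosen for illustration, not recommended settings, and the check assumes both properties have been set explicitly.

```java
// Hypothetical sizing helper, not part of Aeron. All numbers in main() are illustrative.
final class SocketBufferSizing
{
    /** Bandwidth-delay product in bytes: bytes per second multiplied by the round-trip time. */
    static long bandwidthDelayProductBytes(final long bandwidthBitsPerSecond, final long rttMicros)
    {
        final long bytesPerSecond = bandwidthBitsPerSecond / 8;
        return bytesPerSecond * rttMicros / 1_000_000L;
    }

    /** The constraint from the table: initial window length <= SO_RCVBUF. */
    static boolean isWindowWithinReceiveBuffer(final long initialWindowLength, final long soRcvBuf)
    {
        return initialWindowLength <= soRcvBuf;
    }

    public static void main(final String[] args)
    {
        // Assumed link: 10 Gbit/s with a 1 ms round trip.
        final long bdp = bandwidthDelayProductBytes(10_000_000_000L, 1_000L);
        System.out.println("bandwidth-delay product bytes = " + bdp);

        // Assumed settings: a 2 MiB receive buffer and a window sized to the BDP.
        final long soRcvBuf = 2L * 1024 * 1024;
        System.out.println("window fits receive buffer = " + isWindowWithinReceiveBuffer(bdp, soRcvBuf));
    }
}
```

A window far smaller than the bandwidth-delay product can throttle a stream, while one larger than the receive socket buffer breaks the constraint above, so the two values are usually considered together.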

Variable Counters

Row | Description
36 in above example; varies | Epoch millisecond value of the last client heartbeat from the given client. The client in this context is the Aeron Client on the Media Driver.
52 in above example; varies | Receive channel
53 in above example; varies | Receive socket address
54 in above example; varies | Send channel
55 in above example; varies | Send socket address

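In the example above, rows 56 to 59, 61 to 63 and 66 to 68 contain position values. For more on how to read these values, see Understanding Aeron Position. Rows with an @, such as sub-pos in row 61, show the position at which the subscription joined. In this example, that subscription joined at position 0.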
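To work with the position rows programmatically, the label after the " - " can be split into its parts. The sketch below is a hypothetical helper, not an Aeron API. It assumes the label layout seen in the sample output, read as name, registration id, session id, stream id and channel, followed by an optional @ join position on sub-pos rows. The field names reflect that reading and are not taken from the source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for position counter labels as printed by AeronStat.
final class PositionLabel
{
    // e.g. "sub-pos: 14 1817141198 20 aeron:udp?endpoint=10.1.0.4:0 @0"
    private static final Pattern LABEL =
        Pattern.compile("^(.+?): (-?\\d+) (-?\\d+) (-?\\d+) (\\S+)(?: @(\\d+))?$");

    final String name;
    final long registrationId; // assumed meaning of the first number
    final int sessionId;
    final int streamId;
    final String channel;
    final long joinPosition;   // -1 when the label has no "@" suffix

    private PositionLabel(final Matcher m)
    {
        name = m.group(1);
        registrationId = Long.parseLong(m.group(2));
        sessionId = Integer.parseInt(m.group(3));
        streamId = Integer.parseInt(m.group(4));
        channel = m.group(5);
        joinPosition = m.group(6) == null ? -1L : Long.parseLong(m.group(6));
    }

    /** Returns the parsed label, or null when the text is not a position label. */
    static PositionLabel parse(final String label)
    {
        final Matcher m = LABEL.matcher(label.trim());
        return m.matches() ? new PositionLabel(m) : null;
    }

    public static void main(final String[] args)
    {
        final PositionLabel label = parse("sub-pos: 19 1817141199 200 aeron:udp?endpoint=10.1.0.4:0 @1280");
        // Row 66 in the sample shows a value of 6,016 for this counter.
        final long consumedSinceJoin = 6_016L - label.joinPosition;
        System.out.println(label.name + " stream " + label.streamId + " consumed since join: " + consumedSinceJoin);
    }
}
```

Channel and socket-address labels such as rcv-channel and rcv-local-sockaddr do not match this layout, so parse returns null for them.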

Note: There is a C version of the Aeron Stat tool. It's compiled and built with the C Media Driver. See C Media Driver.

AeronStat options

Arg | Description
-h | Shows the help text
watch=true or false | If set to true, refreshes every n seconds. If set to false, runs once and exits. Defaults to true.
delay=seconds | Specifies how often to refresh the output, as a delay in seconds between updates. Valid only if watch=true (or watch is not specified).
stream={regex} | Filters streams to only those that match the regex. Example: stream=101
type={regex} | Filters output type (as in the counter type) to only those that match
session={regex} | Filters sessions to only those that match
channel={regex} | Filters channels to only those that match
identity={regex} | Filters identity to only those that match

2. Error Stat

Error Stat prints all errors that have occurred in the Aeron process. As with AeronStat, you must point ErrorStat at the Media Driver directory.

java -cp aeron-all-*.jar -Daeron.dir=/dev/shm/md io.aeron.samples.ErrorStat

When everything is running as expected, Error Stat produces the following output:

0 distinct errors observed.

Note: There is a C version of the Error Stat tool. It's compiled and built with the C Media Driver. See C Media Driver.

3. Stream Stat

Stream Stat lives in the Aeron samples directory and can be launched from the aeron-all jar as shown below. As with AeronStat, you must point StreamStat at the Media Driver directory.

java -cp aeron-all-*.jar -Daeron.dir=/dev/shm/md io.aeron.samples.StreamStat

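Stream Stat gives a view of each stream in the Media Driver, including the publisher and sender views. The view is similar to Aeron Stat, except that it is flattened: each stream is printed on a single line, as in line 2 of the sample below.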

Command `n Control file /dev/shm/md/cnc.dat
sessionId=-1245628686 streamId=10 channel=aeron:udp?endpoint=localhost:40123 : pub-pos (sampled):3:320 pub-lmt:3:8388992 snd-pos:3:384 snd-lmt:3:131456 sub-pos:1:384 rcv-hwm:4:384 rcv-pos:4:384
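Because each stream is one flat line, the counters can be pulled into a map with little code. The sketch below is a hypothetical helper, not part of Aeron. It assumes the layout of the sample line above: a stream description, then " : ", then space-separated name:id:value triples. The meaning of the middle number is not stated in the text, so it is skipped here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one StreamStat output line.
final class StreamStatLine
{
    // A triple such as "snd-pos:3:384" or "pub-pos (sampled):3:320".
    private static final Pattern TRIPLE =
        Pattern.compile("([a-z-]+(?: \\(sampled\\))?):(-?\\d+):(-?\\d+)");

    /** Returns counter name -> value in the order printed, or an empty map if nothing matches. */
    static Map<String, Long> counters(final String line)
    {
        final Map<String, Long> result = new LinkedHashMap<>();
        final int separator = line.indexOf(" : ");
        if (separator < 0)
        {
            return result;
        }
        // Only scan the part after the stream description so the channel URI is ignored.
        final Matcher m = TRIPLE.matcher(line.substring(separator + 3));
        while (m.find())
        {
            result.put(m.group(1), Long.parseLong(m.group(3)));
        }
        return result;
    }

    public static void main(final String[] args)
    {
        final Map<String, Long> counters = counters(
            "sessionId=-1245628686 streamId=10 channel=aeron:udp?endpoint=localhost:40123 : " +
            "pub-pos (sampled):3:320 pub-lmt:3:8388992 snd-pos:3:384 snd-lmt:3:131456 " +
            "sub-pos:1:384 rcv-hwm:4:384 rcv-pos:4:384");
        // How far the receiver high-water mark is ahead of what the subscriber has consumed.
        System.out.println("unconsumed bytes = " + (counters.get("rcv-hwm") - counters.get("sub-pos")));
    }
}
```

Note that pub-pos is only sampled, which is why it can trail snd-pos in the sample line.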

4. Backlog Stat

Backlog Stat is a tool that highlights the backlog on data streams. It works on both IPC and UDP channels. As with AeronStat, you must point BacklogStat at the Media Driver directory.

java -cp aeron-all-*.jar -Daeron.dir=/dev/shm/md io.aeron.samples.BacklogStat

Sample output:

sessionId=1155221173 streamId=8 channel=aeron:udp?endpoint=10.1.1.1:4000 :
┌─for publisher 77 the last sampled position is 187392 (~0 bytes before back-pressure)
└─sender 77 has to send 0 bytes (2031779 bytes remaining in the sender window)
sessionId=-614368527 streamId=9 channel=aeron:udp?endpoint=10.1.1.1:4001 :
┌─for publisher 6333 the last sampled position is 12739208 (~0 bytes before back-pressure)
└─sender 6333 has to send 65373 bytes (2031779 bytes remaining in the sender window)

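The tool highlights data backlog problems on a given channel. In the sample run above, the top session has no backlog, while the bottom session has an outstanding backlog of 65373 bytes. Use this information to investigate network, process and/or design problems.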
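A script that watches for backlog can scan the sender lines of this output. The sketch below is a hypothetical helper, not part of Aeron, and relies only on the "sender N has to send M bytes" wording visible in the sample output. If the tool's wording differs between Aeron versions, the pattern would need adjusting.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scanner for BacklogStat text output.
final class BacklogScan
{
    private static final Pattern SENDER_BACKLOG =
        Pattern.compile("sender (-?\\d+) has to send (\\d+) bytes");

    /** Returns the sender backlog in bytes for a sender line, or -1 for any other line. */
    static long senderBacklogBytes(final String line)
    {
        final Matcher m = SENDER_BACKLOG.matcher(line);
        return m.find() ? Long.parseLong(m.group(2)) : -1L;
    }

    /** Sums the backlog over all sender lines, ignoring lines that are not sender lines. */
    static long totalBacklogBytes(final List<String> lines)
    {
        long total = 0;
        for (final String line : lines)
        {
            final long backlog = senderBacklogBytes(line);
            if (backlog > 0)
            {
                total += backlog;
            }
        }
        return total;
    }

    public static void main(final String[] args)
    {
        final List<String> lines = Arrays.asList(
            "sender 77 has to send 0 bytes (2031779 bytes remaining in the sender window)",
            "sender 6333 has to send 65373 bytes (2031779 bytes remaining in the sender window)");
        System.out.println("total backlog bytes = " + totalBacklogBytes(lines));
    }
}
```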

5. Loss Stat

LossStat records every data loss event that Aeron has experienced. Note that IPC data cannot be lost and so never appears in LossStat. As with AeronStat, you must point LossStat at the Media Driver directory.

java -cp aeron-all-*.jar -Daeron.dir=/dev/shm/md io.aeron.samples.LossStat

An example run:

#OBSERVATION_COUNT,TOTAL_BYTES_LOST,FIRST_OBSERVATION,LAST_OBSERVATION,SESSION_ID,STREAM_ID,CHANNEL,SOURCE
688,4167028,2020-08-16 13:53:39.053+0000,2020-08-16 13:53:41.003+0000,1155221173,8,aeron:udp?endpoint=10.1.1.1:4000;10.1.1.2:60950

This tells us the following about session 1155221173 on stream 8 of channel aeron:udp?endpoint=10.1.1.1:4000, for traffic from source 10.1.1.2:60950:

  • there were 688 data loss events
  • a total of 4,167,028 bytes were affected
  • the loss first happened at 2020-08-16 13:53:39.053+0000
  • the last loss happened at 2020-08-16 13:53:41.003+0000

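With this information you can investigate any network or host problems during those time periods. Note that a small amount of loss is fairly common.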
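The LossStat output is CSV-like, so the time window to investigate can be computed directly from a record. The sketch below is a hypothetical parser, not part of Aeron. It reads the first six fields by position and keeps the rest of the line as a single channel-and-source string, because the sample line separates channel and source with ";" while the header lists them as two columns. The timestamp pattern is inferred from the sample.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical parser for one LossStat data line (not the '#' header line).
final class LossRecord
{
    // Matches timestamps such as "2020-08-16 13:53:39.053+0000".
    private static final DateTimeFormatter TIMESTAMP =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSZ");

    final long observationCount;
    final long totalBytesLost;
    final OffsetDateTime firstObservation;
    final OffsetDateTime lastObservation;
    final int sessionId;
    final int streamId;
    final String channelAndSource;

    private LossRecord(final String[] fields)
    {
        observationCount = Long.parseLong(fields[0]);
        totalBytesLost = Long.parseLong(fields[1]);
        firstObservation = OffsetDateTime.parse(fields[2], TIMESTAMP);
        lastObservation = OffsetDateTime.parse(fields[3], TIMESTAMP);
        sessionId = Integer.parseInt(fields[4]);
        streamId = Integer.parseInt(fields[5]);
        channelAndSource = fields[6];
    }

    static LossRecord parse(final String line)
    {
        // Limit of 7 keeps everything after the sixth comma together.
        final String[] fields = line.split(",", 7);
        if (fields.length < 7)
        {
            throw new IllegalArgumentException("expected at least 7 fields: " + line);
        }
        return new LossRecord(fields);
    }

    /** Milliseconds between the first and last loss observation. */
    long lossWindowMs()
    {
        return Duration.between(firstObservation, lastObservation).toMillis();
    }

    public static void main(final String[] args)
    {
        final LossRecord record = parse(
            "688,4167028,2020-08-16 13:53:39.053+0000,2020-08-16 13:53:41.003+0000," +
            "1155221173,8,aeron:udp?endpoint=10.1.1.1:4000;10.1.1.2:60950");
        System.out.println(record.observationCount + " loss events over " + record.lossWindowMs() + " ms");
    }
}
```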

Note: There is a C version of the Loss Stat tool. It's compiled and built with the C Media Driver. See C Media Driver.

6. Log Inspector

Log Inspector lives in the Aeron samples folder and can be launched from the aeron-all jar as shown below. You must point Log Inspector directly at a LogBuffer file.

java -cp aeron-all-*.jar io.aeron.samples.LogInspector <logbuffer file>

Log Inspector lets us examine a Log Buffer file, including:

  • if the log buffer is connected
  • how many terms the log buffer has been through (see Log Buffers & Images)
  • the state of the 3 terms in the log buffer
  • the data within a term, dumped as hex. This includes details on which session and stream produced the data.
======================================================================
Thu Dec 31 09:46:19 EST 2020 Inspection dump for 3.logbuffer
======================================================================
Is Connected: true
Initial term id: -1822262504
Term Count: 20
Active index: 2
Term length: 67108864
MTU length: 1408
Page Size: 4096
EOS Position: 9223372036854775807
default DATA Header{frame-length=0 version=0 flags=11000000 type=1 term-offset=0 session-id=301746870 stream-id=10 term-id=-1822262504 reserved-value=0}
Index 0 Term Meta Data termOffset=67108928 termId=-1822262486 rawTail=-7826557782030548928 position=1275068416
Index 1 Term Meta Data termOffset=67108928 termId=-1822262485 rawTail=-7826557777735581632 position=1342177280
Index 2 Term Meta Data termOffset=1822720 termId=-1822262484 rawTail=-7826557773505900544 position=1344000000
======================================================================
Index 0 Term Data
0: DATA Header{frame-length=0 version=0 flags=00000000 type=0 term-offset=0 session-id=0 stream-id=0 term-id=0 reserved-value=0}
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
======================================================================
Index 1 Term Data
0: DATA Header{frame-length=0 version=0 flags=00000000 type=0 term-offset=0 session-id=0 stream-id=0 term-id=0 reserved-value=0}
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
======================================================================
Index 2 Term Data
0: DATA Header{frame-length=36 version=0 flags=11000000 type=1 term-offset=0 session-id=301746870 stream-id=10 term-id=-1822262484 reserved-value=0}
02004001
64: DATA Header{frame-length=36 version=0 flags=11000000 type=1 term-offset=64 session-id=301746870 stream-id=10 term-id=-1822262484 reserved-value=0}
03004001
...
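The position shown on each Term Meta Data line can be reproduced from the other values in the dump. The sketch below is a hypothetical helper that uses plain multiplication to show the arithmetic. It is not Aeron's own code, and capping the term offset at the term length is an assumption that happens to match the three positions printed above. It also checks that the active index equals the term count modulo the 3 terms.

```java
// Hypothetical arithmetic helper; reproduces the positions printed in the dump above.
final class LogPosition
{
    /** Position = completed terms * term length + offset within the current term. */
    static long computePosition(
        final int termId, final int termOffset, final int termLength, final int initialTermId)
    {
        // int subtraction keeps working if term ids wrap around.
        final long termCount = termId - initialTermId;
        // The recorded tail offset can exceed the term length once a term is full, so cap it.
        final long offsetInTerm = Math.min(termOffset, termLength);
        return termCount * termLength + offsetInTerm;
    }

    /** Index of the active term among the 3 terms of a log buffer. */
    static int activeIndex(final long termCount)
    {
        return (int)(termCount % 3);
    }

    public static void main(final String[] args)
    {
        final int initialTermId = -1822262504;
        final int termLength = 67108864;

        // Index 2 from the dump: termOffset=1822720 termId=-1822262484
        System.out.println(computePosition(-1822262484, 1822720, termLength, initialTermId));
        // Term Count: 20 from the dump
        System.out.println(activeIndex(20));
    }
}
```

For index 2 this gives 20 * 67108864 + 1822720 = 1344000000, matching the dump, and a term count of 20 gives an active index of 2.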

 

http://www.chinasem.cn/article/1070455
