(3) Kafka Monitoring: Streams Monitoring and Other


I. Preface

II. Kafka Streams Monitoring

2.7. RocksDB Metrics

2.8. Record Cache Metrics

III. Other

I. Preface

This article continues from the previous installment, "(2) Kafka Monitoring: Streams Monitoring", and picks up at section 2.7.

II. Kafka Streams Monitoring

2.7. RocksDB Metrics

Original text: RocksDB metrics are grouped into statistics-based metrics and properties-based metrics. The former are recorded from statistics that a RocksDB state store collects whereas the latter are recorded from properties that RocksDB exposes. Statistics collected by RocksDB provide cumulative measurements over time, e.g. bytes written to the state store. Properties exposed by RocksDB provide current measurements, e.g., the amount of memory currently used. Note that the store-scope for built-in RocksDB state stores are currently the following:

rocksdb-state (for RocksDB backed key-value store)
rocksdb-window-state (for RocksDB backed window store)
rocksdb-session-state (for RocksDB backed session store)

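Put differently, RocksDB metrics fall into two groups. Statistics-based metrics are recorded from the statistics that a RocksDB state store collects; they are cumulative measurements over time, such as the number of bytes written to the state store. Properties-based metrics are recorded from the properties that RocksDB exposes; they are point-in-time measurements, such as the amount of memory currently in use. The store scope of a built-in RocksDB state store, which appears as [store-scope] in the MBean names, currently takes one of the three values listed above, depending on whether the store is a key-value, window, or session store.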

Original text: RocksDB Statistics-based Metrics: All of the following statistics-based metrics have a recording level of debug because collecting statistics in RocksDB may have an impact on performance. Statistics-based metrics are collected every minute from the RocksDB state stores. If a state store consists of multiple RocksDB instances, as is the case for WindowStores and SessionStores, each metric reports an aggregation over the RocksDB instances of the state store.

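In short: every statistics-based metric below has a recording level of debug, because collecting statistics in RocksDB can affect performance. These metrics are collected from the RocksDB state stores once per minute. When a state store consists of multiple RocksDB instances, as WindowStores and SessionStores do, each metric reports an aggregate over the RocksDB instances of that state store.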

METRIC/ATTRIBUTE NAME	DESCRIPTION	MBEAN NAME
bytes-written-rate	The average number of bytes written per second to the RocksDB state store.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
bytes-written-total	The total number of bytes written to the RocksDB state store.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
bytes-read-rate	The average number of bytes read per second from the RocksDB state store.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
bytes-read-total	The total number of bytes read from the RocksDB state store.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
memtable-bytes-flushed-rate	The average number of bytes flushed per second from the memtable to disk.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
memtable-bytes-flushed-total	The total number of bytes flushed from the memtable to disk.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
memtable-hit-ratio	The ratio of memtable hits relative to all lookups to the memtable.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
memtable-flush-time-avg	The average duration of memtable flushes to disc in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
memtable-flush-time-min	The minimum duration of memtable flushes to disc in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
memtable-flush-time-max	The maximum duration of memtable flushes to disc in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
block-cache-data-hit-ratio	The ratio of block cache hits for data blocks relative to all lookups for data blocks to the block cache.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
block-cache-index-hit-ratio	The ratio of block cache hits for index blocks relative to all lookups for index blocks to the block cache.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
block-cache-filter-hit-ratio	The ratio of block cache hits for filter blocks relative to all lookups for filter blocks to the block cache.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
write-stall-duration-avg	The average duration of write stalls in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
write-stall-duration-total	The total duration of write stalls in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
bytes-read-compaction-rate	The average number of bytes read per second during compaction.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
bytes-written-compaction-rate	The average number of bytes written per second during compaction.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
compaction-time-avg	The average duration of disc compactions in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
compaction-time-min	The minimum duration of disc compactions in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
compaction-time-max	The maximum duration of disc compactions in ms.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
number-open-files	The number of current open files.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
number-file-errors-total	The total number of file errors occurred.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
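
Because the statistics-based metrics above are recorded at the debug level, they do not show up under the default info level. The following is a minimal sketch of a Streams configuration that raises the recording level. It uses plain string keys so that it compiles without the Kafka client library; "metrics.recording.level" is the key behind StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG. The class name, application id, and broker address are placeholders chosen for illustration.

```java
import java.util.Properties;

public class DebugMetricsConfig {

    // Builds Kafka Streams configuration properties with the metrics
    // recording level raised from the default INFO to DEBUG.
    public static Properties build(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", bootstrapServers);
        // Same key as StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG.
        props.put("metrics.recording.level", "DEBUG");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with your own application id and brokers.
        Properties props = build("rocksdb-metrics-demo", "localhost:9092");
        System.out.println(props.getProperty("metrics.recording.level"));
    }
}
```

The resulting Properties object would then be passed to the KafkaStreams constructor together with the topology. Keep in mind the performance caveat from the quoted text before enabling debug-level recording in production.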

Original text: RocksDB Properties-based Metrics: All of the following properties-based metrics have a recording level of info and are recorded when the metrics are accessed. If a state store consists of multiple RocksDB instances, as is the case for WindowStores and SessionStores, each metric reports the sum over all the RocksDB instances of the state store, except for the block cache metrics block-cache-*. The block cache metrics report the sum over all RocksDB instances if each instance uses its own block cache, and they report the recorded value from only one instance if a single block cache is shared among all instances.

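In short: every properties-based metric below has a recording level of info and is recorded at the moment the metric is accessed. When a state store consists of multiple RocksDB instances, as WindowStores and SessionStores do, each metric reports the sum over all RocksDB instances of that state store. The block cache metrics (block-cache-*) are the exception: they report the sum over all instances when each instance has its own block cache, and the value recorded from a single instance when one block cache is shared among all instances.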

METRIC/ATTRIBUTE NAME	DESCRIPTION	MBEAN NAME
num-immutable-mem-table	The number of immutable memtables that have not yet been flushed.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
cur-size-active-mem-table	The approximate size of the active memtable in bytes.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
cur-size-all-mem-tables	The approximate size of active and unflushed immutable memtables in bytes.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
size-all-mem-tables	The approximate size of active, unflushed immutable, and pinned immutable memtables in bytes.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
num-entries-active-mem-table	The number of entries in the active memtable.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
num-entries-imm-mem-tables	The number of entries in the unflushed immutable memtables.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
num-deletes-active-mem-table	The number of delete entries in the active memtable.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
num-deletes-imm-mem-tables	The number of delete entries in the unflushed immutable memtables.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
mem-table-flush-pending	This metric reports 1 if a memtable flush is pending, otherwise it reports 0.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
num-running-flushes	The number of currently running flushes.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
compaction-pending	This metric reports 1 if at least one compaction is pending, otherwise it reports 0.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
num-running-compactions	The number of currently running compactions.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
estimate-pending-compaction-bytes	The estimated total number of bytes a compaction needs to rewrite on disk to get all levels down to under target size (only valid for level compaction).	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
total-sst-files-size	The total size in bytes of all SST files.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
live-sst-files-size	The total size in bytes of all SST files that belong to the latest LSM tree.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
num-live-versions	Number of live versions of the LSM tree.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
block-cache-capacity	The capacity of the block cache in bytes.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
block-cache-usage	The memory size of the entries residing in block cache in bytes.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
block-cache-pinned-usage	The memory size for the entries being pinned in the block cache in bytes.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
estimate-num-keys	The estimated number of keys in the active and unflushed immutable memtables and storage.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
estimate-table-readers-mem	The estimated memory in bytes used for reading SST tables, excluding memory used in block cache.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
background-errors	The total number of background errors.	kafka.streams:type=stream-state-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),[store-scope]-id=([-.\w]+)
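
Every row in the two RocksDB tables shares the same MBean name pattern, and the only variable key is [store-scope]-id. As a rough sketch, the standard javax.management.ObjectName class can pull the store scope and store name out of such a name. The class StateStoreMBeanName, its helper methods, and the sample thread, task, and store names are all hypothetical and exist only for this illustration.

```java
import java.lang.management.ManagementFactory;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class StateStoreMBeanName {

    // Parses an MBean name, converting the checked exception into an unchecked one.
    public static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid MBean name: " + name, e);
        }
    }

    // Returns the store scope (for example "rocksdb-state"), taken from the key
    // that ends in "-id" and is neither thread-id nor task-id; null if absent.
    public static String storeScope(ObjectName name) {
        for (String key : name.getKeyPropertyList().keySet()) {
            if (key.endsWith("-id") && !key.equals("thread-id") && !key.equals("task-id")) {
                return key.substring(0, key.length() - "-id".length());
            }
        }
        return null;
    }

    // Returns the state store name stored under "<store-scope>-id"; null if absent.
    public static String storeName(ObjectName name) {
        String scope = storeScope(name);
        return scope == null ? null : name.getKeyProperty(scope + "-id");
    }

    public static void main(String[] args) {
        // Hypothetical MBean name that follows the pattern from the tables above.
        ObjectName sample = parse("kafka.streams:type=stream-state-metrics,"
                + "thread-id=demo-StreamThread-1,task-id=0_0,"
                + "rocksdb-window-state-id=my-window-store");
        System.out.println(storeScope(sample) + " / " + storeName(sample));

        // Inside a running Streams application, this pattern query lists the
        // state store MBeans; in a plain JVM the result is simply empty.
        ObjectName pattern = parse("kafka.streams:type=stream-state-metrics,*");
        System.out.println(ManagementFactory.getPlatformMBeanServer().queryNames(pattern, null));
    }
}
```

Individual values such as block-cache-usage could then be read with MBeanServer.getAttribute on each returned name.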

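2.8. Record Cache Metrics

Original text: All of the following metrics have a recording level of debug:

That is, the record cache metrics below are recorded at the debug level, so they do not appear under the default info recording level: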

METRIC/ATTRIBUTE NAME	DESCRIPTION	MBEAN NAME
hit-ratio-avg	The average cache hit ratio defined as the ratio of cache read hits over the total cache read requests.	kafka.streams:type=stream-record-cache-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),record-cache-id=([-.\w]+)
hit-ratio-min	The minimum cache hit ratio.	kafka.streams:type=stream-record-cache-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),record-cache-id=([-.\w]+)
hit-ratio-max	The maximum cache hit ratio.	kafka.streams:type=stream-record-cache-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),record-cache-id=([-.\w]+)

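III. Other

Original text: We recommend monitoring GC time and other stats and various server stats such as CPU utilization, I/O service time, etc. On the client side, we recommend monitoring the message/byte rate (global and per topic), request rate/size/time, and on the consumer side, max lag in messages among all partitions and min fetch request rate. For a consumer to keep up, max lag needs to be less than a threshold and min fetch rate needs to be larger than 0.

In summary: monitor GC time and other statistics together with server-level figures such as CPU utilization and I/O service time. On the client side, monitor the message and byte rates (globally and per topic) and the request rate, size, and time. On the consumer side, monitor the maximum lag in messages across all partitions and the minimum fetch request rate. A consumer is keeping up when its maximum lag stays below a chosen threshold and its minimum fetch rate stays above 0.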
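
The keep-up rule for consumers can be written as a small predicate. This is only a sketch: the class ConsumerHealth and its method are hypothetical, the threshold is something you have to choose for your own workload, and the inputs are assumed to be values you already collect (for example from the consumer's records-lag-max and fetch-rate metrics).

```java
public class ConsumerHealth {

    // Returns true when the consumer is keeping up: its maximum lag is below
    // the chosen threshold and its minimum fetch rate is greater than zero.
    public static boolean isKeepingUp(double maxLag, double minFetchRate, double lagThreshold) {
        return maxLag < lagThreshold && minFetchRate > 0.0;
    }

    public static void main(String[] args) {
        double lagThreshold = 10_000; // assumed threshold, tune per workload

        System.out.println(isKeepingUp(250, 4.2, lagThreshold));    // lag low, still fetching
        System.out.println(isKeepingUp(50_000, 4.2, lagThreshold)); // lag above threshold
        System.out.println(isKeepingUp(250, 0.0, lagThreshold));    // fetching has stopped
    }
}
```

A metric that reports NaN, which can happen before any samples exist, makes both comparisons false, so the check errs on the side of reporting a problem.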
