TiDB 排查分析Empty regions 较大原因

2023-11-05 05:51

本文主要是介绍TiDB 排查分析Empty regions 较大原因,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

问题背景:

通过granfan-->pd-->region health 图表监控到empty region 较多,且此集群目前业务并不多,数据量也很小,初始化时只创建了较表。

问题排查:

查看region merge相关参数:

max-merge-region-size:

##控制region merge的大小上限,当region大于指定值时pd不会将相邻的region合并(默认20)

max-merge-region-keys:

##控制region merge的key上限,当region大于指定值时pd不会将相邻的region合并(默认2000)

merge-schedule-limit:

##同时进行region merge调度的任务数,设置为则关闭region merge(默认值8)

如下,这些参数都是默认值,

$ pd-ctl config show max-merge-region-size |grep max-merge-region-size  

"max-merge-region-size": 20,   

$ pd-ctl config show max-merge-region-keys |grep max-merge-region-keys

    "max-merge-region-keys": 200000,

$ pd-ctl config show merge-schedule-limit |grep merge-schedule-limit

    "merge-schedule-limit": 8,

查看region的split和merge频率相关参数,

patrol-region-interval:

##控制replicaChecker检查region监控状态的运行频率,越短则运行越快(10ms)

split-merge-interval:

##控制对同一个region做split和merge的操作间隔,对应新split的region一段时间不会merge(默认1h)

$ pd-ctl config show patrol-region-interval |grep patrol

    "patrol-region-interval": "100ms",

$ pd-ctl config show split-merge-interval |grep split

    "split-merge-interval": "1h0m0s",

测试ddl时与分配的region是否会merge:

突然想到,空regions多的集群在ddl时使用了region预分配参数,查看相关表时只有1个region,在社区咨询老师说是被merge了。

创建测试表并预分配region,查看region数是指定的:

mysql>  create table guo004(id int, name varchar(10))  SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=4;

Query OK, 0 rows affected (0.51 sec)

mysql> show table guo004 regions;

+-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+

| REGION_ID | START_KEY                   | END_KEY                     | LEADER_ID | LEADER_STORE_ID | PEERS               | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS |

+-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+

|     25373 | t_437_                      | t_437_r_576460752303423488  |     25375 |               5 | 25374, 25375, 25376 |          0 |           341 |          0 |                    1 |                0 |

|     25377 | t_437_r_576460752303423488  | t_437_r_1152921504606846976 |     25379 |               5 | 25378, 25379, 25380 |          0 |            29 |          0 |                    1 |                0 |

|     25381 | t_437_r_1152921504606846976 | t_437_r_1729382256910270464 |     25383 |               5 | 25382, 25383, 25384 |          0 |             0 |          0 |                    1 |                0 |

|     25385 | t_437_r_1729382256910270464 | t_437_r_2305843009213693952 |     25387 |               5 | 25386, 25387, 25388 |          0 |             0 |          0 |                    1 |                0 |

|     25389 | t_437_r_2305843009213693952 | t_437_r_2882303761517117440 |     25391 |               5 | 25390, 25391, 25392 |          0 |            29 |          0 |                    1 |                0 |

|     25393 | t_437_r_2882303761517117440 | t_437_r_3458764513820540928 |     25395 |               5 | 25394, 25395, 25396 |          0 |             0 |          0 |                    1 |                0 |

|     25397 | t_437_r_3458764513820540928 | t_437_r_4035225266123964416 |     25399 |               5 | 25398, 25399, 25400 |          0 |            29 |          0 |                    1 |                0 |

|     25401 | t_437_r_4035225266123964416 | t_437_r_4611686018427387904 |     25403 |               5 | 25402, 25403, 25404 |          0 |             0 |          0 |                    1 |                0 |

|     25405 | t_437_r_4611686018427387904 | t_437_r_5188146770730811392 |     25407 |               5 | 25406, 25407, 25408 |          0 |            29 |          0 |                    1 |                0 |

|     25409 | t_437_r_5188146770730811392 | t_437_r_5764607523034234880 |     25411 |               5 | 25410, 25411, 25412 |          0 |             0 |          0 |                    1 |                0 |

|     25413 | t_437_r_5764607523034234880 | t_437_r_6341068275337658368 |     25415 |               5 | 25414, 25415, 25416 |          0 |            29 |          0 |                    1 |                0 |

|     25417 | t_437_r_6341068275337658368 | t_437_r_6917529027641081856 |     25419 |               5 | 25418, 25419, 25420 |          0 |            29 |          0 |                    1 |                0 |

|     25421 | t_437_r_6917529027641081856 | t_437_r_7493989779944505344 |     25423 |               5 | 25422, 25423, 25424 |          0 |             0 |          0 |                    1 |                0 |

|     25425 | t_437_r_7493989779944505344 | t_437_r_8070450532247928832 |     25427 |               5 | 25426, 25427, 25428 |          0 |            29 |          0 |                    1 |                0 |

|     25429 | t_437_r_8070450532247928832 | t_437_r_8646911284551352320 |     25431 |               5 | 25430, 25431, 25432 |          0 |             0 |          0 |                    1 |                0 |

|     25169 | t_437_r_8646911284551352320 |                             |     25171 |               5 | 25170, 25171, 25172 |          0 |             0 |          0 |                    1 |                0 |

+-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+

16 rows in set (0.01 sec)

检查预分配region是否merge

根据以上split-merge-interval参数,1小时后测试表的guo004空regions将被回收:

mysql> mysql> show table guo004 regions;

+-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+

| REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS               | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS |

+-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+

|     25425 | t_437_    |         |     25427 |               5 | 25426, 25427, 25428 |          0 |          1285 |          0 |                    1 |                0 |

+-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+

1 row in set (0.01 sec)

即使是ddl预分配的region,如果没有regions为空的话,都会自动merge.

查看空regions 信息:

参数说明:

approximate_size:估算的region的数据量大小,单位为MB

approximate_keys:估算的region内的key的个数

注意关注approximate_size、approximate_keys 的值。

查看所有空region信息:

$ ./pd-ctl region check empty-region |more

$ ./pd-ctl region check empty-region |more

{

  "count": 272,

  "regions": [

    {

      "id": 202,

      "start_key": "7480000000000000FF5C00000000000000F8",

      "end_key": "7480000000000000FF5E00000000000000F8",

      "epoch": {

        "conf_ver": 71639,

        "version": 43

      },

      "peers": [

        {

          "id": 7796112,

          "store_id": 21

        },

        {

          "id": 7952908,

          "store_id": 2

        },

        {

          "id": 7955094,

          "store_id": 19

        }

      ],

      "leader": {

        "id": 7796112,

        "store_id": 21

      },

      "written_bytes": 0,

      "read_bytes": 0,

      "written_keys": 0,

      "read_keys": 0,

      "approximate_size": 1,

      "approximate_keys": 0

    },

    {

      "id": 217800,

      "start_key": "7480000000000002FF8500000000000000F8",

      "end_key": "7480000000000002FF8700000000000000F8",

      "epoch": {

        "conf_ver": 71369,

        "version": 290

      },

      "peers": [

        {

          "id": 7932728,

          "store_id": 3

        },

        {

          "id": 7954710,

          "store_id": 1

        },

        {

          "id": 7955013,

          "store_id": 5

        }

      ],

      "leader": {

        "id": 7932728,

        "store_id": 3

      },

      "written_bytes": 0,

      "read_bytes": 0,

      "written_keys": 0,

      "read_keys": 0,

      "approximate_size": 1,

      "approximate_keys": 0

    },

--More--

查看单个region信息:

$ ./pd-ctl  region 217800

$ ./pd-ctl  region 217800

{

  "id": 217800,

  "start_key": "7480000000000002FF8500000000000000F8",

  "end_key": "7480000000000002FF8700000000000000F8",

  "epoch": {

    "conf_ver": 71369,

    "version": 290

  },

  "peers": [

    {

      "id": 7932728,

      "store_id": 3

    },

    {

      "id": 7954710,

      "store_id": 1

    },

    {

      "id": 7955013,

      "store_id": 5

    }

  ],

  "leader": {

    "id": 7932728,

    "store_id": 3

  },

  "written_bytes": 0,

  "read_bytes": 0,

  "written_keys": 0,

  "read_keys": 0,

  "approximate_size": 1,

  "approximate_keys": 0

}

或者通过系统表TIKV_REGION_STATUS查看:

(我个人更倾向于查系统表,比对的可视度更强些):

mysql> select REGION_ID,START_KEY,END_KEY,TABLE_ID,DB_NAME,TABLE_NAME,APPROXIMATE_SIZE,APPROXIMATE_KEYS from TIKV_REGION_STATUS where DB_NAME='dzdz' order by START_KEY limit 30;

+-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+

| REGION_ID | START_KEY                            | END_KEY                              | TABLE_ID | DB_NAME | TABLE_NAME               | APPROXIMATE_SIZE | APPROXIMATE_KEYS |

+-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+

|    217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 |      429 | dzdz    | qrtz_blob_triggers       |                1 |                0 |

|    217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 |      429 | dzdz    | qrtz_blob_triggers       |                1 |                0 |

|    217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 |      429 | dzdz    | qrtz_blob_triggers       |                1 |                0 |

|    217156 | 7480000000000001FFAF00000000000000F8 | 7480000000000001FFB100000000000000F8 |      431 | dzdz    | qrtz_calendars           |                1 |                0 |

|    217156 | 7480000000000001FFAF00000000000000F8 | 7480000000000001FFB100000000000000F8 |      431 | dzdz    | qrtz_calendars           |                1 |                0 |

|    217160 | 7480000000000001FFB100000000000000F8 | 7480000000000001FFB300000000000000F8 |      433 | dzdz    | qrtz_cron_triggers       |                1 |                0 |

|    217160 | 7480000000000001FFB100000000000000F8 | 7480000000000001FFB300000000000000F8 |      433 | dzdz    | qrtz_cron_triggers       |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 |      435 | dzdz    | qrtz_fired_triggers      |                1 |                0 |

|    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 |

|    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 |

|    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 |

|    217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 |      437 | dzdz    | qrtz_job_details         |                1 |                0 |

|    217172 | 7480000000000001FFB700000000000000F8 | 7480000000000001FFB900000000000000F8 |      439 | dzdz    | qrtz_locks               |                1 |                0 |

|    217172 | 7480000000000001FFB700000000000000F8 | 7480000000000001FFB900000000000000F8 |      439 | dzdz    | qrtz_locks               |                1 |                0 |

|    217176 | 7480000000000001FFB900000000000000F8 | 7480000000000001FFBB00000000000000F8 |      441 | dzdz    | qrtz_paused_trigger_grps |                1 |                0 |

|    217176 | 7480000000000001FFB900000000000000F8 | 7480000000000001FFBB00000000000000F8 |      441 | dzdz    | qrtz_paused_trigger_grps |                1 |                0 |

|    217180 | 7480000000000001FFBB00000000000000F8 | 7480000000000001FFBD00000000000000F8 |      443 | dzdz    | qrtz_scheduler_state     |               10 |            20343 |

|    217180 | 7480000000000001FFBB00000000000000F8 | 7480000000000001FFBD00000000000000F8 |      443 | dzdz    | qrtz_scheduler_state     |               10 |            20343 |

|    217184 | 7480000000000001FFBD00000000000000F8 | 7480000000000001FFBF00000000000000F8 |      445 | dzdz    | qrtz_simple_triggers     |                1 |                0 |

|    217184 | 7480000000000001FFBD00000000000000F8 | 7480000000000001FFBF00000000000000F8 |      445 | dzdz    | qrtz_simple_triggers     |                1 |                0 |

|    217188 | 7480000000000001FFBF00000000000000F8 | 7480000000000001FFC100000000000000F8 |      447 | dzdz    | qrtz_simprop_triggers    |                1 |                0 |

|    217188 | 7480000000000001FFBF00000000000000F8 | 7480000000000001FFC100000000000000F8 |      447 | dzdz    | qrtz_simprop_triggers    |                1 |                0 |

|    217192 | 7480000000000001FFC100000000000000F8 | 7480000000000001FFC300000000000000F8 |      449 | dzdz    | qrtz_triggers            |                1 |                0 |

+-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+

30 rows in set (0.01 sec)

mysql>

通过以上信息显示一个regions的END_KEY 是另一个region的START_KEY, 结合APPROXIMATE_SIZE和APPROXIMATE_KEYS字段的值,可知部分空region是连续的。

检查跨表合并是否启用

enable-cross-table-merge:

##设置是否开启跨表merge(默认v4.0:false,v6.0:true)

$  ./pd-ctl config show enable-cross-table-merge |grep cross

    "enable-cross-table-merge": "false",

$

如上跨表合并未启用

 

测试启用跨表合并是否能降低空region数:

检查当前空region数:

测试环境的empty region 也比较多,如下图:

 

启用跨表合并:

确认夸表合并未启用:

» config show

{

  "replication": {

    "enable-placement-rules": "true",

    "location-labels": "dc,host",

    "max-replicas": 3,

    "strictly-match-label": "false"

  },

  "schedule": {

    "enable-cross-table-merge": "false",

......

  }

}

启用跨表合并:

» config set enable-cross-table-merge true

Success!

确认跨表合并启用

» config show

{

  "replication": {

    "enable-placement-rules": "true",

    "location-labels": "dc,host",

    "max-replicas": 3,

    "strictly-match-label": "false"

  },

  "schedule": {

    "enable-cross-table-merge": "true",

......

  }

}

»  

启用跨表合并后空region数:

在grafana上看region health状态,empty region 数量已经为0:

 

问题处理:

启用跨表合并,可以减少或至0空region数据量

操作命令:$ ./pd-ctl config set enable-cross-table-merge true 即可。

综上所述:

通过granfan-->pd-->region health 图表监控到empty region 较多:

查看region merge大小相关参数:

max-merge-region-size:

##控制region merge的大小上限,当region大于指定值时pd不会将相邻的region合并(默认20)

max-merge-region-keys:

##控制region merge的key上限,当region大于指定值时pd不会将相邻的region合并(默认2000)

merge-schedule-limit:

##同时进行region merge调度的任务数,设置为则关闭region merge(默认值8)

如果以上参数设置太小或不合理可以适当调整让region加速merge

如果以上参数设置合理(一般默认值就可以),请检查region merge频率参数

查看region的split和merge频率相关参数,

patrol-region-interval:

##控制replicaChecker检查region监控状态的运行频率,越短则运行越快(10ms)

split-merge-interval:

##控制对同一个region做split和merge的操作间隔,对应新split的region一段时间不会merge(默认1h)

如果以上参数设置过大或不合理可适当调小,但也不建议过小。

如果设置合理(一般默认值就可以),请检查跨表合并参数。

检查跨表合并是否启用

enable-cross-table-merge:

##设置是否开启跨表merge(默认v4.0:false,v5.0及其后版本:true)

跨表合并参数v5.0及其后版本默认已经启用,使用v5.0及其后版本的应该不太会出现这个问题。

使用v4.0、小于v5.0的版本或enable-cross-table-merge 未启用情况下可能会出现empty region较多。

一般多DDL on table情况下, enable-cross-table-merge结果较为明显。

这篇关于TiDB 排查分析Empty regions 较大原因的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/347858

相关文章

怎样通过分析GC日志来定位Java进程的内存问题

《怎样通过分析GC日志来定位Java进程的内存问题》:本文主要介绍怎样通过分析GC日志来定位Java进程的内存问题,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录一、GC 日志基础配置1. 启用详细 GC 日志2. 不同收集器的日志格式二、关键指标与分析维度1.

Java进程异常故障定位及排查过程

《Java进程异常故障定位及排查过程》:本文主要介绍Java进程异常故障定位及排查过程,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录一、故障发现与初步判断1. 监控系统告警2. 日志初步分析二、核心排查工具与步骤1. 进程状态检查2. CPU 飙升问题3. 内存

MySQL中的表连接原理分析

《MySQL中的表连接原理分析》:本文主要介绍MySQL中的表连接原理分析,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录1、背景2、环境3、表连接原理【1】驱动表和被驱动表【2】内连接【3】外连接【4编程】嵌套循环连接【5】join buffer4、总结1、背景

python中Hash使用场景分析

《python中Hash使用场景分析》Python的hash()函数用于获取对象哈希值,常用于字典和集合,不可变类型可哈希,可变类型不可,常见算法包括除法、乘法、平方取中和随机数哈希,各有优缺点,需根... 目录python中的 Hash除法哈希算法乘法哈希算法平方取中法随机数哈希算法小结在Python中,

Java Stream的distinct去重原理分析

《JavaStream的distinct去重原理分析》Javastream中的distinct方法用于去除流中的重复元素,它返回一个包含过滤后唯一元素的新流,该方法会根据元素的hashcode和eq... 目录一、distinct 的基础用法与核心特性二、distinct 的底层实现原理1. 顺序流中的去重

SpringSecurity显示用户账号已被锁定的原因及解决方案

《SpringSecurity显示用户账号已被锁定的原因及解决方案》SpringSecurity中用户账号被锁定问题源于UserDetails接口方法返回值错误,解决方案是修正isAccountNon... 目录SpringSecurity显示用户账号已被锁定的解决方案1.问题出现前的工作2.问题出现原因各

关于MyISAM和InnoDB对比分析

《关于MyISAM和InnoDB对比分析》:本文主要介绍关于MyISAM和InnoDB对比分析,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录开篇:从交通规则看存储引擎选择理解存储引擎的基本概念技术原理对比1. 事务支持:ACID的守护者2. 锁机制:并发控制的艺

javax.net.ssl.SSLHandshakeException:异常原因及解决方案

《javax.net.ssl.SSLHandshakeException:异常原因及解决方案》javax.net.ssl.SSLHandshakeException是一个SSL握手异常,通常在建立SS... 目录报错原因在程序中绕过服务器的安全验证注意点最后多说一句报错原因一般出现这种问题是因为目标服务器

SpringBoot排查和解决JSON解析错误(400 Bad Request)的方法

《SpringBoot排查和解决JSON解析错误(400BadRequest)的方法》在开发SpringBootRESTfulAPI时,客户端与服务端的数据交互通常使用JSON格式,然而,JSON... 目录问题背景1. 问题描述2. 错误分析解决方案1. 手动重新输入jsON2. 使用工具清理JSON3.

MyBatis Plus 中 update_time 字段自动填充失效的原因分析及解决方案(最新整理)

《MyBatisPlus中update_time字段自动填充失效的原因分析及解决方案(最新整理)》在使用MyBatisPlus时,通常我们会在数据库表中设置create_time和update... 目录前言一、问题现象二、原因分析三、总结:常见原因与解决方法对照表四、推荐写法前言在使用 MyBATis