This article looks at how network interruptions affect a DM8 (Dameng 8) system.
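Test environment: three-node real-time primary/standby cluster
Version: --03134283938-20221019-172201-20018
Test 1
The confirm monitor is not running.
Disable the network interface on node 3.
Log in to node 1 and check the primary's status.
The status shows that archive logs are sent to node 2 successfully, but no messages arrive from node 3, and node 1 is suspended.
The log reports the following errors: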
2024-06-06 00:47:38.481 [INFO] database P0000002319 T0000000000000002373 Send archive log to remote instance failed, switch all ep to SUSPEND status success!
2024-06-06 00:47:48.482 [ERROR] database P0000002319 T0000000000000002356 Can't connect to DM server on '192.168.100.102' port(5800) errno(115)
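The connection error above can be reproduced from outside the database with a plain TCP probe against the same address and port. The following is a minimal sketch, not part of any DM8 tool: the function name `can_connect` and the 3-second timeout are illustrative assumptions. On Linux, errno 115 is EINPROGRESS, which usually points to a non-blocking connect that has not completed, so a probe that times out here would be consistent with the disabled network interface.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; the timeout value is an assumption.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example call, using the address and port from the log line above:
#   can_connect("192.168.100.102", 5800)
```

Running the probe from node 1 before and after re-enabling node 3's interface gives a quick check that is independent of the database's own status.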
Re-enable the network interface on node 3.
The primary's log then shows:
2024-06-06 00:58:00.760 [INFO] database P0000002319 T0000000000000002356 mal_site_ctl_link_create startup from mal_site(0) to mal_site(2)!
2024-06-06 00:58:00.760 [INFO] database P0000002319 T0000000000000002356 mal_site_magic_gen site_magic[46500], src_site:0, dst_site:2
2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002356 site[0] mal_site_ctl_port_set to site[2, IP: 192.168.100.102, port_num: 5800], socket handle = 12, site_magic = 46500
2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002350 mal_site_port_get site_magic:46500, src_site:0, dst_site:2
2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002349 mal_site_port_get site_magic:46500, src_site:0, dst_site:2
2024-06-06 00:58:00.768 [INFO] database P0000002319 T0000000000000002355 site[0] mal_site_data_port_set from site[2, IP: 192.168.100.102, port_num: 5800], socket handle = 14, site_magic = 46500
2024-06-06 00:58:00.769 [INFO] database P0000002319 T0000000000000002348 mal_site_port_get site_magic:46500, src_site:0, dst_site:2
2024-06-06 00:58:00.769 [INFO] database P0000002319 T0000000000000002351 mal_site_port_get site_magic:46500, src_site:0, dst_site:2
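Log excerpts like the one above are easier to filter once each line is split into fields. The sketch below assumes the layout seen in these excerpts: timestamp, level in brackets, a source word, then `P` and `T` numbers followed by the message. Reading `P` and `T` as process and thread identifiers is an assumption, and `LogEntry` and `parse_line` are hypothetical names.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern inferred from the log excerpts in this article, not from a format spec.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<level>[A-Z]+)\] (?P<source>\S+) "
    r"P(?P<pid>\d+) T(?P<tid>\d+) (?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: datetime
    level: str
    source: str
    pid: int      # assumed to be the process ID
    tid: int      # assumed to be the thread ID
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the pattern."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return LogEntry(
        timestamp=datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"),
        level=m["level"],
        source=m["source"],
        pid=int(m["pid"]),
        tid=int(m["tid"]),
        message=m["msg"],
    )


sample = (
    "2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002356 "
    "site[0] mal_site_ctl_port_set to site[2, IP: 192.168.100.102, port_num: 5800], "
    "socket handle = 12, site_magic = 46500"
)
entry = parse_line(sample)
```

With entries in this form, it is straightforward to keep only `ERROR` lines or only messages that mention a given site.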
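However, the primary's status is still SUSPEND.
After the database is restarted (a SHUTDOWN, after which the watcher brings it back up automatically), its status returns to normal.
Test 2
Start the confirm monitor on node 2.
Interrupt node 3's network.
Log in to the primary and check its status.
Although sending archive logs to TEST3 fails, the primary's status is normal.
The primary's log shows: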
2024-06-06 01:07:44.807 [ERROR] database P0000002774 T0000000000000002819 [mal recv for arch] mal receive from site(TEST3) failed, begin lsn:622386010, end lsn:622386010, code:-6021
2024-06-06 01:07:44.807 [ERROR] database P0000002774 T0000000000000002819 send realtime archive to instance[TEST3] failed, code = -6021, begin_lsn = 622386010, end_lsn = 622386010!
2024-06-06 01:07:44.811 [INFO] database P0000002774 T0000000000000002819 Send archive log to remote instance failed, switch all ep to SUSPEND status success!
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872 utsk_cmd_add, cmd info: cmd=217, dseq=1717631069, name_in=, begin_lsn=-1!
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872 utsk_set_global_dw_stat, begin, msg_dseq:1717631069
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872 set g_dw_stat from NONE to DW_FAILOVER success, g_dw_recover_stop is 0
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872 utsk_set_global_dw_stat, finished, msg_dseq:1717631069, set code:0
2024-06-06 01:07:47.269 [INFO] database P0000002774 T0000000000000002872 utsk_cmd_add, cmd info: cmd=214, dseq=1717631070, name_in=, begin_lsn=-1!
2024-06-06 01:07:47.269 [INFO] database P0000002774 T0000000000000002832 utsk_cmd_exec, cmd:214, sys_status:SUSPEND, dseq:1717631070
2024-06-06 01:07:47.270 [INFO] database P0000002774 T0000000000000002832 Change TEST3 arch status from VALID to INVALID
2024-06-06 01:07:47.270 [INFO] database P0000002774 T0000000000000002872 utsk_cmd_add, received sql exec cmd:1, dseq:1717631071, sql:ALTER DATABASE OPEN FORCE
The log shows that the primary was suspended and then returned to OPEN almost immediately.
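How short the suspension was can be read off the two timestamps in the log: the switch to SUSPEND at 01:07:44.811 and the receipt of `ALTER DATABASE OPEN FORCE` at 01:07:47.270. The sketch below only does that arithmetic. `elapsed_seconds` is a hypothetical helper, and the log records when the command was received, not when the open completed.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Timestamps copied from the Test 2 log lines.
suspended_at = "2024-06-06 01:07:44.811"   # switch all ep to SUSPEND
open_force_at = "2024-06-06 01:07:47.270"  # ALTER DATABASE OPEN FORCE received

gap = elapsed_seconds(suspended_at, open_force_at)
print(f"{gap:.3f} s")  # prints "2.459 s"
```

So in this run the primary received the forced-open command roughly 2.5 seconds after it was suspended.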
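Test 3
Start the confirm monitor on node 2.
Interrupt node 2's network.
Log in to the primary and check its status.
After the network is restored, node 2 has also become a primary, so the cluster has split (split-brain).
Logging in to the monitor shows the following:
Once the cluster has split, the only option is to rebuild it.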