A record of slow startup and shutdown on a 19c RAC

2023-11-09 15:36

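Background
On a 19c RAC used for testing on virtual machines, installing an RU took an acceptable amount of time on one node but a very long time on the other. (Most of that time went into stopping and starting the cluster.)
Routine RAC startups and shutdowns were also slow, and shutdowns in particular took a very long time.

-- When installing 19.19, node 1 took 78 minutes and node 2 took 50 minutes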
 

OPatchauto session completed at Tue Jun  6 16:59:30 2023
Time taken to complete the session 78 minutes, 26 seconds
[root@node19c01 grid]#

OPatchauto session completed at Tue Jun  6 15:38:29 2023
Time taken to complete the session 50 minutes, 11 seconds
[root@node19c02 psu]#

-- When installing 19.20, node 2 took 72 minutes

OPatchauto session completed at Thu Aug 17 10:15:58 2023
Time taken to complete the session 72 minutes, 13 seconds
[root@node19c02 bin]# 

-- When installing 19.21 (patch 35642822), node 1 took 43 minutes and node 2 took 223 minutes

OPatchauto session completed at Sat Nov  4 17:47:11 2023
Time taken to complete the session 43 minutes, 41 seconds
[root@node19c01 ~]#

OPatchauto session completed at Sat Nov  4 21:34:26 2023
Time taken to complete the session 223 minutes, 51 seconds
[root@node19c02 35642822]#

-- Reviewing the patching output showed that the following two steps took most of the time, especially the bring-down step that stops the cluster
Preparing to bring down database service on home /u01/app/oracle/product/19.0.0/db_1
Performing postpatch operations on CRS - starting CRS service on home /u01/app/19.0.0/grid

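-- Starting the cluster manually was somewhat slow but acceptable; on one occasion it hung at CRS-4537: Cluster Ready Services is online
-- Stopping the cluster manually was extremely slow and hung at CRS-2677: Stop of 'ora.chad' on 'node19c02' succeeded
-- The most recent long shutdown hung at CRS-2677: Stop of 'ora.node19c02.vip' on 'node19c02' succeeded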

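-- The alert log contains many messages about ora.asm and other resources. The GI shutdown took more than two hours. There are aborted 'check' commands for ora.asm and failed lookups against the DNS server 192.168.2.1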

2020-05-01 16:04:14.508 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:10} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:04:45.505 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:1:17} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:05:35.642 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:12} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:17:54.224 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:32} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:18:12.596 [ORAAGENT(17237)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 17237
2020-05-01 16:18:25.224 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:1:11} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:18:25.632 [OHASD(1354)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node19c02'
2020-05-01 16:19:15.368 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:34} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 10:56:30.111 [ORAAGENT(9379)]CRS-5818: Aborted command 'res_attr_modified' for resource 'ora.ons'. Details at (:CRSAGF00113:) {0:6:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 10:57:30.737 [ORAAGENT(10450)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 10450
2023-06-06 10:58:01.065 [ORAAGENT(10450)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:7:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 11:07:48.782 [ORAAGENT(11765)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 11765
2023-06-06 11:11:47.098 [OHASD(1310)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node19c02'
2023-06-06 11:12:20.649 [ORAROOTAGENT(2214)]CRS-5822: Agent '/u01/app/19.0.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:3:70} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_orarootagent_root.trc.
2023-06-06 11:12:20.656 [ORAAGENT(10450)]CRS-5822: Agent '/u01/app/19.0.0/grid/bin/oraagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:7:9} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 11:12:37.449 [OCTSSD(2043)]CRS-2405: The Cluster Time Synchronization Service on host node19c02 is shutdown by user
2023-06-06 11:12:37.450 [OCTSSD(2043)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 2043 is exiting
2023-06-06 11:12:38.572 [OCSSD(1712)]CRS-1603: CSSD on node node19c02 has been shut down.
2023-06-06 11:12:41.601 [GPNPD(1551)]CRS-2329: GPNPD on node node19c02 shut down.
2023-06-06 11:13:03.869 [OHASD(1310)]CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node19c02' has completed
2023-06-06 11:13:03.892 [ORAROOTAGENT(1411)]CRS-5822: Agent '/u01/app/19.0.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:4:15} in /u01/app/grid/diag/crs/node19c02/crs/trace/ohasd_orarootagent_root.trc.
2023-06-06 16:15:22.172 [CVUD(1751)]CRS-10051: CVU found following errors with Clusterware setup : PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
2023-06-06 23:29:08.319 [CVUD(69952)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVG-11372 : Number of SCAN IP addresses that SCAN "scan19c" resolved to did not match the number of SCAN VIP resources
PRVG-1101 : SCAN name "scan19c" failed to resolve
2023-08-16 13:16:20.626 [ORAAGENT(15597)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {2:57504:1091} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-08-16 13:17:00.706 [ORAAGENT(15597)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:9:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-08-16 13:17:40.731 [ORAAGENT(15597)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:9:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-08-16 13:18:06.887 [CVUD(3151)]CRS-10051: CVU found following errors with Clusterware setup : Refer to My Oracle Support notes "1357657.1" for more details regarding errors "PRVG-11067".
Refer to My Oracle Support notes "1357657.1" for more details regarding errors "PRVG-11067".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
Refer to My Oracle Support notes "1357657.1" for more details regarding errors "PRVG-11067".
PRVG-11372 : Number of SCAN IP addresses that SCAN "scan19c" resolved to did not match the number of SCAN VIP resources
PRVG-1101 : SCAN name "scan19c" failed to resolve
2023-08-17 08:08:02.411 [CRSD(3019)]CRS-2771: Maximum restart attempts reached for resource 'ora.node19c02.vip'; will not restart.
2023-08-17 08:08:02.737 [ORAAGENT(15554)]CRS-5016: Process "/u01/app/19.0.0/grid/bin/lsnrctl" spawned by agent "ORAAGENT" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc"
2023-08-17 08:08:03.537 [GIPCD(1795)]CRS-42216: No interfaces are configured on the local node for interface definition ens34(:.*)?:10.10.10.0: available interface definitions are [ens33(:.*)?:192.168.2.0][ens34:1(:.*)?:169.254.0.0][ens33(:.*)?:[fe80:0:0:0:0:0:0:0]][ens34(:.*)?:[fe80:0:0:0:0:0:0:0]].
2023-08-17 08:08:30.185 [GIPCD(1795)]CRS-42216: No interfaces are configured on the local node for interface definition ens34(:.*)?:10.10.10.0: available interface definitions are [ens33(:.*)?:192.168.2.0][ens34:1(:.*)?:169.254.0.0][ens33(:.*)?:[fe80:0:0:0:0:0:0:0]][ens34(:.*)?:[fe80:0:0:0:0:0:0:0]].
2023-08-17 08:08:31.144 [OCSSD(1891)]CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /u01/app/grid/diag/crs/node19c02/crs/trace/ocssd.trc.
2023-08-17 08:08:31.186 [OCSSD(1891)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/grid/diag/crs/node19c02/crs/trace/ocssd.trc
2023-08-17 08:08:35.699 [GIPCD(1795)]CRS-42216: No interfaces are configured on the local node for interface definition ens34(:.*)?:10.10.10.0: available interface definitions are [ens33(:.*)?:192.168.2.0][ens34:1(:.*)?:169.254.0.0][ens33(:.*)?:[fe80:0:0:0:0:0:0:0]][ens34(:.*)?:[fe80:0:0:0:0:0:0:0]].
2023-08-17 08:08:36.383 [ORAAGENT(1666)]CRS-5011: Check of resource "ora.asm" failed: details at "(:CLSN00006:)" in "/u01/app/grid/diag/crs/node19c02/crs/trace/ohasd_oraagent_grid.trc"
2023-11-04 11:10:07.792 [OHASD(1553)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node19c02'
2023-11-04 11:10:27.634 [OCSSD(2111)]CRS-1625: Node node19c01, number 1, was shut down
2023-11-04 13:33:01.295 [OCTSSD(2517)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 2517 is exiting
2023-11-04 13:33:02.427 [OCSSD(2111)]CRS-1603: CSSD on node node19c02 has been shut down.
2023-11-04 13:33:04.790 [ORAROOTAGENT(98836)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 98836
2023-11-04 13:33:05.415 [GPNPD(1902)]CRS-2329: Grid Plug and Play Daemon(GPNPD) on node node19c02 shut down.
2023-11-04 13:33:28.198 [OHASD(1553)]CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node19c02' has completed
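
The "more than two hours" figure can be read straight off the CRS-2791 (shutdown started) and CRS-2793 (shutdown completed) timestamps. The following is a minimal sketch rather than an Oracle-provided tool: the function name `shutdown_durations` is made up for illustration, and it assumes GNU `date` (for `date -d`) and an alert log with one entry per line.

```shell
# shutdown_durations: read alert log text from stdin (or from file arguments)
# and print "<start> -> <end> : <seconds>s" for each CRS-2791/CRS-2793 pair.
# Hypothetical helper; assumes GNU date and one log entry per line.
shutdown_durations() {
    grep -hE 'CRS-279[13]:' "$@" | {
        start=""
        while read -r day clock rest; do
            case "$rest" in
                *CRS-2791:*)
                    # Remember when the most recent shutdown began.
                    start="$day $clock"
                    ;;
                *CRS-2793:*)
                    [ -n "$start" ] || continue
                    # Strip fractional seconds, then convert to epoch seconds.
                    s=$(date -u -d "${start%.*}" +%s)
                    e=$(date -u -d "$day ${clock%.*}" +%s)
                    echo "$start -> $day $clock : $((e - s))s"
                    start=""
                    ;;
            esac
        done
    }
}

# Usage (path taken from the log messages above):
#   shutdown_durations /u01/app/grid/diag/crs/node19c02/crs/trace/alert.log
```

For the 2023-11-04 entries above, the start is 11:10:07 and the completion is 13:33:28, which this reports as 8601s, about 2 hours 23 minutes.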

-- The ASM instance log shows nothing abnormal

-- The related trc files contain little useful information

2023-11-08 15:40:23.814 : USRTHRD:930125568: [     INFO] {0:9:2} dumpAsmLsnrReloadVec AsmLsnr Res = ora.ASMNET1LSNR_ASM.lsnr, Reload done = 1
2023-11-08 15:40:24.029 :CLSDYNAM:1897694976: [ora.DATA.dg]{0:9:2} [check] DgpAgent::runCheck 220 check if ASM failed
2023-11-08 15:40:24.029 :CLSDYNAM:1897694976: [ora.DATA.dg]{0:9:2} [check] DgpAgent::queryDgStatus 130 dgName DGStatus is not cached.
2023-11-08 15:40:24.030 : USRTHRD:1897694976: [     INFO] {0:9:2} Thread:DGStatusUpdater thread constructor exit this:4c11f680 m_pThnd:0 m_thndMX:4c11f6a0, m_tintMX:4c11f6f0 &m_postMX:0x7f6b4c11f6d0
2023-11-08 15:40:24.030 :CLSDYNAM:1912403712: [ora.OCR.dg]{0:9:2} [check] DgpAgent::runCheck 220 check if ASM failed
2023-11-08 15:40:24.030 :CLSDYNAM:1912403712: [ora.OCR.dg]{0:9:2} [check] DgpAgent::queryDgStatus 130 dgName DGStatus is not cached.

-- Searching the logs on node 1 and node 2 for resolv.conf turns up many DNS resolution messages, many of which reference the gateway 192.168.2.1

[grid@node19c01 trace]$ more * | grep resolv.conf
2020-05-01 11:32:37.545 [CVUD(3455)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2020-05-01 12:29:34.334 [CVUD(4030)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-11-08 16:46:43.956 [CVUD(42984)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
[grid@node19c01 trace]$
[root@node19c02 trace]# more * | grep resolv.conf
2023-06-06 16:15:22.170 [CVUD(1751)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-06-06 23:29:08.319 [CVUD(69952)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-08-17 09:00:22.018 [CVUD(3098)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-10-24 09:58:56.141 [CVUD(4051)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
[root@node19c02 trace]#
[grid@node19c01 trace]$ more alert.log | grep 192.168.2.1
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
[grid@node19c01 trace]$
[root@node19c02 trace]# more alert.log | grep 192.168.2.1
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
2023-06-06 16:15:22.172 [CVUD(1751)]CRS-10051: CVU found following errors with Clusterware setup : PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
2023-06-06 23:29:08.320 [CVUD(69952)]CRS-10051: CVU found following errors with Clusterware setup : PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
[root@node19c02 trace]#
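
Rather than grepping for one string at a time, it can help to tally every CVU error code in the alert log so the DNS-related ones (PRVF-5622, PRVG-10048) stand out from the rest. This is a rough sketch; `count_cvu_codes` is an illustrative name, not an existing utility.

```shell
# count_cvu_codes: read alert log text from stdin (or from file arguments)
# and print "<code> <count>" for every PRVF-/PRVG- code, sorted by code.
# Hypothetical helper built only from grep, sort, uniq and awk.
count_cvu_codes() {
    grep -ohE 'PRV[FG]-[0-9]+' "$@" |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $2, $1 }'
}

# Usage, from the trace directory shown in the prompts above:
#   count_cvu_codes alert.log
```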

-- Check the DNS configuration. DNS is configured at the OS level

[root@node19c02 trace]# cat /etc/resolv.conf
# Generated by NetworkManager
search localdomain
nameserver 192.168.2.1
nameserver 192.168.71.2
[root@node19c02 trace]#
[grid@node19c01 trace]$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 192.168.2.1
[grid@node19c01 trace]$
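
To see whether the configured name servers actually answer for the node and SCAN names, and how long a failed lookup blocks, something like the sketch below may be useful. Both function names are hypothetical, and `time_lookups` assumes `dig` is installed; its `+time=2 +tries=1` options only limit how long each query waits.

```shell
# list_nameservers [FILE]: print the nameserver addresses configured in FILE
# (defaults to /etc/resolv.conf). Hypothetical helper.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# time_lookups NAME...: query every configured nameserver for each NAME and
# print the answer together with the elapsed whole seconds.
# Assumes dig is available on the node.
time_lookups() {
    for ns in $(list_nameservers); do
        for name in "$@"; do
            begin=$(date +%s)
            answer=$(dig +short +time=2 +tries=1 "@$ns" "$name")
            end=$(date +%s)
            echo "$ns $name -> ${answer:-no answer} ($((end - begin))s)"
        done
    done
}

# Usage with the names that appear in the CVU messages:
#   time_lookups node19c01 node19c02 scan19c
```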

-- Check the NIC configuration: the gateway 192.168.2.1 is configured

[grid@node19c01 network-scripts]$ more ifcfg-ens33 | grep GATEWAY
GATEWAY=192.168.2.1
[grid@node19c01 network-scripts]$
[root@node19c02 network-scripts]# more ifcfg-ens33 | grep GATEWAY
GATEWAY=192.168.2.1
[root@node19c02 network-scripts]#

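-- Solution
1 Clear the entries in /etc/resolv.conf, that is, remove the DNS configuration (this was the main cause)
2 Remove the gateway from the NIC configuration.

With the DNS and gateway settings removed, the cluster starts and stops normally and quickly.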
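
The two steps could be scripted roughly as follows. This is a sketch under assumptions, not the exact commands used here: the function names are made up, the DNS lines are commented out rather than deleted so the change is easy to revert, and each file is backed up to `<file>.bak` first. Because /etc/resolv.conf is generated by NetworkManager, the entries may reappear after a network restart unless they are also removed from the connection settings.

```shell
# disable_dns FILE: back up FILE, then comment out its nameserver and search
# lines. Hypothetical helper; run as root against /etc/resolv.conf.
disable_dns() {
    cp -p "$1" "$1.bak" &&
        sed -E 's/^[[:space:]]*(nameserver|search)[[:space:]]/# &/' "$1.bak" > "$1"
}

# drop_gateway FILE: back up FILE, then remove its GATEWAY= line.
# Hypothetical helper; FILE is an ifcfg-* file such as ifcfg-ens33.
drop_gateway() {
    cp -p "$1" "$1.bak" &&
        grep -v '^GATEWAY=' "$1.bak" > "$1"
}

# Usage on each node (the network-scripts path is the usual location on
# RHEL-style systems and is an assumption here):
#   disable_dns /etc/resolv.conf
#   drop_gateway /etc/sysconfig/network-scripts/ifcfg-ens33
```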


http://www.chinasem.cn/article/377007
