Notes on a 19c RAC that was very slow to start and stop

2023-11-09 15:36


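Background
This is a 19c RAC used for testing on virtual machines. When installing an RU, one node finished in an acceptable time while the other took very long; most of that time went into stopping and starting the cluster.
Starting and stopping the RAC by hand is also slow in day-to-day use, and stopping it is especially slow.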

-- Installing 19.19: node 1 took 78 minutes, node 2 took 50 minutes
 

OPatchauto session completed at Tue Jun  6 16:59:30 2023
Time taken to complete the session 78 minutes, 26 seconds
[root@node19c01 grid]#
OPatchauto session completed at Tue Jun  6 15:38:29 2023
Time taken to complete the session 50 minutes, 11 seconds
[root@node19c02 psu]# 

-- Installing 19.20: node 2 took 72 minutes

OPatchauto session completed at Thu Aug 17 10:15:58 2023
Time taken to complete the session 72 minutes, 13 seconds
[root@node19c02 bin]# 

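-- Installing the next RU (noted as 19.19, though the patch directory 35642822 in the prompt below suggests a later RU): node 1 took 43 minutes, node 2 took 223 minutes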

OPatchauto session completed at Sat Nov  4 17:47:11 2023
Time taken to complete the session 43 minutes, 41 seconds
[root@node19c01 ~]#
OPatchauto session completed at Sat Nov  4 21:34:26 2023
Time taken to complete the session 223 minutes, 51 seconds
[root@node19c02 35642822]#
[root@node19c02 35642822]#

-- Going through the patching output, these two steps take the most time, especially the bring-down step, i.e. stopping the cluster
Preparing to bring down database service on home /u01/app/oracle/product/19.0.0/db_1
Performing postpatch operations on CRS - starting CRS service on home /u01/app/19.0.0/grid

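-- Starting the cluster manually is relatively slow but acceptable; once it hung at CRS-4537: Cluster Ready Services is online
-- Stopping the cluster manually is extremely slow; it hangs at CRS-2677: Stop of 'ora.chad' on 'node19c02' succeeded
-- The most recent long stop hung at CRS-2677: Stop of 'ora.node19c02.vip' on 'node19c02' succeeded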

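-- Checking the GI alert log: there are many messages about ora.asm and similar resources, and one GI shutdown took more than 2 hours. There are repeated aborted 'check' commands for ora.asm, and messages about name lookups against the DNS server 192.168.2.1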

2020-05-01 16:04:14.508 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:10} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:04:45.505 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:1:17} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:05:35.642 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:12} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:17:54.224 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:32} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:18:12.596 [ORAAGENT(17237)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 17237
2020-05-01 16:18:25.224 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:1:11} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2020-05-01 16:18:25.632 [OHASD(1354)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node19c02'
2020-05-01 16:19:15.368 [ORAAGENT(10234)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:10:34} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 10:56:30.111 [ORAAGENT(9379)]CRS-5818: Aborted command 'res_attr_modified' for resource 'ora.ons'. Details at (:CRSAGF00113:) {0:6:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 10:57:30.737 [ORAAGENT(10450)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 10450
2023-06-06 10:58:01.065 [ORAAGENT(10450)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:7:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 11:07:48.782 [ORAAGENT(11765)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 11765
2023-06-06 11:11:47.098 [OHASD(1310)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node19c02'
2023-06-06 11:12:20.649 [ORAROOTAGENT(2214)]CRS-5822: Agent '/u01/app/19.0.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:3:70} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_orarootagent_root.trc.
2023-06-06 11:12:20.656 [ORAAGENT(10450)]CRS-5822: Agent '/u01/app/19.0.0/grid/bin/oraagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:7:9} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-06-06 11:12:37.449 [OCTSSD(2043)]CRS-2405: The Cluster Time Synchronization Service on host node19c02 is shutdown by user
2023-06-06 11:12:37.450 [OCTSSD(2043)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 2043 is exiting
2023-06-06 11:12:38.572 [OCSSD(1712)]CRS-1603: CSSD on node node19c02 has been shut down.
2023-06-06 11:12:41.601 [GPNPD(1551)]CRS-2329: GPNPD on node node19c02 shut down.
2023-06-06 11:13:03.869 [OHASD(1310)]CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node19c02' has completed
2023-06-06 11:13:03.892 [ORAROOTAGENT(1411)]CRS-5822: Agent '/u01/app/19.0.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:4:15} in /u01/app/grid/diag/crs/node19c02/crs/trace/ohasd_orarootagent_root.trc.
2023-06-06 16:15:22.172 [CVUD(1751)]CRS-10051: CVU found following errors with Clusterware setup : PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
2023-06-06 23:29:08.319 [CVUD(69952)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVG-11372 : Number of SCAN IP addresses that SCAN "scan19c" resolved to did not match the number of SCAN VIP resources
PRVG-1101 : SCAN name "scan19c" failed to resolve
2023-08-16 13:16:20.626 [ORAAGENT(15597)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {2:57504:1091} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-08-16 13:17:00.706 [ORAAGENT(15597)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:9:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-08-16 13:17:40.731 [ORAAGENT(15597)]CRS-5818: Aborted command 'check' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:9:2} in /u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc.
2023-08-16 13:18:06.887 [CVUD(3151)]CRS-10051: CVU found following errors with Clusterware setup : Refer to My Oracle Support notes "1357657.1" for more details regarding errors "PRVG-11067".
Refer to My Oracle Support notes "1357657.1" for more details regarding errors "PRVG-11067".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
Refer to My Oracle Support notes "1357657.1" for more details regarding errors "PRVG-11067".
PRVG-11372 : Number of SCAN IP addresses that SCAN "scan19c" resolved to did not match the number of SCAN VIP resources
PRVG-1101 : SCAN name "scan19c" failed to resolve
2023-08-17 08:08:02.411 [CRSD(3019)]CRS-2771: Maximum restart attempts reached for resource 'ora.node19c02.vip'; will not restart.
2023-08-17 08:08:02.737 [ORAAGENT(15554)]CRS-5016: Process "/u01/app/19.0.0/grid/bin/lsnrctl" spawned by agent "ORAAGENT" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/grid/diag/crs/node19c02/crs/trace/crsd_oraagent_grid.trc"
2023-08-17 08:08:03.537 [GIPCD(1795)]CRS-42216: No interfaces are configured on the local node for interface definition ens34(:.*)?:10.10.10.0: available interface definitions are [ens33(:.*)?:192.168.2.0][ens34:1(:.*)?:169.254.0.0][ens33(:.*)?:[fe80:0:0:0:0:0:0:0]][ens34(:.*)?:[fe80:0:0:0:0:0:0:0]].
2023-08-17 08:08:30.185 [GIPCD(1795)]CRS-42216: No interfaces are configured on the local node for interface definition ens34(:.*)?:10.10.10.0: available interface definitions are [ens33(:.*)?:192.168.2.0][ens34:1(:.*)?:169.254.0.0][ens33(:.*)?:[fe80:0:0:0:0:0:0:0]][ens34(:.*)?:[fe80:0:0:0:0:0:0:0]].
2023-08-17 08:08:31.144 [OCSSD(1891)]CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /u01/app/grid/diag/crs/node19c02/crs/trace/ocssd.trc.
2023-08-17 08:08:31.186 [OCSSD(1891)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/grid/diag/crs/node19c02/crs/trace/ocssd.trc
2023-08-17 08:08:35.699 [GIPCD(1795)]CRS-42216: No interfaces are configured on the local node for interface definition ens34(:.*)?:10.10.10.0: available interface definitions are [ens33(:.*)?:192.168.2.0][ens34:1(:.*)?:169.254.0.0][ens33(:.*)?:[fe80:0:0:0:0:0:0:0]][ens34(:.*)?:[fe80:0:0:0:0:0:0:0]].
2023-08-17 08:08:36.383 [ORAAGENT(1666)]CRS-5011: Check of resource "ora.asm" failed: details at "(:CLSN00006:)" in "/u01/app/grid/diag/crs/node19c02/crs/trace/ohasd_oraagent_grid.trc"
2023-11-04 11:10:07.792 [OHASD(1553)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node19c02'
2023-11-04 11:10:27.634 [OCSSD(2111)]CRS-1625: Node node19c01, number 1, was shut down
2023-11-04 13:33:01.295 [OCTSSD(2517)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 2517 is exiting
2023-11-04 13:33:02.427 [OCSSD(2111)]CRS-1603: CSSD on node node19c02 has been shut down.
2023-11-04 13:33:04.790 [ORAROOTAGENT(98836)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 98836
2023-11-04 13:33:05.415 [GPNPD(1902)]CRS-2329: Grid Plug and Play Daemon(GPNPD) on node node19c02 shut down.
2023-11-04 13:33:28.198 [OHASD(1553)]CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node19c02' has completed
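
The last shutdown above starts at 11:10 (CRS-2791) and only completes at 13:33 (CRS-2793). A minimal sketch for pulling these durations out of an alert log, assuming GNU `date` and the timestamp format shown above; the function name `shutdown_durations` is made up for illustration:

```shell
#!/usr/bin/env bash
# Print how long each OHASD shutdown took, pairing every CRS-2791 (shutdown
# started) entry with the next CRS-2793 (shutdown completed) entry.
shutdown_durations() {
    local log="$1"
    # Keep only the two message codes and reduce each line to "date time code".
    grep -E 'CRS-279[13]:' "$log" |
    sed -E 's/^([0-9-]+ [0-9:]+)\.[0-9]+ .*(CRS-279[13]):.*/\1 \2/' |
    while read -r d t code; do
        # Convert the timestamp to epoch seconds; skip lines that do not parse.
        ts=$(date -u -d "$d $t" +%s 2>/dev/null) || continue
        if [ "$code" = "CRS-2791" ]; then
            start=$ts
        elif [ -n "$start" ]; then
            echo "$d $t shutdown took $(( (ts - start) / 60 )) min $(( (ts - start) % 60 )) s"
            start=""
        fi
    done
}
```

Called as `shutdown_durations alert.log` in the trace directory, it prints one line per completed shutdown, which makes the slow ones easy to spot.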

-- Checking the ASM instance log: nothing abnormal

-- Checking the related trace (.trc) files: not much useful information

2023-11-08 15:40:23.814 : USRTHRD:930125568: [     INFO] {0:9:2} dumpAsmLsnrReloadVec AsmLsnr Res = ora.ASMNET1LSNR_ASM.lsnr, Reload done = 1
2023-11-08 15:40:24.029 :CLSDYNAM:1897694976: [ora.DATA.dg]{0:9:2} [check] DgpAgent::runCheck 220 check if ASM failed
2023-11-08 15:40:24.029 :CLSDYNAM:1897694976: [ora.DATA.dg]{0:9:2} [check] DgpAgent::queryDgStatus 130 dgName DGStatus is not cached.
2023-11-08 15:40:24.030 : USRTHRD:1897694976: [     INFO] {0:9:2} Thread:DGStatusUpdater thread constructor exit this:4c11f680 m_pThnd:0 m_thndMX:4c11f6a0, m_tintMX:4c11f6f0 &m_postMX:0x7f6b4c11f6d0
2023-11-08 15:40:24.030 :CLSDYNAM:1912403712: [ora.OCR.dg]{0:9:2} [check] DgpAgent::runCheck 220 check if ASM failed
2023-11-08 15:40:24.030 :CLSDYNAM:1912403712: [ora.OCR.dg]{0:9:2} [check] DgpAgent::queryDgStatus 130 dgName DGStatus is not cached.

-- Searching the logs on node 1 and node 2 for resolv.conf shows many DNS resolution messages, most of them mentioning 192.168.2.1, which is also the gateway

[grid@node19c01 trace]$ more * | grep resolv.conf
2020-05-01 11:32:37.545 [CVUD(3455)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2020-05-01 12:29:34.334 [CVUD(4030)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-11-08 16:46:43.956 [CVUD(42984)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
[grid@node19c01 trace]$
[root@node19c02 trace]# more * | grep resolv.conf
2023-06-06 16:15:22.170 [CVUD(1751)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-06-06 23:29:08.319 [CVUD(69952)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-08-17 09:00:22.018 [CVUD(3098)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
2023-10-24 09:58:56.141 [CVUD(4051)]CRS-10051: CVU found following errors with Clusterware setup : PRVF-5622 : The 'search' entry does not exist in file "/etc/resolv.conf" on nodes: "node19c01".
[root@node19c02 trace]#
[grid@node19c01 trace]$ more alert.log | grep 192.168.2.1
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
[grid@node19c01 trace]$
[root@node19c02 trace]# more alert.log | grep 192.168.2.1
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
2023-06-06 16:15:22.172 [CVUD(1751)]CRS-10051: CVU found following errors with Clusterware setup : PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
PRVG-10048 : Name "node19c02" was not resolved to an address of the specified type by name servers "192.168.2.1".
2023-06-06 23:29:08.320 [CVUD(69952)]CRS-10051: CVU found following errors with Clusterware setup : PRVG-10048 : Name "node19c01" was not resolved to an address of the specified type by name servers "192.168.2.1".
[root@node19c02 trace]#
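
The grep output above is long and repetitive. A small sketch that condenses the PRVG-10048 messages into one count per host name and name server; it relies only on the message text shown above, and the function name `count_unresolved` is made up for illustration:

```shell
#!/usr/bin/env bash
# Count PRVG-10048 "was not resolved" messages per (host name, name server).
count_unresolved() {
    local log="$1"
    # -o prints only the matching part, so CRS-10051 prefixes are dropped.
    grep -o 'PRVG-10048 : Name "[^"]*" was not resolved to an address of the specified type by name servers "[^"]*"' "$log" |
    sed -E 's/.*Name "([^"]*)".*name servers "([^"]*)".*/\1 \2/' |
    sort | uniq -c
}
```

Running `count_unresolved alert.log` on each node shows at a glance which names the name server keeps failing to resolve.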

-- Checking the DNS configuration: DNS is configured at the OS level

[root@node19c02 trace]# cat /etc/resolv.conf
# Generated by NetworkManager
search localdomain
nameserver 192.168.2.1
nameserver 192.168.71.2
[root@node19c02 trace]#
[grid@node19c01 trace]$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 192.168.2.1
[grid@node19c01 trace]$
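
With name servers configured that cannot resolve the node names, it can help to see how long a lookup actually takes on each node. A rough sketch, assuming `getent` is available; `time_lookup` and `list_nameservers` are names made up for illustration:

```shell
#!/usr/bin/env bash
# Time one lookup through the system resolver. getent follows
# /etc/nsswitch.conf, so /etc/hosts and DNS are consulted in the configured order.
time_lookup() {
    local name="$1" start end status
    start=$(date +%s)
    if getent hosts "$name" >/dev/null 2>&1; then
        status=resolved
    else
        status=unresolved
    fi
    end=$(date +%s)
    echo "$name $status $(( end - start ))s"
}

# List the name servers from a resolv.conf-style file, in the order listed.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

For example, `time_lookup node19c01` and `time_lookup scan19c` could be run on both nodes. Because names present in /etc/hosts may resolve without touching DNS, querying a name server directly (for instance `nslookup node19c01 192.168.2.1`) is a useful cross-check.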

-- Checking the NIC configuration: the gateway 192.168.2.1 is configured

[grid@node19c01 network-scripts]$ more ifcfg-ens33 | grep GATEWAY
GATEWAY=192.168.2.1
[grid@node19c01 network-scripts]$
[root@node19c02 network-scripts]# more ifcfg-ens33 | grep GATEWAY
GATEWAY=192.168.2.1
[root@node19c02 network-scripts]#

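-- Solution
1 Remove the entries from /etc/resolv.conf, i.e. drop the DNS configuration (this was the main cause)
2 Remove the gateway from the NIC configuration.

After removing the DNS and gateway settings, the cluster starts and stops normally, and both are quick.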
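
The two changes can be sketched as small helpers that take the file path as an argument, so they can be tried on copies first. This is only an illustration of the steps above, not the exact commands that were run; the function names are made up, and `sed -i` assumes GNU sed:

```shell
#!/usr/bin/env bash
# Step 1: back up a resolv.conf-style file, then empty it.
clear_resolv_conf() {
    local file="$1"
    cp -p "$file" "$file.bak" && : > "$file"
}

# Step 2: back up an ifcfg file, then delete its GATEWAY= line.
remove_gateway() {
    local file="$1"
    cp -p "$file" "$file.bak" && sed -i '/^GATEWAY=/d' "$file"
}
```

On the real nodes these would be run as root, e.g. `clear_resolv_conf /etc/resolv.conf` and `remove_gateway /etc/sysconfig/network-scripts/ifcfg-ens33` (the full path is an assumption based on the network-scripts directory in the prompts above). Since the file header says resolv.conf is generated by NetworkManager, it may be rewritten later; setting `dns=none` in the `[main]` section of /etc/NetworkManager/NetworkManager.conf is one way to stop that. The gateway change presumably needs a network restart to take effect.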
