11.2.0.4: ASM instance crashes unexpectedly with ORA-29740: evicted by instance number 2

2023-10-11 05:48


I. Environment

Oracle 11.2.0.4 + RAC + Red Hat 5.8

II. Symptom

The ASM instance crashed unexpectedly. The alert log of the evicted instance (+ASM1) reports:

Sun Sep 14 20:27:13 2014
IPC Send timeout detected. Sender: ospid 15909 [oracle@wyjkdb01 (RBAL)]
Receiver: inst 2 binc 458423771 ospid 20030
IPC Send timeout to 2.0 inc 4 for msg type 8 from opid 18
Sun Sep 14 20:27:17 2014
Suppressed nested communications reconfiguration: instance_number 2
Detected an inconsistent instance membership by instance 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lmon_15889.trc (incident=316873):
ORA-29740: evicted by instance number 2, group incarnation 6
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_316873/+ASM1_lmon_15889_i316873.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lmon_15889.trc:
ORA-29740: evicted by instance number 2, group incarnation 6
LMON (ospid: 15889): terminating the instance due to error 29740

Sun Sep 14 20:27:22 2014
System state dump requested by (instance=1, osid=15889 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_diag_15883_20140914202722.trc
Dumping diagnostic data in directory=[cdmp_20140914202719], requested by (instance=1, osid=15889 (LMON)), summary=[abnormal instance termination].
Instance terminated by LMON, pid = 15889
Sun Sep 14 20:27:24 2014
MEMORY_TARGET defaulting to 1128267776.
* instance_number obtained from CSS = 1, checking for the existence of node 0...
* node 0 does not exist. instance_number = 1

Because of a network fault, this instance was evicted from the cluster by the healthy ASM instance.

 

III. Analysis

The alert log of the healthy ASM instance (+ASM2) reports:

Sun Sep 14 20:27:13 2014
IPC Send timeout detected. Sender: ospid 20048 [oracle@wyjkdb02 (RBAL)]
Receiver: inst 1 binc 429906001 ospid 15891
Sun Sep 14 20:27:14 2014
IPC Send timeout detected. Sender: ospid 20030 [oracle@wyjkdb02 (LMD0)]
Receiver: inst 1 binc 429906001 ospid 15891
IPC Send timeout to 1.0 inc 4 for msg type 65521 from opid 10
Sun Sep 14 20:27:14 2014
Communications reconfiguration: instance_number 1
Sun Sep 14 20:27:16 2014
IPC Send timeout detected. Receiver ospid 20030 [
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_20030.trc:
IPC Send timeout to 1.0 inc 4 for msg type 16 from opid 18
Detected an inconsistent instance membership by instance 2
Evicting instance 1 from cluster
Waiting for instances to leave: 1
Dumping diagnostic data in directory=[cdmp_20140914202719], requested by (instance=1, osid=15889 (LMON)), summary=[abnormal instance termination].
Reconfiguration started (old inc 4, new inc 8)
List of instances:
2 (myinst: 2)
Global Resource Directory frozen
* dead instance detected - domain 1 invalid = TRUE
* dead instance detected - domain 2 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sun Sep 14 20:27:23 2014
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Sun Sep 14 20:27:23 2014
NOTE: SMON starting instance recovery for group DATA domain 1 (mounted)
NOTE: F1X0 found on disk 0 au 2 fcn 0.218420
NOTE: starting recovery of thread=2 ckpt=15.764 group=1 (DATA)
NOTE: SMON waiting for thread 2 recovery enqueue
NOTE: SMON about to begin recovery lock claims for diskgroup 1 (DATA)
NOTE: SMON successfully validated lock domain 1
NOTE: advancing ckpt for group 1 (DATA) thread=2 ckpt=15.764
NOTE: SMON did instance recovery for group DATA domain 1
NOTE: SMON starting instance recovery for group OCR_VOTE domain 2 (mounted)

After the IPC send timeouts, instance 2 evicted ASM instance 1 from the cluster.
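When both alert logs have to be compared, it can help to pull out only the interconnect-related messages together with their timestamps. The following is a minimal sketch of that idea, not an Oracle tool: the function name `scan_alert_log` and the pattern set are assumptions made for illustration, and the patterns are derived only from the log lines quoted in this article.

```python
import re

# Timestamp lines in the alert log look like "Sun Sep 14 20:27:13 2014".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")

# Message patterns taken from the excerpts quoted in this article (illustrative, not exhaustive).
EVENTS = {
    "ipc_timeout": re.compile(r"^IPC Send timeout detected"),
    "evicting": re.compile(r"^Evicting instance (\d+) from cluster"),
    "evicted": re.compile(r"ORA-29740: evicted by instance number (\d+)"),
}

def scan_alert_log(lines):
    """Return (timestamp, event_name, line) tuples for interconnect-related messages."""
    current_ts = None
    found = []
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP.match(line):
            current_ts = line  # remember the most recent timestamp header
            continue
        for name, pattern in EVENTS.items():
            if pattern.search(line):
                found.append((current_ts, name, line))
                break
    return found

# A few lines copied from the +ASM2 alert log above.
sample = """Sun Sep 14 20:27:13 2014
IPC Send timeout detected. Sender: ospid 20048 [oracle@wyjkdb02 (RBAL)]
Receiver: inst 1 binc 429906001 ospid 15891
Sun Sep 14 20:27:16 2014
IPC Send timeout detected. Receiver ospid 20030 [
Detected an inconsistent instance membership by instance 2
Evicting instance 1 from cluster
""".splitlines()

events = scan_alert_log(sample)
for ts, name, line in events:
    print(ts, name, line, sep=" | ")
```

Running the same function over the alert logs of both nodes gives two short event lists that can be lined up by timestamp.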

 

CRS log:

2014-09-14 20:21:57.640:
[/u01/app/11.2.0/grid/bin/orarootagent.bin(18720)]CRS-5018:(:CLSN00037:) Removed unused HAIP route:169.254.95.0 / 255.255.255.0 / 0.0.0.0 /usb0

The usb0 interface is on the 169.254.95.0/24 subnet.

 

Operating system log (/var/log/messages):

Sep 14 20:21:43 wyjkdb02 last message repeated 2 times
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: Withdrawing address record for 169.254.95.120 on usb0.
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: Leaving mDNS multicast group on interface usb0.IPv4 with address 169.254.95.120.
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: iface.c: interface_mdns_mcast_join() called but no local address available.
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: Interface usb0.IPv4 no longer relevant for mDNS.
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: Withdrawing address record for fe80::40f2:e9ff:feda:1101 on usb0.
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: Leaving mDNS multicast group on interface usb0.IPv6 with address fe80::40f2:e9ff:feda:1101.
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: iface.c: interface_mdns_mcast_join() called but no local address available.
Sep 14 20:21:55 wyjkdb02 avahi-daemon[14864]: Interface usb0.IPv6 no longer relevant for mDNS.
Sep 14 20:21:56 wyjkdb02 dhclient: DHCPREQUEST on usb0 to 255.255.255.255 port 67 (xid=0x175227ff)
Sep 14 20:21:56 wyjkdb02 dhclient: DHCPACK from 169.254.95.118 (xid=0x175227ff)

Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: New relevant interface usb0.IPv4 for mDNS.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: Joining mDNS multicast group on interface usb0.IPv4 with address 169.254.95.120.
Sep 14 20:21:56 wyjkdb02 dhclient: bound to 169.254.95.120 -- renewal in 244 seconds.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: Registering new address record for 169.254.95.120 on usb0.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: Withdrawing address record for 169.254.95.120 on usb0.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: Leaving mDNS multicast group on interface usb0.IPv4 with address 169.254.95.120.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: iface.c: interface_mdns_mcast_join() called but no local address available.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: Interface usb0.IPv4 no longer relevant for mDNS.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: New relevant interface usb0.IPv4 for mDNS.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: Joining mDNS multicast group on interface usb0.IPv4 with address 169.254.95.120.
Sep 14 20:21:56 wyjkdb02 avahi-daemon[14864]: Registering new address record for 169.254.95.120 on usb0.
Sep 14 20:21:57 wyjkdb02 avahi-daemon[14864]: New relevant interface usb0.IPv6 for mDNS.
Sep 14 20:21:57 wyjkdb02 avahi-daemon[14864]: Joining mDNS multicast group on interface usb0.IPv6 with address fe80::40f2:e9ff:feda:1101.
Sep 14 20:21:57 wyjkdb02 avahi-daemon[14864]: Registering new address record for fe80::40f2:e9ff:feda:1101 on usb0.
Sep 14 20:22:10 wyjkdb02 cimserver[13651]: Listening on HTTP port 15988.
Sep 14 20:22:10 wyjkdb02 cimserver[13651]: Listening on HTTPS port 15989.
Sep 14 20:22:10 wyjkdb02 cimserver[13651]: Listening on local connection socket.
Sep 14 20:22:10 wyjkdb02 cimserver[13651]: Started CIM Server version 2.11.0.
Sep 14 20:22:10 wyjkdb02 Director Agent: ADPT_NoControllers wyjkdb02 No controllers found in the system. Sev: 2.
Sep 14 20:22:10 wyjkdb02 Director Agent: ADPT_NoControllers wyjkdb02 No controllers found in the system. Sev: 2.
Sep 14 20:22:50 wyjkdb02 cimserver[13651]: CIM Server registration with External SLP Failed. Exception: connection timed out
Sep 14 20:26:00 wyjkdb02 dhclient: DHCPREQUEST on usb0 to 169.254.95.118 port 67 (xid=0x175227ff)
Sep 14 20:26:40 wyjkdb02 last message repeated 4 times
Sep 14 20:27:47 wyjkdb02 last message repeated 6 times
Sep 14 20:28:48 wyjkdb02 last message repeated 4 times
Sep 14 20:30:05 wyjkdb02 last message repeated 5 times
Sep 14 20:30:40 wyjkdb02 last message repeated 3 times
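
The timing in these messages can be checked with simple arithmetic. The sketch below uses only timestamps that appear in the logs quoted in this article (the year is taken from the ASM alert log) and computes when the DHCP lease on usb0 was due for renewal and how long after the renewal request the first IPC send timeout was logged. The proximity is suggestive rather than proof of causation.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# dhclient: "bound to 169.254.95.120 -- renewal in 244 seconds."
lease_bound = datetime.strptime("2014-09-14 20:21:56", FMT)
renewal_after = timedelta(seconds=244)

# dhclient: "DHCPREQUEST on usb0 to 169.254.95.118 port 67"
renew_request_logged = datetime.strptime("2014-09-14 20:26:00", FMT)

# ASM alert log: first "IPC Send timeout detected"
first_ipc_timeout = datetime.strptime("2014-09-14 20:27:13", FMT)

expected_renewal = lease_bound + renewal_after
gap = first_ipc_timeout - renew_request_logged

print("expected renewal:", expected_renewal.strftime(FMT))
print("seconds from renewal request to first IPC timeout:", int(gap.total_seconds()))
```

The computed renewal time matches the logged DHCPREQUEST at 20:26:00, and that request keeps repeating ("last message repeated N times") through the minute in which the IPC timeouts begin.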

 

# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 90:E2:BA:62:61:E8 
          inet addr:10.0.3.6  Bcast:10.0.3.255  Mask:255.255.255.0
          inet6 addr: fe80::92e2:baff:fe62:61e8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5500577 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5004018 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4161309900 (3.8 GiB)  TX bytes:747587111 (712.9 MiB)
          Memory:c5300000-c5380000

eth0:1    Link encap:Ethernet  HWaddr 90:E2:BA:62:61:E8 
          inet addr:10.0.3.8  Bcast:10.0.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Memory:c5300000-c5380000

eth1      Link encap:Ethernet  HWaddr 90:E2:BA:62:61:E9 
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:c5380000-c5400000

eth2      Link encap:Ethernet  HWaddr 40:F2:E9:DA:16:8A 
          inet addr:173.12.3.6  Bcast:173.12.3.255  Mask:255.255.255.0
          inet6 addr: fe80::42f2:e9ff:feda:168a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:53086008 errors:0 dropped:0 overruns:0 frame:0
          TX packets:45032496 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:36212358508 (33.7 GiB)  TX bytes:23821830145 (22.1 GiB)
          Memory:c5580000-c55a0000

eth2:1    Link encap:Ethernet  HWaddr 40:F2:E9:DA:16:8A 
          inet addr:169.254.151.223  Bcast:169.254.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Memory:c5580000-c55a0000

eth3      Link encap:Ethernet  HWaddr 40:F2:E9:DA:16:8B 
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:c55a0000-c55c0000

eth4      Link encap:Ethernet  HWaddr 40:F2:E9:DA:16:8C 
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:c55c0000-c55e0000

eth5      Link encap:Ethernet  HWaddr 40:F2:E9:DA:16:8D 
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:c55e0000-c5600000

lo        Link encap:Local Loopback 
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:65323187 errors:0 dropped:0 overruns:0 frame:0
          TX packets:65323187 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:8009448558 (7.4 GiB)  TX bytes:8009448558 (7.4 GiB)

sit0      Link encap:IPv6-in-IPv4 
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

usb0      Link encap:Ethernet  HWaddr 42:F2:E9:DA:16:89 
          inet addr:169.254.95.120  Bcast:169.254.95.255  Mask:255.255.255.0
          inet6 addr: fe80::40f2:e9ff:feda:1689/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:83931 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11563 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6779323 (6.4 MiB)  TX bytes:936612 (914.6 KiB)

 

The IP address of usb0 is 169.254.95.120.

 

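Reportedly, the system engineers use the usb0 port for convenient management (a direct network connection to the server). However, starting with 11.2.0.2, RAC uses HAIP to provide redundancy for the interconnect (heartbeat) network, and HAIP uses the 169.254.x.x range by default.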

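This identifies the cause: the usb0 interface, whose address lies in the same 169.254.x.x range, disrupted the RAC HAIP configuration, and as a result the ASM instance was evicted from the cluster.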
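The overlap can be verified mechanically. The following is a minimal sketch using Python's standard `ipaddress` module. The addresses and netmasks are copied from the `ifconfig -a` output above. Treating `eth2:1` as the HAIP alias is an assumption based on its 169.254.151.223/255.255.0.0 address, and the helper name `overlapping_interfaces` is made up for illustration.

```python
import ipaddress

# Link-local range that HAIP uses by default.
HAIP_RANGE = ipaddress.ip_network("169.254.0.0/16")

def overlapping_interfaces(interfaces, haip_labels):
    """Return names of interfaces inside HAIP_RANGE that are not HAIP aliases themselves."""
    flagged = []
    for name, (addr, mask) in interfaces.items():
        if name in haip_labels:
            continue  # the HAIP alias is expected to be in this range
        iface = ipaddress.ip_interface(f"{addr}/{mask}")
        if iface.network.overlaps(HAIP_RANGE):
            flagged.append(name)
    return flagged

# Addresses and netmasks copied from the ifconfig output in this article.
interfaces = {
    "eth0": ("10.0.3.6", "255.255.255.0"),
    "eth0:1": ("10.0.3.8", "255.255.255.0"),
    "eth2": ("173.12.3.6", "255.255.255.0"),
    "eth2:1": ("169.254.151.223", "255.255.0.0"),  # assumed to be the HAIP alias
    "usb0": ("169.254.95.120", "255.255.255.0"),
}

conflicts = overlapping_interfaces(interfaces, haip_labels={"eth2:1"})
print(conflicts)  # ['usb0']
```

Only usb0 is flagged: its 169.254.95.0/24 subnet sits entirely inside the 169.254.0.0/16 range used by HAIP.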

 

 

IV. Resolution

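1. Disable automatic (DHCP) address assignment on usb0, or assign it an IP address manually, so that it no longer takes an address in the 169.254.x.x range.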

2. Restart the cluster services.
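Step 1 can be sketched as a small text transformation of the interface configuration. On Red Hat systems the settings normally live in /etc/sysconfig/network-scripts/ifcfg-usb0, and `BOOTPROTO`, `IPADDR` and `NETMASK` are standard keys in that file. Everything else here is an assumption: the article does not show the real file, so the starting content is hypothetical, and 192.168.250.120 is only a placeholder address outside 169.254.0.0/16.

```python
def set_static_address(ifcfg_text, ipaddr, netmask):
    """Return ifcfg text with DHCP disabled and a static address set."""
    overrides = {"BOOTPROTO": "none", "IPADDR": ipaddr, "NETMASK": netmask}
    out, seen = [], set()
    for line in ifcfg_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in overrides:
            out.append(f"{key}={overrides[key]}")  # replace the existing setting
            seen.add(key)
        else:
            out.append(line)  # keep unrelated lines unchanged
    # Append any keys that were not present in the original text.
    out.extend(f"{k}={v}" for k, v in overrides.items() if k not in seen)
    return "\n".join(out) + "\n"

# Hypothetical starting content; the real ifcfg-usb0 is not shown in the article.
original = "DEVICE=usb0\nBOOTPROTO=dhcp\nONBOOT=yes\n"

# Placeholder address, chosen only because it lies outside 169.254.0.0/16.
updated = set_static_address(original, "192.168.250.120", "255.255.255.0")
print(updated)
```

The function only builds the new text. Writing it back to the file and restarting the network and cluster services remain manual steps that need root privileges.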

 

-------------------------------------------------------------------------------------------------

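This article comes from my technical blog: http://blog.csdn.net/robo23

Please cite the link to the original article when reposting; otherwise legal action may be taken.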

 
