Oracle 11g Data Guard Error 16143 Heartbeat failed to connect to standby 处理方法

本文主要是介绍Oracle 11g Data Guard Error 16143 Heartbeat failed to connect to standby 处理方法,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!


 

 

一.背景说明

 

前段时间一朋友在生产库上误操作,本来他是打算重启一下DG环境,结果在备库命令执行错误。

 

本应该执行

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

 

结果朋友执行成了如下命令:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH; 

 

中断了DG主备库的通信环境,这个finish是用来做Failover时用的。 当时让朋友在主库重新生成了一份standby controlfiles,然后copy到备库,在按正常模式启动就可以了。

   

    因为数据库识别主备库就是通过控制文件来的,所以理论上,只需要重新生成一份standby 控制文件就可以了。  后来朋友测试了一下,正常的拉起来了。

 

今天看到了当时的记录,就顺便模拟一下整个操作,顺便练练手。

 

 

 

二. 演示过程

 

2.1 DG 环境说明

OS: Oracle Linux6.3

DB: 11.2.0.3

 

SQL> select * from v$version;

 

BANNER

--------------------------------------------------------------------------------

Oracle Database 11g Enterprise EditionRelease 11.2.0.3.0 - 64bit Production

PL/SQL Release 11.2.0.3.0 - Production

CORE   11.2.0.3.0      Production

TNS for Linux: Version 11.2.0.3.0 -Production

NLSRTL Version 11.2.0.3.0 - Production

 

 

主库:

 

SQL> select open_mode from v$database;

 

OPEN_MODE

--------------------

READ WRITE

 

SQL>

SQL> set pagesize 200

SQL> select sequence#,applied fromv$archived_log order by sequence# desc;

 

 SEQUENCE# APPLIED

---------- ---------

       14 YES

       14 NO

       13 YES

       13 NO

       12 NO

       12 YES

       11 YES

       11 NO

       10 NO

       10 YES

        9 YES

        9 NO

        8 NO

        8 YES

        7 YES

        7 NO

        6 YES

        6 NO

        5 NO

        4 NO

 

20 rows selected.

 

 

备库:

SQL> select open_mode from v$database;

 

OPEN_MODE

--------------------

MOUNTED

 

SQL>

 

 

SQL> select sequence#,applied fromv$archived_log order by sequence# desc;

 

 SEQUENCE# APPLIED

---------- ---------

       14 YES

       13 YES

       12 YES

       11 YES

       10 YES

        9 YES

         8 YES

        7 YES

        6 YES

 

9 rows selected.

 

 

2.2 模拟故障

 

在备库执行如下命令:

SQL> ALTER DATABASE RECOVER MANAGEDSTANDBY DATABASE FINISH;

Database altered.

 

 

2.3 查看主库 alert log

 

[oracle@dg1 trace]$ pwd

/u01/app/oracle/diag/rdbms/dave_pd/dave/trace

 

[oracle@dg1 trace]$ tail -30 alert_dave.log

Thread 1 advanced to log sequence 14 (LGWRswitch)

 Current log# 2 seq# 14 mem# 0: /u01/app/oracle/oradata/dave/redo02.log

Fri Mar 29 03:30:12 2013

Archived Log entry 17 added for thread 1sequence 13 ID 0x3312f7c4 dest 1:

Fri Mar 29 03:30:13 2013

LNS: Standby redo logfile selected forthread 1 sequence 14 for destination LOG_ARCHIVE_DEST_2

Fri Mar 29 03:43:10 2013

Time drift detected. Please check VKTMtrace file for more details.

Fri Mar 29 04:45:31 2013

Time drift detected. Please check VKTMtrace file for more details.

Fri Mar 29 06:28:35 2013

Time drift detected. Please check VKTMtrace file for more details.

Fri Mar 29 07:08:14 2013

Thread 1 advanced to log sequence 15 (LGWRswitch)

 Current log# 3 seq# 15 mem# 0: /u01/app/oracle/oradata/dave/redo03.log

Fri Mar 29 07:08:16 2013

Archived Log entry 20 added for thread 1sequence 14 ID 0x3312f7c4 dest 1:

Fri Mar 29 07:08:17 2013

LNS: Standby redo logfile selected forthread 1 sequence 15 for destination LOG_ARCHIVE_DEST_2

Fri Mar 29 07:34:48 2013

Time drift detected. Please check VKTMtrace file for more details.

Fri Mar 29 07:48:55 2013

LNS: Attempting destinationLOG_ARCHIVE_DEST_2 network reconnect (3135)

LNS: Destination LOG_ARCHIVE_DEST_2 networkreconnect abandoned

Error 3135 for archive log file 3 to'dave_st'

Errors in file/u01/app/oracle/diag/rdbms/dave_pd/dave/trace/dave_nsa2_3181.trc:

ORA-03135: connection lost contact

LNS: Failed to archive log 3 thread 1sequence 15 (3135)

Fri Mar 29 07:51:45 2013

PING[ARC1]: Heartbeatfailed to connect to standby 'dave_st'. Error is 16143.

 

因为我们在备库执行的Finish命令,导致心跳中断了。

 

2.4 查看备库alert log

 

[oracle@dg2 trace]$ pwd

/u01/app/oracle/diag/rdbms/dave_st/dave/trace

 

[oracle@dg2 trace]$  tail -20 alert_dave.log

Terminal Recovery: thread 1 seq# 15 redorequired

Terminal Recovery:

Recovery of Online Redo Log: Thread 1 Group5 Seq 15 Reading mem 0

 Mem# 0: /u01/app/oracle/oradata/dave/stdbyredo02.log

Identified End-Of-Redo (failover) forthread 1 sequence 15 at SCN 0xffff.ffffffff

Incomplete Recovery applied until change1082890 time 03/29/2013 07:48:53

MRP0: Media Recovery Complete (dave)

Terminal Recovery: successful completion

Fri Mar 29 07:48:49 2013

ARCH: Archival stopped, error occurred.Will continue retrying

ORACLE Instance dave - Archival Error

Forcing ARSCN to IRSCN for TR 0:1082890

Attempt to set limbo arscn 0:1082890 irscn0:1082890

Resetting standby activation ID 856881092(0x3312f7c4)

ORA-16014: log 5 sequence# 15 not archived,no available destinations

ORA-00312: online log 5 thread 1:'/u01/app/oracle/oradata/dave/stdbyredo02.log'

MRP0: Background Media Recovery processshutdown (dave)

Fri Mar 29 07:48:50 2013

Terminal Recovery: completion detected(dave)

Completed: ALTER DATABASE RECOVER MANAGEDSTANDBY DATABASE FINISH

[oracle@dg2 trace]$

 

2.5 在主库切换归档

 

SQL> alter system switch logfile;

System altered.

 

SQL> alter system switch logfile;

System altered.

 

SQL> select sequence#,applied fromv$archived_log order by sequence# desc;

 

 SEQUENCE# APPLIED

---------- ---------

       16 NO

       15 NO

       14 NO

       14 YES

       13 YES

       13 NO

       12 YES

       12 NO

       11 NO

       11 YES

       10 NO

       10 YES

        9 YES

        9 NO

        8 YES

        8 NO

         7 YES

        7 NO

        6 NO

        6 YES

        5 NO

        4 NO

 

22 rows selected.

 

SQL>

 

2.6 再次查看主备库日志

 

主库日志:

 

Fri Mar 29 07:52:46 2013

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 16143.

Fri Mar 29 07:53:47 2013

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 16143.

Fri Mar 29 07:53:49 2013

Thread 1 advanced to log sequence 16 (LGWRswitch)

 Current log# 1 seq# 16 mem# 0: /u01/app/oracle/oradata/dave/redo01.log

Fri Mar 29 07:53:49 2013

Archived Log entry 21 added for thread 1sequence 15 ID 0x3312f7c4 dest 1:

Fri Mar 29 07:53:50 2013

FAL[server, ARC2]: Error 16143 creatingremote archivelog file 'dave_st'

FAL[server, ARC2]: FAL archive failed, seetrace file.

ARCH: FAL archive failed. Archivercontinuing

ORACLE Instance dave - Archival Error.Archiver continuing.

Thread 1 advanced to log sequence 17 (LGWRswitch)

 Current log# 2 seq# 17 mem# 0: /u01/app/oracle/oradata/dave/redo02.log

Fri Mar 29 07:53:57 2013

Archived Log entry 22 added for thread 1sequence 16 ID 0x3312f7c4 dest 1:

 

 

备库日志:

 

Fri Mar 29 07:48:50 2013

Terminal Recovery: completion detected(dave)

Completed: ALTER DATABASE RECOVER MANAGEDSTANDBY DATABASE FINISH

Fri Mar 29 07:51:34 2013

RFS[5]: Assigned to RFS process 9336

RFS[5]: No connections allowed during/afterterminal recovery.

Fri Mar 29 07:52:35 2013

RFS[6]: Assigned to RFS process 9340

RFS[6]: No connections allowed during/afterterminal recovery.

Fri Mar 29 07:53:36 2013

RFS[7]: Assigned to RFS process 9343

RFS[7]: No connections allowed during/afterterminal recovery.

Fri Mar 29 07:53:39 2013

RFS[8]: Assigned to RFS process 9345

RFS[8]: No connectionsallowed during/after terminal recovery.

 

 

2.7 在主库重建standby control file

 

先在备库查看一下控制文件名称,等会创建完后直接覆盖过去:

SQL> show parameter control

 

NAME                                 TYPE        VALUE

----------------------------------------------- ------------------------------

control_file_record_keep_time        integer     7

control_files                        string      /u01/app/oracle/oradata/dave/c

                                                ontrol01.ctl, /u01/app/oracle/

                                                fast_recovery_area/dave/contro

                                                 l02.ctl

control_management_pack_access       string      DIAGNOSTIC+TUNING

 

 

主库创建standby controlfile

SQL> alter database create standbycontrolfile as '/u01/control01.ctl';

Database altered.

 

 

copy到备库的目录,在覆盖原来的控制文件:

 

--先关闭备库:

SQL> shutdown immediate

ORA-01109: database not open

 

Database dismounted.

ORACLE instance shut down.

SQL>

 

 

--copy并覆盖:

[oracle@dg2 trace]$ cd/u01/app/oracle/oradata/dave/

[oracle@dg2 dave]$ ls

control01.ctl    stdbyredo02.log  stdbyredo04.log  system01.dbf undotbs01.dbf

stdbyredo01.log  stdbyredo03.log  sysaux01.dbf     temp01.dbf    users01.dbf

[oracle@dg2 dave]$ mv control01.ctlcontrol01.ctl.bak

[oracle@dg2 dave]$ ls

control01.ctl.bak  stdbyredo02.log  stdbyredo04.log  system01.dbf undotbs01.dbf

stdbyredo01.log    stdbyredo03.log  sysaux01.dbf     temp01.dbf    users01.dbf

 

[oracle@dg2 dave]$ scp192.168.1.20:/u01/control01.ctl 192.168.1.30:/u01/app/oracle/oradata/dave/

The authenticity of host '192.168.1.20(192.168.1.20)' can't be established.

RSA key fingerprint is0d:6a:5f:78:53:a0:bf:54:a8:e3:7e:67:81:06:8d:75.

Are you sure you want to continueconnecting (yes/no)? yes

Warning: Permanently added '192.168.1.20'(RSA) to the list of known hosts.

oracle@192.168.1.20's password:

oracle@192.168.1.30's password:

control01.ctl                                                                        100% 9520KB 865.5KB/s   00:11   

Connection to 192.168.1.20 closed.

[oracle@dg2 dave]$ ls

control01.ctl      stdbyredo01.log  stdbyredo03.log  sysaux01.dbf temp01.dbf     users01.dbf

control01.ctl.bak  stdbyredo02.log  stdbyredo04.log  system01.dbf undotbs01.dbf

[oracle@dg2 dave]$

 

 

[oracle@dg2 archivelog]$ cd/u01/app/oracle/fast_recovery_area/dave/

[oracle@dg2 dave]$ ls

control02.ctl

[oracle@dg2 dave]$ mv control02.ctlcontrol02.ctl.bak

[oracle@dg2 dave]$ ls

control02.ctl.bak

[oracle@dg2 dave]$

 

[oracle@dg2 dave]$ cp control01.ctl/u01/app/oracle/fast_recovery_area/dave/control02.ctl

 

2.8 在正常拉起备库

 

SQL> startup nomount;

ORACLE instance started.

 

Total System Global Area  814227456 bytes

Fixed Size                  2232760 bytes

Variable Size             478154312 bytes

Database Buffers          331350016 bytes

Redo Buffers                2490368 bytes

SQL> alter database mount standby database;

 

Database altered.

 

SQL> alter database recover managedstandby database disconnect from session;

 

Database altered.

 

SQL>

 

2.9 查看主备库日志

 

主库日志:

 

Fri Mar 29 08:00:51 2013

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 16143.

Fri Mar 29 08:01:52 2013

Error 1034 received logging on to thestandby

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 1034.

Fri Mar 29 08:02:56 2013

Error 1034 received logging on to thestandby

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 1034.

Fri Mar 29 08:03:57 2013

Error 1034 received logging on to thestandby

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 1034.

Fri Mar 29 08:04:59 2013

Error 1034 received logging on to the standby

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 1034.

Fri Mar 29 08:06:02 2013

Error 1034 received logging on to thestandby

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 1034.

Fri Mar 29 08:07:05 2013

Error 1034 received logging on to thestandby

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 1034.

Fri Mar 29 08:08:08 2013

PING[ARC1]: Heartbeat failed to connect tostandby 'dave_st'. Error is 16058.

Fri Mar 29 08:08:34 2013

ALTER SYSTEM SETlog_archive_dest_state_2='ENABLE' SCOPE=MEMORY SID='*';

Fri Mar 29 08:08:35 2013

Thread 1 advanced to log sequence 18 (LGWRswitch)

 Current log# 3 seq# 18 mem# 0: /u01/app/oracle/oradata/dave/redo03.log

Fri Mar 29 08:08:36 2013

******************************************************************

LGWR: Setting 'active'archival for destination LOG_ARCHIVE_DEST_2

******************************************************************

Fri Mar 29 08:08:36 2013

Archived Log entry 23added for thread 1 sequence 17 ID 0x3312f7c4 dest 1:

 

 

备库日志:

 

[oracle@dg2 trace]$ tail -20 alert_dave.log

ORA-27037: unable to obtain file status

Linux-x86_64 Error: 2: No such file ordirectory

Additional information: 3

Clearing online redo logfile 3 complete

Media Recovery Waiting for thread 1sequence 15

Fetching gap sequence in thread 1, gapsequence 15-16

Fri Mar 29 08:08:48 2013

RFS[3]: Assigned to RFS process 9707

RFS[3]: Opened log for thread 1 sequence 16dbid 856896964 branch 794014730

Fri Mar 29 08:08:49 2013

RFS[4]: Assigned to RFS process 9705

RFS[4]: Opened log for thread 1 sequence 15dbid 856896964 branch 794014730

Archived Log entry 2 added for thread 1sequence 16 rlc 794014730 ID 0x3312f7c4 dest 2:

Archived Log entry 3 added for thread 1sequence 15 rlc 794014730 ID 0x3312f7c4 dest 2:

Fri Mar 29 08:08:55 2013

Media Recovery Log/u01/archivelog/1_15_794014730.dbf

Media Recovery Log/u01/archivelog/1_16_794014730.dbf

Media Recovery Log/u01/archivelog/1_17_794014730.dbf

Fri Mar 29 08:09:11 2013

Media Recovery Waitingfor thread 1 sequence 18 (in transit)

 

注意这里:

    我们把备库拉起来之后,就自动开始同步了。

 

 

2.10 切换归档测试

 

主库:

SQL> alter system switch logfile;

 

System altered.

 

SQL> select sequence#,applied from v$archived_log order by sequence# desc;

 

 SEQUENCE# APPLIED

---------- ---------

       18 NO

       18 NO

       17 NO

       17 YES

       16 YES

       16 NO

       15 NO

       15 YES

       14 NO

       14 YES

       13 YES

       13 NO

       12 NO

       12 YES

       11 NO

       11 YES

       10 YES

       10 NO

        9 NO

        9 YES

        8 NO

        8 YES

        7 NO

        7 YES

        6 NO

        6 YES

        5 NO

        4 NO

 

28 rows selected.

 

 

备库:

 

SQL> select sequence#,applied from v$archived_log order by sequence# desc;

 SEQUENCE# APPLIED

---------- ---------

       18 YES

       17 YES

       16 YES

       15 YES

 

注意这里,备库已经完全同步了。 之前在我们主库看,18的日志还没有应用,因为我们刚启动备库,应用需要一定的时间。 并且在我们重新配置之后,这里的数字就从15开始了。 是我们中断DG后的数字。 但我们主库还是从4开始计算的。

   

 

小结:

    对于DG通信异常中断的处理,我们仅仅需要重新创建一份standby 的control file就可以了。

 

 

 

 

 

 

---------------------------------------------------------------------------------------

版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!

QQ:492913789

Email:ahdba@qq.com

Blog:  http://www.cndba.cn/dave

Weibo:    http://weibo.com/tianlesoftware

Twitter:  http://twitter.com/tianlesoftware

Facebook: http://www.facebook.com/tianlesoftware

Linkedin: http://cn.linkedin.com/in/tianlesoftware

这篇关于Oracle 11g Data Guard Error 16143 Heartbeat failed to connect to standby 处理方法的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/874508

相关文章

如何使用celery进行异步处理和定时任务(django)

《如何使用celery进行异步处理和定时任务(django)》文章介绍了Celery的基本概念、安装方法、如何使用Celery进行异步任务处理以及如何设置定时任务,通过Celery,可以在Web应用中... 目录一、celery的作用二、安装celery三、使用celery 异步执行任务四、使用celery

oracle数据库索引失效的问题及解决

《oracle数据库索引失效的问题及解决》本文总结了在Oracle数据库中索引失效的一些常见场景,包括使用isnull、isnotnull、!=、、、函数处理、like前置%查询以及范围索引和等值索引... 目录oracle数据库索引失效问题场景环境索引失效情况及验证结论一结论二结论三结论四结论五总结ora

JAVA中整型数组、字符串数组、整型数和字符串 的创建与转换的方法

《JAVA中整型数组、字符串数组、整型数和字符串的创建与转换的方法》本文介绍了Java中字符串、字符数组和整型数组的创建方法,以及它们之间的转换方法,还详细讲解了字符串中的一些常用方法,如index... 目录一、字符串、字符数组和整型数组的创建1、字符串的创建方法1.1 通过引用字符数组来创建字符串1.2

Java调用Python代码的几种方法小结

《Java调用Python代码的几种方法小结》Python语言有丰富的系统管理、数据处理、统计类软件包,因此从java应用中调用Python代码的需求很常见、实用,本文介绍几种方法从java调用Pyt... 目录引言Java core使用ProcessBuilder使用Java脚本引擎总结引言python

SpringBoot操作spark处理hdfs文件的操作方法

《SpringBoot操作spark处理hdfs文件的操作方法》本文介绍了如何使用SpringBoot操作Spark处理HDFS文件,包括导入依赖、配置Spark信息、编写Controller和Ser... 目录SpringBoot操作spark处理hdfs文件1、导入依赖2、配置spark信息3、cont

Apache Tomcat服务器版本号隐藏的几种方法

《ApacheTomcat服务器版本号隐藏的几种方法》本文主要介绍了ApacheTomcat服务器版本号隐藏的几种方法,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需... 目录1. 隐藏HTTP响应头中的Server信息编辑 server.XML 文件2. 修China编程改错误

Java中switch-case结构的使用方法举例详解

《Java中switch-case结构的使用方法举例详解》:本文主要介绍Java中switch-case结构使用的相关资料,switch-case结构是Java中处理多个分支条件的一种有效方式,它... 目录前言一、switch-case结构的基本语法二、使用示例三、注意事项四、总结前言对于Java初学者

使用Python实现大文件切片上传及断点续传的方法

《使用Python实现大文件切片上传及断点续传的方法》本文介绍了使用Python实现大文件切片上传及断点续传的方法,包括功能模块划分(获取上传文件接口状态、临时文件夹状态信息、切片上传、切片合并)、整... 目录概要整体架构流程技术细节获取上传文件状态接口获取临时文件夹状态信息接口切片上传功能文件合并功能小

Oracle Expdp按条件导出指定表数据的方法实例

《OracleExpdp按条件导出指定表数据的方法实例》:本文主要介绍Oracle的expdp数据泵方式导出特定机构和时间范围的数据,并通过parfile文件进行条件限制和配置,文中通过代码介绍... 目录1.场景描述 2.方案分析3.实验验证 3.1 parfile文件3.2 expdp命令导出4.总结

更改docker默认数据目录的方法步骤

《更改docker默认数据目录的方法步骤》本文主要介绍了更改docker默认数据目录的方法步骤,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一... 目录1.查看docker是否存在并停止该服务2.挂载镜像并安装rsync便于备份3.取消挂载备份和迁