xifenfei posted on 2014-3-15 20:43:33

A record of an ORA-00600 [kcrf_resilver_log_1] recovery

Environment
Database version: 11.2.0.1
Platform: Linux
NOARCHIVELOG mode, no backups of any kind

Database startup error
Tue Mar 04 15:22:16 2014
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_54101.trc (incident=16996):
ORA-00600: internal error code, arguments: ,, , [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/orcl/ORCL/incident/incdir_16996/ORCL_ora_54101_i16996.trc
Trace dumping is performing id=
Aborting crash recovery due to error 600
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_54101.trc:
ORA-00600: internal error code, arguments: , , , [], [], [], [], [], [], [], [], []
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_54101.trc:
ORA-00600: internal error code, arguments: , , , [], [], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

Relevant SCN information
Database SCN information
DBID        NAME  OPEN_MODE  CREATED              OPEN_MODE  LOG_MODE      CHECKPOINT_CHANGE#  CTL_CHANGE#
1365059051  ORCL  MOUNTED    2014-01-19 17:18:35  MOUNTED    NOARCHIVELOG  48503067            48510748

Datafile SCN information
TS#  FILE#  FILE_SIZE_G   STATUS  ENABLED     SCN       STOP_SCN
0    1      10.576171875  SYSTEM  READ WRITE  48503067
1    2      .751953125    ONLINE  READ WRITE  48503067
2    3      1.025390625   ONLINE  READ WRITE  48503067
4    4      .0048828125   ONLINE  READ WRITE  48503067
6    5      15            ONLINE  READ WRITE  48503067

Datafile header SCN information
TS#  FILE#  TABLESPACE_NAME  STATUS  ERROR  FORMAT  REC  FUZ  SCN
0    1      SYSTEM           ONLINE         10      NO   YES  48503067
1    2      SYSAUX           ONLINE         10      NO   YES  48503067
2    3      UNDOTBS1         ONLINE         10      NO   YES  48503067
4    4      USERS            ONLINE         10      NO   YES  48503067
6    5      ZNKK             ONLINE         10      NO   YES  48503067
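
The post does not show the script behind these three listings. A minimal sketch that should return the same columns from a mounted instance, assuming the standard dynamic views V$DATABASE, V$DATAFILE and V$DATAFILE_HEADER (the column aliases are chosen here to match the headings above and are not taken from the original):

```sql
-- Run as SYSDBA with the database in MOUNT state.
-- Database-level SCNs recorded in the control file.
select dbid, name, open_mode,
       to_char(created, 'yyyy-mm-dd hh24:mi:ss') as created,
       log_mode,
       to_char(checkpoint_change#)  as checkpoint_change#,
       to_char(controlfile_change#) as ctl_change#
  from v$database;

-- Per-datafile SCNs as recorded in the control file.
select ts#, file#,
       bytes / 1024 / 1024 / 1024  as file_size_g,
       status, enabled,
       to_char(checkpoint_change#) as scn,
       to_char(last_change#)       as stop_scn
  from v$datafile
 order by file#;

-- SCNs read from the datafile headers themselves.
select ts#, file#, tablespace_name, status, error,
       format, recover as rec, fuzzy as fuz,
       to_char(checkpoint_change#) as scn
  from v$datafile_header
 order by file#;
```

An empty STOP_SCN together with FUZ = YES on every file is what one would expect after an instance crash rather than a clean shutdown, which matches the crash recovery attempt in the alert log.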

In this case, Bug 9056657 in Oracle 11.2.0.1 left the redo buffer only partially written to the redo log, so the database raised ORA-00600 during startup and could not be opened.

Recovery process
ORA-00600 handling
Tue Mar 04 15:22:16 2014
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_54101.trc (incident=16996):
ORA-00600: internal error code, arguments: , , , [], [], [], [], [], [], [], [], []

This shows that opening the database requires reading the redo log for instance recovery, but the bug left the redo log in an abnormal state, so instance recovery cannot complete. In other words, the database cannot finish instance recovery on its own, and manual intervention is needed to force it open.

Trying _allow_resetlogs_corruption = TRUE, which skips the consistency check during open, to force the database open produced the following errors:
Thu Mar 06 16:41:02 2014
SMON: enabling cache recovery
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_60342.trc (incident=18197):
ORA-00600: internal error code, arguments:, , , , , , [], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/orcl/ORCL/incident/incdir_18197/ORCL_ora_60342_i18197.trc
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_60342.trc:
ORA-00600: internal error code, arguments: , , , , , , [], [], [], [], [], []
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_60342.trc:
ORA-00600: internal error code, arguments:, , , , , , [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 60342): terminating the instance due to error 600
Instance terminated by USER, pid = 60342
ORA-1092 signalled during: alter database open resetlogs...
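
The post does not list the commands used for the forced open. A sketch of how this hidden parameter is commonly applied, assuming the instance uses an spfile and that a cold copy of every datafile, control file and redo log has been taken beforehand (the exact sequence the author ran is an assumption):

```sql
-- Assumption: spfile in use; run as SYSDBA in SQL*Plus with the database mounted.
alter system set "_allow_resetlogs_corruption" = true scope = spfile;
shutdown immediate
startup mount
-- RESETLOGS requires a preceding incomplete recovery; reply CANCEL at the prompt.
recover database until cancel;
-- Opens without the usual datafile consistency check.
alter database open resetlogs;
```

In the case described here, the final statement is the one during which ORA-1092 was signalled.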

ORA-600 recovery
This is a very common ORA-600 error, and the usual fix is to advance the SCN. After advancing the SCN with event 10015 and trying to open the database, the following errors appeared:
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_smon_60678.trc (incident=19357):
ORA-00600: internal error code, arguments:, [], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/orcl/ORCL/incident/incdir_19357/ORCL_smon_60678_i19357.trc
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_60694.trc (incident=19405):
ORA-00600: internal error code, arguments:, [], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/orcl/ORCL/incident/incdir_19405/ORCL_ora_60694_i19405.trc
Doing block recovery for file 3 block 1443
Resuming block recovery (PMON) for file 3 block 1443
Block recovery from logseq 2, block 64 to scn 1073742051
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/oradata/orcl/redo02.log
Block recovery stopped at EOT rba 2.67.16
Block recovery completed at rba 2.67.16, scn 0.1073742050
Doing block recovery for file 3 block 128
Resuming block recovery (PMON) for file 3 block 128
Block recovery from logseq 2, block 64 to scn 1073742047
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/oradata/orcl/redo02.log
Block recovery completed at rba 2.65.16, scn 0.1073742049
Errors in file /opt/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_smon_60678.trc:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00600: internal error code, arguments: , [], [], [], [], [], [], [], [], [], [], []

ORA-600 recovery
Because undo segment names in 11g carry a timestamp suffix, and strings does not reliably reveal the exact names, I used dul to extract the names directly from the datafile, then masked those segments with _corrupted_rollback_segments, after which the database opened successfully.

During the final export, ORA-8013 also appeared; the table's intact rows were extracted with PL/SQL, which completed the recovery.

This recovery used quite a few hidden parameters and events, which may leave the data inconsistent. Rebuilding the database by logical means is strongly recommended to keep the data safe and stable.
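
Neither the event setting nor the parameter value appears in the post. A sketch of the two steps under stated assumptions: the `adjust_scn` form of event 10015 is the commonly cited one, level 1 is a guess based only on the SCNs in the log above (1073742047 to 1073742051) sitting just past 2^30 = 1073741824, and the undo segment names are hypothetical placeholders for the names recovered with dul:

```sql
-- Step 1 (assumption): advance the SCN at the next open via event 10015.
startup mount
alter system set event = '10015 trace name adjust_scn level 1' scope = spfile;
shutdown immediate
startup        -- this open attempt hit the undo-related ORA-600 shown above

-- Step 2: tell Oracle to skip the damaged undo segments.
-- The two segment names are hypothetical placeholders, not values from the post.
startup mount
alter system set "_corrupted_rollback_segments" =
  '_SYSSMU1_0000000001$', '_SYSSMU2_0000000002$' scope = spfile;
shutdown immediate
startup
```

Both settings would normally be removed again once the data has been exported, since they are only meant to get a damaged database open long enough to salvage it.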
For the PDF version, see: A record of an ORA-00600 recovery

xifenfei posted on 2014-3-15 20:46:11

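A friend's Oracle 11.2.0.1 database on Windows x64 raised ORA-00600 at startup, and he asked me to take a look. Analysis showed the main cause to be Unpublished Bug 9056657.
Database startup error
The database reports ORA-00600 on open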
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: , , , [],
[], [], [], [], [], [], [], []
Alert log errors
Sat Mar 01 18:40:44 2014
alter database open
Beginning crash recovery of 1 threads
parallel recovery started with 3 processes
Started redo scan
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc(incident=61360):
ORA-00600: internal error code, arguments: , , , [], [], [], [], [], [], [], [], []
Incident details in: f:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_61360\orcl_ora_6432_i61360.trc
Aborting crash recovery due to error 600
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc:
ORA-00600: internal error code, arguments: , , , [], [], [], [], [], [], [], [], []
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc:
ORA-00600: internal error code, arguments: , , , [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...
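Analyzing the relevant SCNs
Control file SCN
http://www.xifenfei.com/wp-content/uploads/2014/03/1.jpg
Datafile SCNs in the control file
http://www.xifenfei.com/wp-content/uploads/2014/03/2.jpg
Datafile header SCN
http://www.xifenfei.com/wp-content/uploads/2014/03/3.jpg
From this we can see that the datafile header SCNs and the datafile SCNs recorded in the control file both indicate a clean shutdown, with an SCN of 16574746, whereas the database SCN recorded in the control file is 16551515. This suggests that, for some reason, part of the SCN records in the control file are abnormal.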
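
The three screenshots are not reproduced as text. A sketch of a single query that puts the three SCN sources side by side, assuming the standard V$ views (the `src` labels and column aliases are made up for this example). After a clean shutdown one would expect FUZZY = NO in every header and a non-null LAST_CHANGE# equal to CHECKPOINT_CHANGE# for every datafile:

```sql
-- Run as SYSDBA with the database in MOUNT state.
select 'control file: database' as src,
       null                        as file#,
       to_char(checkpoint_change#) as checkpoint_scn,
       null                        as stop_scn,
       null                        as fuzzy
  from v$database
union all
select 'control file: datafile', file#,
       to_char(checkpoint_change#), to_char(last_change#), null
  from v$datafile
union all
select 'datafile header', file#,
       to_char(checkpoint_change#), null, fuzzy
  from v$datafile_header
 order by 1, 2;
```

With the values quoted in the post, the first row would show 16551515 while every datafile row shows 16574746, which is the mismatch the analysis rests on.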

Solution
Because the control file SCN is abnormal, consider either recreating the control file or recovering with using backup controlfile
SQL> select group#,status,sequence# from v$log;

    GROUP# STATUS            SEQUENCE#
---------- ---------------- ----------
         1 CURRENT                1510
         3 ACTIVE                 1509
         2 ACTIVE                 1508

    GROUP# MEMBER
---------- --------------------------------------------------
         3 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
         2 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
         1 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 16574746 generated at 03/01/2014 13:10:11 needed for thread 1
ORA-00289: suggestion : F:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2014_03_01\O1_MF_1_1510_%U_.ARC
ORA-00280: change 16574746 for thread 1 is in sequence #1510


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;

Database altered.
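
Recovery asked for sequence #1510, which is the CURRENT group in the V$LOG output, so the member file of that group was typed in at the prompt. A sketch of a join that shows status, sequence and file name in one result, assuming the standard V$LOG and V$LOGFILE views (the post appears to have queried them separately):

```sql
-- Map each online redo log sequence to its member file.
select l.group#, l.status, l.sequence#, f.member
  from v$log l
  join v$logfile f on f.group# = l.group#
 order by l.sequence#;
```

With the output shown above, sequence 1510 belongs to group 1, whose member is F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG.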

For details, see the 惜分飞 blog: ORA-00600 abnormal recovery

xifenfei posted on 2014-3-15 20:46:54

The same ORA-00600 error twice, but two completely different approaches to handling it

travel.liu posted on 2014-3-15 20:50:29

:) Impressive! Learned a lot

jeffreyli posted on 2014-3-15 22:00:48

Bump.

夜无伤 posted on 2014-3-15 22:15:01

Learned a lot :lol