遭遇ORA-1194,00354等错误,在线日志损坏,处理思路过程如下,
--告警报错信息 Thu Jul 24 00:16:08 2014 ORA-1194 signalled during: alter database open resetlogs... Thu Jul 24 00:16:23 2014 ALTER DATABASE RECOVER database until cancel Thu Jul 24 00:16:23 2014 Media Recovery Start parallel recovery started with 2 processes ORA-279 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Jul 24 00:18:35 2014 ALTER DATABASE RECOVER LOGFILE '/oradata/ora10g/lu10g/redo03.log' Thu Jul 24 00:18:35 2014 Media Recovery Log /oradata/ora10g/lu10g/redo03.log Thu Jul 24 00:18:36 2014 Errors in file /ora10g/admin/lu10g/udump/lu10g_ora_14994.trc: ORA-00354: corrupt redo log block header ORA-00353: log corruption near block 1508 change 4305448435 time 07/23/2014 23:56:57 ORA-00334: archived log: '/oradata/ora10g/lu10g/redo03.log' Recovery interrupted! Recovered data files to a consistent state at change 10480752
通过告警报错ORA-00354: corrupt redo log block header 从该错误可以看出当前日志的redo block的header出现问题,导致oracle无法读取redo进行恢复。通过以下检查发现数据库文件头都一致,没有需要恢复的文件,但是由于数据库是非正常关闭,last_change scn没有被写入,数据库判断需要恢复,则读取相应的redo log group 3,但是此时该文件已经损坏,尝试清理log,但是因为group 3在宕机时候为current状态,所以不能被清楚,则尝试通过设置_allow_reset_logs_corruption参数让系统跳过该错误,具体如下过程:
--检查文件状态 SQL> select file#,online_status,to_char(change#,'999999999999') from v$recover_file; no rows selected SQL> select file#,to_char(checkpoint_change#,'999999999999'), 2 to_char(last_change#,'999999999999') from v$datafile; FILE# TO_CHAR(CHECK TO_CHAR(LAST_ ---------- ------------- ------------- 1 10480749 2 10480749 3 10480749 4 10480749 SQL> / GROUP# THREAD# SEQUENCE# BYTES MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- 1 1 31 52428800 1 YES INACTIVE 10391212 19-JUL-14 3 1 33 52428800 1 NO CURRENT 10480310 23-JUL-14 2 1 32 52428800 1 YES INACTIVE 10427919 20-JUL-14 SQL> alter database clear unarchived logfile group 3; alter database clear unarchived logfile group 3 * ERROR at line 1: ORA-01624: log 3 needed for crash recovery of instance lu10g (thread 1) ORA-00312: online log 3 thread 1: '/oradata/ora10g/lu10g/redo03.log' --在pfile中加入参数 _allow_resetlogs_corruption=true _allow_error_simulation=TRUE
重启数据库以resetlog方式打开,得到以下告警信息:
Thu Jul 24 00:45:45 2014 alter database open resetlogs RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. Thu Jul 24 00:45:45 2014 Errors in file /ora10g/admin/lu10g/udump/lu10g_ora_16481.trc: ORA-00265: instance recovery required, cannot set ARCHIVELOG mode Archive all online redo logfiles failed:265 RESETLOGS after incomplete recovery UNTIL CHANGE 10480752 Resetting resetlogs activation ID 2561579617 (0x98ae9a61) Thu Jul 24 00:45:47 2014 Setting recovery target incarnation to 3 Thu Jul 24 00:45:47 2014 Assigning activation ID 2580873586 (0x99d50172) LGWR: STARTING ARCH PROCESSES ARC0 started with pid=16, OS id=16490 Thu Jul 24 00:45:47 2014 ARC0: Archival started ARC1: Archival started LGWR: STARTING ARCH PROCESSES COMPLETE ARC1 started with pid=17, OS id=16492 Thu Jul 24 00:45:47 2014 Thread 1 opened at log sequence 1 Current log# 1 seq# 1 mem# 0: /oradata/ora10g/lu10g/redo01.log Successful open of redo thread 1 Thu Jul 24 00:45:47 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Jul 24 00:45:47 2014 ARC0: Becoming the 'no FAL' ARCH ARC0: Becoming the 'no SRL' ARCH Thu Jul 24 00:45:47 2014 ARC1: Becoming the heartbeat ARCH Thu Jul 24 00:45:47 2014 SMON: enabling cache recovery Thu Jul 24 00:45:47 2014 Errors in file /ora10g/admin/lu10g/udump/lu10g_ora_16481.trc: ORA-00600: internal error code, arguments: [2662], [0], [10480758], [0], [10480787], [4194313], [], [] Thu Jul 24 00:45:48 2014 Errors in file /ora10g/admin/lu10g/udump/lu10g_ora_16481.trc: ORA-00600: internal error code, arguments: [2662], [0], [10480758], [0], [10480787], [4194313], [], [] Thu Jul 24 00:45:48 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 16481 ORA-1092 signalled during: alter database open resetlogs...
至此数据库2662错误已经出来了,那就好办了,2662的处理办法参考手工推进SCN的办法以及《谈谈adjust_scn的计算方法》