这个错误比较偏门与诡异,mos上是有个文章关于这个错误,但是只是说了bug,涉及版本号.
还有个文档说是和6061错误一起伴随,是索引的logical corruption伴随一起出现,而实际情况是否认的.
我的场景是9208 sun sparc 64bit平台,9月到现在出现该错误3次,伴随错误主要是2662和25012.期间怀疑过(trace文件中报错的SQL包含的对象)涉及的2张表和索引过大导致,这几个对象没做过碎片清理操作,而且其中有个表和索引已经超过300G,第一次和第二次前后不短的时间出现问题后将该表数据清理剩余500M,同时对表和索引进行重建,过3周后同样问题继续出现,mos也没有吻合的现象解释和bug,首先就是报错ktbair1错误,接着undo和内存中的数据出现不一致报错2662和25012,重启后smon自行修复完成.通过rman对database分别进行phy corruption和logical corruption 进行检查都没发现对应的错误,
我的报错信息:
Ora-600 errors reported in alert log file, Wed Dec 17 12:58:10 2014 Errors in file /oracle/admin/boss/udump/boss_ora_19696.trc: ORA-00600: internal error code, arguments: [ktbair1], [0], [6], [], [], [], [], [] Wed Dec 17 12:58:11 2014 Errors in file /oracle/admin/boss/udump/boss_ora_19696.trc: ORA-07445: exception encountered: core dump [0000000100638F9C] [SIGBUS] [Invalid address alignment] [0x203C2290D2BFFEF] [] [] ORA-00600: internal error code, arguments: [ktbair1], [0], [6], [], [], [], [], [] Wed Dec 17 12:59:01 2014 Errors in file /oracle/admin/boss/udump/boss_ora_6438.trc: ORA-00600: internal error code, arguments: [2662], [281], [2416071487], [513], [370935493], [3305242969], [], [] ORA-08103: object no longer exists Wed Dec 17 12:59:01 2014 Errors in file /oracle/admin/boss/udump/boss_ora_6438.trc: ORA-00603: ORACLE server session terminated by fatal error ORA-00600: internal error code, arguments: [2662], [281], [2416071487], [513], [370935493], [3305242969], [], [] ORA-08103: object no longer exists Wed Dec 17 12:59:09 2014 Errors in file /oracle/admin/boss/udump/boss_ora_17808.trc: ORA-00600: internal error code, arguments: [25012], [12], [606], [], [], [], [], [] . . Wed Dec 17 13:31:59 2014 Errors in file /oracle/admin/boss/udump/boss_ora_27048.trc: ORA-00600: internal error code, arguments: [2662], [281], [2422679772], [513], [370935493], [3305242969], [], [] Wed Dec 17 13:36:55 2014 Shutting down instance: further logons disabled Shutting down instance (immediate) License high water mark = 648 Wed Dec 17 13:41:56 2014 Shutting down instance (abort) License high water mark = 648 Instance terminated by USER, pid = 11139 Wed Dec 17 13:43:20 2014 Starting ORACLE instance (force) . . Completed: ALTER DATABASE MOUNT Wed Dec 17 13:43:57 2014 ALTER DATABASE OPEN Wed Dec 17 13:43:57 2014 Beginning crash recovery of 1 threads Wed Dec 17 13:43:57 2014 Started redo scan Wed Dec 17 13:43:57 2014 Completed redo scan 788 redo blocks read, 64 data blocks need recovery Wed Dec 17 13:43:57 2014 Started recovery at Thread 1: logseq 323774, block 803834, scn 0.0 Wed Dec 17 13:43:57 2014 Recovery of Online Redo Log: Thread 1 Group 6 Seq 323774 Reading mem 0 Mem# 0 errs 0: /tzboss_sysdata/boss/redo06.log Wed Dec 17 13:43:57 2014 Completed redo application Wed Dec 17 13:43:57 2014 Ended recovery at Thread 1: logseq 323774, block 804622, scn 281.2424682375 64 data blocks read, 64 data blocks written, 788 redo blocks read Crash recovery completed successfully From trace file ora-19696.trc, BH (0x58bfede00) file#: 64 rdba: 0x101a691a (64/1730842) class 1 ba: 0x58bd7c000 set: 80 dbwrid: 7 obj: 672167 objn: 66172 hash: [551f8af00,a67d5b010] lru: [52bfdfe68,5a4fcd768] ckptq: [NULL] fileq: [NULL] use: [a683b8eb8,a683b8eb8] wait: [NULL] st: CR md: EXCL rsop: 0x0 tch: 0 cr:[[scn: 0x0119.8e356168],[xid: 0x000a.02c.008b183e],[uba: 0x2dc978b1.2a93.04],[cls: 0x0119.900211c1],[sfl: 0x1]] flags: buffer_dirty mod_started only_sequential_access change state: ACTIVE buffer tsn: 15 rdba: 0x0203c229 (8/246313) scn: 0x06c5.072c0007 seq: 0x02 flg: 0x03 tail: 0x128c0600 frmt: 0x04 chkval: 0x1f46 type: 0xc1=unknown Hex dump of corrupt header 4 = CORRUPT
涉及的对象和文件:
SQL> col object_name format a30 SQL> select owner,object_name,object_type,subobject_name,status,temporary from dba_objects where data_object_id=672167; select file#,rfile#,name,ts# from v$datafile where rfile# in (64,8); no rows selected SQL> FILE# RFILE# NAME TS# ---------- ---------- -------------------------------------------------- ---------- 8 8 /tzboss_oradata/boss/ms_01.dbf 8 64 64 /tzboss_oradata/boss/bm_02.dbf 15
逻辑坏块验证结果:
Recovery Manager: Release 9.2.0.8.0 - 64bit Production Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved. connected to target database: BOSS (DBID=1157181921) RMAN> run { 2> allocate channel d1 type disk; 3> allocate channel d2 type disk; 4> allocate channel d3 type disk; 5> allocate channel d4 type disk; 6> backup validate check logical database; 7> } 8> using target database controlfile instead of recovery catalog allocated channel: d1 channel d1: sid=240 devtype=DISK allocated channel: d2 channel d2: sid=21 devtype=DISK allocated channel: d3 channel d3: sid=277 devtype=DISK allocated channel: d4 channel d4: sid=179 devtype=DISK Starting backup at 05-NOV-14 channel d1: starting full datafile backupset channel d1: specifying datafile(s) in backupset input datafile fno=00009 name=/tzboss_oradata/boss/cs_01.dbf input datafile fno=00028 name=/tzboss_index/boss/cs_idx_01.dbf input datafile fno=00183 name=/tzboss_sysdata/boss/undotbs1_05.dbf .... input datafile fno=00226 name=/tzboss_oradata/boss/ei_05.dbf input datafile fno=00231 name=/tzboss_oradata/boss/tr_03.dbf input datafile fno=00243 name=/tzboss_oradata/boss/bp_18.dbf channel d2: backup set complete, elapsed time: 00:02:36 channel d2: starting full datafile backupset channel d2: specifying datafile(s) in backupset input datafile fno=00251 name=/tzboss_oradata/boss/pm_04.dbf input datafile fno=00252 name=/tzboss_oradata/boss/ei_06.dbf channel d4: backup set complete, elapsed time: 00:02:56 channel d1: backup set complete, elapsed time: 00:02:16 channel d2: backup set complete, elapsed time: 00:01:40 channel d3: backup set complete, elapsed time: 00:02:40 Finished backup at 06-NOV-14 released channel: d1 released channel: d2 released channel: d3 released channel: d4 Recovery Manager complete. SQL> select count(*) from v$database_block_corruption; COUNT(*) ---------- 0
oracle官方1级SR的诊断结果为
-- Search -- Note: This is INTERNAL ONLY research. No action should be taken by the customer on this information. This is research only, and may NOT be applicable to your specific situation. Searched on: Bug 6401576 : ORA-600 [KTBAIR1] / ORA-600 [KCBZPB_1] / CORRUPTION MESSAGES --> DB CRASH
但是实际上我这里从未出现过6101的错误,该bug按照如下的分析并不吻合我这个数据库所在的情况.但是按照解释这个bug也可以产生表示为ktbair1的错误,最终的流程都是先抛错误ORA-600 [KTBAIR1] / ORA-600 [KCBZPB_1] / CORRUPTION MESSAGES –> DB CRASH .进一步的核验正在进行,目前打算将报错涉及到的DBA地址所在的数据文件对象转移到其他表空间,然后看看是否能避开此类错误,如果还存在则对该版本打上6401576的补丁继续观察,与此错误的纠缠还没完.
Bug 6401576 : ORA-600 [KTBAIR1] / ORA-600 [KCBZPB_1] / CORRUPTION MESSAGES –> DB CRASH 描述:
Hdr: 6401576 9.2.0.8.0 RDBMS 9.2.0.8 RAM INDEX PRODID-5 PORTID-59 ORA-600 Abstract: ORA-600 [KTBAIR1] / ORA-600 [KCBZPB_1] / CORRUPTION MESSAGES --> DB CRASH *** 09/07/07 03:47 am *** *** 09/07/07 03:49 am *** *** 09/07/07 04:19 am *** (CHG: Sta->16) *** 09/07/07 04:19 am *** *** 09/07/07 04:32 am *** *** 09/07/07 04:33 am *** *** 09/07/07 07:30 am *** (CHG: Sta->10) *** 09/07/07 07:30 am *** *** 09/07/07 08:01 am *** *** 09/07/07 08:01 am *** *** 09/07/07 08:14 am *** *** 09/07/07 08:21 am *** ESCALATED *** 09/07/07 08:21 am *** *** 10/22/07 03:13 am *** (CHG: Sta->33 Asg->RDBMSREP SubComp->UNKNOWN) *** 10/22/07 03:13 am *** *** 10/22/07 03:13 am *** *** 10/26/07 01:05 am *** (CHG: Sta->16) *** 10/26/07 01:05 am *** *** 10/26/07 01:07 am *** *** 10/29/07 07:03 am *** *** 10/29/07 07:03 am *** *** 10/29/07 07:13 am *** *** 10/29/07 11:39 am *** (CHG: Sta->10) *** 10/29/07 11:39 am *** *** 10/30/07 03:19 am *** *** 10/31/07 02:32 am *** (CHG: Sta->16) *** 10/31/07 02:32 am *** *** 10/31/07 02:33 am *** *** 10/31/07 02:39 am *** *** 10/31/07 02:49 am *** *** 10/31/07 02:50 am *** *** 10/31/07 04:31 am *** (CHG: Sta->10) *** 10/31/07 04:31 am *** *** 10/31/07 07:21 am *** *** 10/31/07 07:21 am *** *** 11/02/07 06:40 am *** (CHG: Sta->16) *** 11/02/07 06:40 am *** *** 11/03/07 03:53 am *** (CHG: Sta->11 SubComp->RAM INDEX) *** 11/03/07 03:53 am *** *** 11/03/07 03:54 am *** *** 11/03/07 03:54 am *** *** 11/03/07 03:58 am *** *** 11/06/07 08:27 pm *** *** 11/06/07 08:27 pm *** *** 11/06/07 08:39 pm *** *** 11/06/07 08:39 pm *** (CHG: Sta->30) *** 11/06/07 08:39 pm *** *** 11/06/07 11:55 pm *** *** 11/08/07 03:36 am *** (CHG: Sta->11) *** 11/08/07 03:36 am *** *** 11/08/07 06:51 am *** *** 11/12/07 06:30 pm *** (CHG: Sta->30) *** 11/12/07 06:30 pm *** *** 11/14/07 03:38 am *** (CHG: Sta->11) *** 11/14/07 03:38 am *** *** 11/14/07 03:53 am *** *** 11/14/07 03:28 pm *** *** 11/21/07 04:35 pm *** (CHG: Sta->30) *** 11/21/07 04:35 pm *** *** 11/26/07 07:05 pm *** *** 12/05/07 06:23 am *** (CHG: Sta->11) *** 12/05/07 06:23 am *** *** 12/14/07 06:04 pm *** (CHG: Sta->30) *** 12/14/07 06:04 pm *** *** 12/18/07 03:58 am *** (CHG: Sta->11) *** 12/18/07 03:58 am *** *** 12/19/07 03:42 am *** *** 12/19/07 08:23 am *** *** 12/20/07 11:35 am *** *** 01/03/08 03:14 pm *** *** 01/07/08 06:56 am *** *** 01/07/08 08:36 am *** *** 01/11/08 03:19 pm *** *** 01/15/08 11:55 am *** *** 01/16/08 08:10 am *** (ADD: Impact/Symptom->DATA CORRUPTION ) *** 01/16/08 08:11 am *** (ADD: Impact/Symptom->MEMORY CORRUPTION ) *** 01/16/08 08:12 am *** *** 01/16/08 08:12 am *** (CHG: Sta->80) *** 01/16/08 08:22 am *** *** 01/16/08 08:23 am *** Rediscovery Information: If ORA-600[6101] is encountered during the compaction of a compressed index block, then it is this bug. There can also be other internal errors such as ktbair1, kcbzpb_1, 4519. Workaround: none Release Notes: ]] Internal errors could occur during the compaction of a compressed ]] index block.
通过基于rowid的表全扫,也未遭遇报错!
oracle恢复无法确认的错误,即使立为新bug也无法开发补丁.