Skip to content

异常恢复

从坏块中恢复未损坏的数据

前面介绍通过skip坏块(event或者dbms包)和构造rowid的方式,以及通过salvage脚本从存在坏块的表中恢复数据,同时也可以通过mdata恢复工具来unload出表中的数据,这里补充一个场景,就是坏块中未损坏的数据的恢复。

从已经损坏的块中尽可能的恢复数据的思路依旧参考的是通过rowid的方式来实现恢复,这里讨论oracle 8i以及8i以上版本的数据库.

 

rowid基本结构这里不做详细解释,前面有介绍过,这下面是官方的介绍。

       OOOOOO = is a base 64 encoding of the 32-bit dataobj# (Data object
                number was introduced in 8.0 to track versions of the same
                segment because certain operations can change the version.
                It is used to discover stale ROWIDs and stale undo records)
          FFF = is a base 64 encoding of the relative file number
       BBBBBB = is a base 64 encoding of the block number
          SSS = is a base 64 encoding of the slot (row) number

   Note that the ROWID contains the relative file number which is distinct from
   the absolute file number and new for Oracle8.  A relative file number is
   relative to the tablespace (meaning a tablespace can have a first, second,
   third file, etc.) and an absolute file number is absolute in the whole
   system.  Two different files may have the same relative number.

   例如 "SELECT ACCT_NO, ROWID from EXAMPLE;" 获得的如下结果:

   ACCT_NO    ROWID
   ---------- ------------------
        12345 AAAAh3AAGAAACJAAAA
        19283 AAAAh3AAGAAACJAAAB
        22345 AAAAh4AAFAAAAADAAA
        60372 AAAAh4AAFAAAAADAAB

     dbms_rowid构造rowid的函数:

              ROWID_CREATE(rowid_type    IN number,
                           object_number IN number,
                           relative_fno  IN number,
                           block_number  IN number,
                           row_number    IN number)
              return ROWID;

     -- rowid_type      - type (restricted=0/extended=1)
     -- object_number   - data object number
     -- relative_fno    - relative file number
     -- block_number    - block number in this file
     -- row_number      - row number in this block



    使用高低位的rowid来跳过坏块使用基于rowid的方式扫描出相关rows,这和上篇介绍的构造的rowid方式是一样的。

     The "LOW_RID" is the lowest rowid INSIDE the corrupt block:

       SELECT dbms_rowid.rowid_create(1,<DATA_OBJECT_ID>,<RFN>,<BL>,0) LOW_RID
	 from DUAL;

     The "HI_RID" is the first rowid AFTER the corrupt block:

       SELECT dbms_rowid.rowid_create(1,<DATA_OBJECT_ID>,<RFN>,<BL>+1,0) HI_RID
	 from DUAL;


	CREATE TABLE salvage_table AS
	 SELECT /*+ ROWID(A) */ * FROM <owner.tablename> A
	  WHERE rowid < '<low_rid>'
	;

	INSERT INTO salvage_table 
	 SELECT /*+ ROWID(A) */ * FROM <owner.tablename> A
	  WHERE rowid >= '<hi_rid>'
	;

    如果是分区表,则通过针对存在坏块的分区的方式即可

	CREATE TABLE salvage_table AS
	 SELECT /*+ ROWID(A) */ * 
	   FROM <owner.tablename> PARTITION (<partition_name>) A
	  WHERE rowid < '<lo_rid>'
	;

	INSERT INTO salvage_table 
	 SELECT /*+ ROWID(A) */ * 
	   FROM <owner.tablename> PARTITION (<partition_name>) A
	  WHERE rowid >= '<hi_rid>'
	;

   如果表中带有long字段需要使用exp/imp的方式来进行相关数据的导入导出,同时要带有where条件来帅选出rowid的范围。
   剩下就是要讨论的从损坏的块中恢复数据,当然如果坏块是段头那么就没必要从段头中恢复rows,可以通过如下的sql判断损坏的块是否表段头。
 
   select file_id,block_id,blocks,extent_id 
   from dba_extents 
   where owner='<owner>' 
     and segment_name='<table_name>' 
     and segment_type='TABLE'
   order by extent_id;

     FILE_ID  BLOCK_ID    BLOCKS EXTENT_ID
   --------- --------- --------- ---------
           8     94854     20780         0 <- EXTENT_ID 为0则代表坏块为段头


  Finding out about data in the Corrupt Block
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  构造rowid的方式根据块号可以构造出坏块里面row地址,前面有说明,如果表中存在索引,则可以尝试通过索引来判断坏块中row的信息,可以结合上面提到的坏块上下块的高低位rowid来扫描出坏块里的rows信息,具体如下:
 
  如果索引所在字段要求为非空,则可以通过以下sql的索引快速全扫描来实现坏块rowid的获取: 
	SELECT /*+ INDEX_FFS(X <index_name>) */ 
		<index_column1>, <index_column2> ...
	  FROM <tablename> X
	 WHERE rowid >= '<low_rid>'
	   AND rowid <  '<hi_rid>'
	;

如果索引所在的列允许null,则无法使用索引快速全扫描并且必须使用范围扫描。这需要知道确保领先索引列的最小可能值,具体sql如下:

	SELECT /*+ INDEX(X <index_name>) */ 
		<index_column1>, <index_column2> ...
	  FROM <tablename> X
	 WHERE rowid >= '<low_rid>'
	   AND rowid <  '<hi_rid>'
	   AND <index_column1> >= <min_col1_value>
	;

 

8i以上版本的恢复案例:

SQL> select * from scott.partitionexample;

	ORA-01578: ORACLE data block corrupted (file # 7, block # 12698)
	ORA-01110: data file 7: '/oracle1/oradata/V816/oradata/V816/users01.dbf'

    >>  <RFN> = 7 , <BL> = 12698 , <AFN> = 7

	SQL> SELECT tablespace_name, segment_type, owner, segment_name
               FROM dba_extents
              WHERE file_id =7
		AND 12698 between block_id AND block_id + blocks - 1 ;

        TABLESPACE_NAME  SEGMENT_TYPE       OWNER   SEGMENT_NAME
        ---------------  ------------       -----   ------------
        USERS            TABLE PARTITION    SCOTT   PARTITIONEXAMPLE

        SQL> SELECT partition_name FROM dba_extents
              WHERE file_id =7
	        AND 12698 between block_id AND block _id + blocks - 1;

	PARTITION_NAME
	------------------------------
	PARTEX2

        SQL> SELECT data_object_id
	       FROM dba_objects
	      WHERE object_name = 'PARTITIONEXAMPLE'  and owner='SCOTT'
	        AND subobject_name= 'PARTEX2';

	DATA_OBJECT_ID
 	---------------
     	88145
       
        找出高低位的rowid并扫描相关数据

	SQL> select dbms_rowid.rowid_create(1, 88145,7,12698,0) from dual;

	DBMS_ROWID.ROWID_C
	------------------
	AAAVhRAAHAAADGaAAA

	SQL>  select dbms_rowid.rowid_create(1, 88145,7,12699,0) from dual;

	DBMS_ROWID.ROWID_C
	------------------
	AAAVhRAAHAAADGbAAA

	SQL> SELECT /*+ ROWID(A) */ * 
	       FROM scott.partitionexample A
     	      WHERE rowid < 'AAAVhRAAHAAADGaAAA';

	COLUMN1    COLUMN2
	---------- ----------
	        15 a
	       ...

	SQL> SELECT /*+ INDEX_FFS(A PARTEXAM) */  column1
	       FROM  scott.partitionexample A
	      WHERE  rowid >= 'AAAVhRAAHAAADGaAAA'
		AND  rowid <  'AAAVhRAAHAAADGbAAA' ;
	
	COLUMN1
	----------
	        25
	...

 

 


 

 

REM NAME: salvage.sql
REM NOTE: This program will not handle long, long raw colums or chained rows.
SET ECHO OFF;  
SET SERVEROUTPUT ON;  
  
CREATE OR REPLACE FUNCTION dectohex(a IN NUMBER) RETURN VARCHAR2 IS  
      x VARCHAR2(8) := '';  
        y VARCHAR2(1);  
        z NUMBER;  
        w NUMBER;  
    BEGIN  
        IF a > POWER(2,32) OR a < 0  THEN  
            RAISE invalid_number;  
        END IF;  
        w := a;  
        WHILE w > 0 LOOP  
 z := w MOD 16;  
        IF z = 10 THEN y := 'A';  
        ELSIF z = 11 THEN y := 'B';  
        ELSIF z = 12 THEN y := 'C';  
        ELSIF z = 13 THEN y := 'D';  
     ELSIF z = 14 THEN y := 'E';  
        ELSIF z = 15 THEN y := 'F';  
        ELSE y := TO_CHAR(z);  
        END IF;  
            w := TRUNC(w / 16);  
            x := CONCAT(y,x);   -- build x string backwards  
        END LOOP;  
        RETURN x;  
    END;  
/  
  
  
DECLARE  
  
        table_name         VARCHAR2(30):= 'dept';  
   user_name          VARCHAR2(30):= 'scott';      
  
        file_id            NUMBER;  
        block_id           NUMBER;  
        extentid           number;  
    blocks             NUMBER;  
        file_id_hex        VARCHAR2(4);  
        block_id_hex       VARCHAR2(8);  
        row_count_hex      VARCHAR2(4);  
        row_count          NUMBER;  
        block_counter      NUMBER;  
        for_loop_counter   NUMBER;  
        corrupt_block_flag NUMBER:=0;  
        check_file NUMBER;  
        check_block        NUMBER;  
        row_id_hex         CHAR(18);  
        invalid_rowid      EXCEPTION;  
        rows_per_block     NUMBER:=100;  
  
    CURSOR c1 IS SELECT file_id, block_id, blocks,extent_id  FROM dba_extents  
       WHERE SEGMENT_NAME = UPPER(table_name) AND OWNER = UPPER(user_name)  
    ORDER BY EXTENT_ID;  
            
    CURSOR c2 IS SELECT file_id,block_id FROM corrupt_block;  
      
    PROCEDURE ins (v_rowid VARCHAR2) is  
      bad_rowid  EXCEPTION;  
      PRAGMA EXCEPTION_INIT (bad_rowid, -01410);  
      BEGIN  
   INSERT INTO salvaged_rows_table  
                SELECT *  
                FROM SCOTT.DEPT  
                WHERE rowid = v_rowid;  
  
      EXCEPTION  
         WHEN bad_rowid THEN  
         NULL;  
    END ins;  
  
BEGIN  
  OPEN c1;  
  LOOP   
    FETCH c1 INTO file_id, block_id, blocks,extentid;  
    EXIT WHEN c1%NOTFOUND;  
    dbms_output.put_line('extent: file:'||to_char(file_id)||'  
block_id:'||to_char(block_id)||' blocks:'||to_char(blocks));  
    file_id_hex := DECTOHEX(file_id);  
    block_counter:= block_id;  
    for_loop_counter:= block_id + blocks-1;  
    FOR i IN block_id..for_loop_counter LOOP  
      corrupt_block_flag:= 0;  
      OPEN c2;  
        LOOP  
          FETCH c2 INTO check_file, check_block;  
          EXIT WHEN c2%NOTFOUND;  
          IF (check_block = block_counter AND check_file = file_id) THEN  
             corrupt_block_flag :=1;  
               dbms_output.put_line('....Skipping corrupt  
block:'||to_char(block_counter)|| ' file id:'||to_char(file_id));  
          END IF;  
        END LOOP;    
      CLOSE c2;  
      block_id_hex := dectohex(block_counter);  
      IF corrupt_block_flag = 0 THEN  
        FOR row_count IN 0..rows_per_block LOOP  
            IF row_count = 0 THEN  
               row_count_hex := '0';  
            ELSE  
               row_count_hex:=dectohex(row_count);    
      END IF;  
            row_id_hex:=LPAD(block_id_hex,8,0) ||'.'||  
    LPAD(row_count_hex,4,0) ||'.'|| LPAD(file_id_hex,4,0);  
            ins(row_id_hex);  
        END LOOP;  
      END IF;  
      block_counter := block_counter+1;  
      COMMIT;  
    END LOOP;       
  END LOOP;  
CLOSE c1;  
COMMIT;  
END;  
/
SALVAGE.SQL – PL/SQL Code to Extract Data from a Corrupt Table

NOLOGGING 操作引起的坏块 – 错误解释和解决方案

如果数据段定义为 NOLOGGING 属性,当 NOLOGGING/UNRECOVERABLE 操作修改该数据段或者使用datapump import 参数 disable_archive_logging:y,联机重做日志只记录很少的日志信息,如果之后执行 RECOVERY 操作的话,会导致这些块无效。

如果这些联机重做日志/归档日志被用来恢复数据文件,那么 Oracle 会将对应的数据块标志为无效,而且下一次访问这些数据块时,会报 ORA-1578 和 ORA-26040错误。

例如:

SQL> select * from test_nologging;

ORA-01578: ORACLE data block corrupted (file # 11, block # 84)
ORA-01110: data file 4: ‘/oradata/users.dbf’
ORA-26040: Data block was loaded using the NOLOGGING option

以下数据字典视图中的 LOGGING 列记录了 NOLOGGING 属性:

DBA_TABLES, DBA_INDEXES, DBA_LOBS, DBA_TAB_PARTITIONS, DBA_LOB_PARTITIONS, DBA_TAB_SUBPARTITIONS, 等等。

LOGGING=’NO’ 表示 NOLOGGING。

接下来,这些数据块会被标志为 Soft Corrupt,当下一次访问该数据块时,会报 ORA-1578 和 ORA-26040错误。

DATAPUMP 参数 DISABLE_ARCHIVE_LOGGING

DATAPUMP impdp 参数 DISABLE_ARCHIVE_LOGGING:Y 在import时禁止 LOGGING 定义,会产生NOLOGGING操作; 如果相应的datafile被restored 和 recovered, 那么接下来的语句会报错 ORA-1578 和 ORA-26040.

“如果database是 FORCE LOGGING 模式, 那么DISABLE_ARCHIVE_LOGGING 选项不会关闭logging.

import时使用这个参数的例子:

 

impdp scott/tiger directory=DATA_PUMP_DIR dumpfile=dp transform=disable_archive_logging:y

 

DBV 检测坏块时,如果 RDBMS 版本小于 10.2.0.4,那么 DBV 打印错误 DBV-200,如果 RDBMS 版本大于或等于 10.2.0.4,那么 DBV 打印错误 DBV-201  ( Note  5031712.8 ):

DBV-00200: Block, dba 46137428, already marked corrupted
DBV-00201: Block, DBA 46137428, marked corrupt for invalid redo application

“VALIDATE” RMAN 命令用来检测NOLOGGING数据块,检查结果记录在 view v$database_block_corruption (versions lower than 12c) 和 v$nonlogged_block (12c and greater).  下面的例子中检查出datafile 4 有 933 坏块,查询v$database_block_corruption 或者 v$nonlogged_block。

RMAN> VALIDATE DATABASE;

…..
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
—- —— ————– ———— ————— ———-
4    OK     933            1            6401            2275124
File Name: /oracle/dbs/users.dbf

RMAN 检测坏块时,如果 RDBMS 版本小于 10.2.0.5 和 11.1.0.7,RMAN 打印如下错误:

10.2.0.4 and lower, 11.1.0.6, 11.1.0.7:

RMAN validate reports it in v$database_block_corruption with CORRUPTION_TYPE=LOGICAL

如果 RDBMS 版本大于或等于 10.2.0.5 和 11.2.0.1,RMAN 报告:查看视图 v$database_block_corruption 中 CORRUPTION_TYPE=NOLOGGING 的记录。参考 Note 7396077.8 :

10.2.0.5 and 11.2.0.1+:

RMAN validate reports it in v$database_block_corruption with CORRUPTION_TYPE=NOLOGGING

 

在12c及以后版本中,RMAN validate的结果不在视图v$database_block_corruption中,而是在视图v$nonlogged_block:

12c:

RMAN validate的结果显示在视图v$nonlogged_block

 

在12.2 版本, 可以使用新的命令:”validate .. nonlogged block” 去验证nologging的block。 在以下的例子中,数据文件5,6有nologged的block:

RMAN> validate database nonlogged block;

Starting validate at …
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=133 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: validation complete, elapsed time: 00:00:35

List of Datafiles
=================
File Status Nonlogged Blocks Blocks Examined Blocks Skipped
—- —— —————- ————— ————–
1 OK 0 106363 0
2 OK 0 78919 0
3 OK 0 96639 0
4 OK 0 4991 0
5 OK 400 2559 0
6 OK 569 2559 0

Details of nonlogged blocks can be queried from v$nonlogged_block view

Alert log中会更新以下信息:

Started Nonlogged Block Replacement recovery(validate) on file 5 (ospid 26351 rcvid 10616970560844821494)
Finished Nonlogged Block Replacement recovery(validate) on file 5. 400 blocks found

Started Nonlogged Block Replacement recovery(validate) on file 6 (ospid 26351 rcvid 10616970560844821494)
Finished Nonlogged Block Replacement recovery(validate) on file 6. 569 blocks found

NOLOGGING 导致的坏块不会导致 RMAN 备份失败,一般来说 soft corrupt block 不会导致 RMAN 备份失败,不需要设置 MAXCORRUPT。数据库备份中就会含有 soft corrupt block,如果使用这些备份恢复数据,那么恢复的数据也含有 soft corrupt block。

除 ORA-26040 错误之外,当还有一些其他通用信息出现时,block dump 可 能会被产生,如果数据块的 block dump 内有 byte 0xff 信息或者属于某个段,ORA-1578/ORA-26040会因为介质恢复了NOLOGGING的部分导致了corruption而出现。

 

监控 NOLOGGING 操作

执行NOLOGGING操作,并且之后没有备份的情况下,RMAN 命令 “REPORT UNRECOVERABLE” 可以查询出被影响的datafile。

RMAN> report unrecoverable;

using target database control file instead of recovery catalog
Report of files that need backup due to unrecoverable operations
File Type of Backup Required Name
—- ———————– ———————————–
4    full or incremental     /oracle/dbs/users.dbf

当初始化参数db_unrecoverable_scn_tracking设置为true (默认值),那么V$DATAFILE 中以下列会被更新;
db_unrecoverable_scn_tracking参数在10g中是不可用的。请参考Oracle Database在线文档中关于V$DATAFILE中以下列信息:

UNRECOVERABLE_CHANGE#
UNRECOVERABLE_TIME
FIRST_NONLOGGED_SCN
FIRST_NONLOGGED_TIME

在11.2.0.4 or 12.1.0.2+版本中,设置event 16490的情况下,物理备库的MRP进程会检查出NOLOGGING变化,并记录在alert log。

ORA-16490 “logging invalidated blocks on standby due to invalidation redo”

“INVD_BLKS: Invalidating (file <file number>, bno <block number>)”
“fname: ‘Datafile name’. rdba: …”

识别数据块什么时候被标志为NOLOGGING

识别数据块什么时候被标志为NOLOGGING,可以将trace文件中数据块SCN或者v$database_block_coruption视图中CORRUPTION_CHANGE#值
转换为时间:

使用trace文件中数据块SCN
例如:

  Start dump data blocks tsn: 60 file#: 4 minblk 84 maxblk 84
buffer tsn: 3 rdba: 0x02c00054 (11/84)
scn: 0x0771.4fa24eb5 seq: 0xff flg: 0x04 tail: 0x4eb500ff

提取SCN值 0x0771.4fa24eb5, 删除 ‘.’ ,然后转换 0x07714fa24eb 到十进制 511453045995

 

使用v$database_block_coruption视图中CORRUPTION_CHANGE#值

如果运行RMAN validate命令后,v$database_block_coruption视图中corruption_type=’NOLOGGING’ (10.2.0.5 和 11.2.0.1+),
那么CORRUPTION_CHANGE#列的值就是十进制的SCN值。

获得SCN Timestamp

使用下面的方法获得时间:

  

select scn_to_timestamp(&&decimal_scn)
from dual;

如果运行RMAN validate:

select file#, block#, scn_to_timestamp(CORRUPTION_CHANGE#)
from v$database_block_corruption
where CORRUPTION_TYPE=’NOLOGGING’;

In 12c:

select file#, block#, scn_to_timestamp(NONLOGGED_START_CHANGE#)
from v$nonlogged_block;

 

如果查询gv$archived_log 或 gv$log_history遇到错误ORA-08181:

alter session set nls_date_format = ‘DD-MON-YY HH24:MI:SS’;

select first_time, next_time
from gv$archived_log
where &decimal_scn between first_change# and next_change#;

select first_time
from gv$log_history
where &decimal_scn between first_change# and next_change#;

如果运行RMAN validate:

alter session set nls_date_format = ‘DD-MON-YY HH24:MI:SS’;

select file#, block#, first_time, next_time
from v$archived_log, v$database_block_corruption
where CORRUPTION_CHANGE# between first_change# and next_change#
and CORRUPTION_TYPE=’NOLOGGING’;

select file#,block#,first_time
from   v$log_history, v$database_block_corruption
where  CORRUPTION_CHANGE# between first_change# and next_change#
and CORRUPTION_TYPE=’NOLOGGING’;

12c:

alter session set nls_date_format = ‘DD-MON-YY HH24:MI:SS’;

select file#, block#, first_time, next_time
from v$nonlogged_block, v$archived_log
where NONLOGGED_START_CHANGE# between first_change# and next_change#;

OR

select file#, block#, first_time
from v$nonlogged_block, v$log_history
where NONLOGGED_START_CHANGE# between first_change# and next_change#;

 

SYSAUX表空间/AWR,EM 等出现NOARCHIVELOG 和 NOLOGGING 问题

 

如果数据库版本是11.1.0.6 或 11.1.0.7 或 11.2.0.1,
对NOLOGGING对象执行过DIRECT PATH操作,并且后续执行了RECOVER DATABASE命令,及时数据库FORCE LOGGING是打开的情况下,
会出现ORA-1578 and ORA-26040错误。

 

这种问题经常发生在SYSAUX表空间中的AWR或EM对象

 

解决方法

NOLOGGING 操作引起的坏块是不能修复的,比如”Media Recovery” 或 “RMAN blockrecover”都无法修复这种坏块。可行的方法是在 NOLOGGING 操作之后立刻备份对应的数据文件。

问题是在执行RMAN DUPLICATE或RESTORE之后产生?

如果问题是执行RMAN DUPLICATE 或 RESTORE之后 ,那么在源库打开FORCE LOGGING,然后再重新运行RMAN DUPLICATE 或 RESTORE。

  • alter database force logging;

问题是发生在物理standby库?

  • 如果错误出现在物理 STANDBY 数据库, 从主库恢复被影响的数据文件 (只有当主库没有这个问题的情况下)。 在12c中可以使用RMAN选项RECOVER NONLOGGED BLOCK with DATAFILE,TABLESPACE,DATABASE。例如:
  •   
    RMAN> RECOVER DATABASE NONLOGGED BLOCK;
  • 为了避免这个问题发生,在主库强制生产日志:
  • alter database force logging;

如果同一个datafile的数据块在主库出现nologging 坏块,但是备库没有,可以通过手动跳过(dbms_repair)坏块 或者设置event 10231.
主库出现nologging 坏块可能是由于主库执行过备份恢复或者之前是备库,执行了switchover

识别被影响的Segment

找到坏块所在的对象:

  • 如果NOLOGGING数据块位于空闲数据块(dba_free_space视图可以查询到),DBVerify检查会发现这个问题,报错DBV-00201
    或者在v$database_block_corruption视图中显示.对于这种情况,我们可以等待到这个数据块被重用时,会自动格式化,或者
    手动强制格式化
  • 如果是索引,重新创建(drop/create)索引。
  • 如果是,使用 procedure DBMS_REPAIR.SKIP_CORRUPT_BLOCKS 跳过坏块。然后考虑是否重建表:移动table: alter table &table_name move;或者保存数据 (export, Create Table as Select, etc) 然后truncate 或 drop/create.
  • 如果是LOB 另行处理

如果删除有坏块的段之后,这个坏块就处于空闲状态,后续可以被分配给其他对象/段,当这个坏块被分配给他对象/段时,
这个数据块被重新格式化。 如果v$database_block_corruption视图中还是显示为坏块,那么手动运行rman validate来清除视图中的信息。

OBJ$表上索引有坏块导致部分业务失效的修复

这种问题的解决办法有好几种,根据数据量的大小以及业务是否允许停机的情况考虑,主要分以下几类救急办法:

一、坏块在索引上的位置判断

1.1如果坏块不多,而且坏块都在在索引比较靠前的位置(这个靠前的位置每个版本不一样,每个版本的对象数不一样),可以理解为在建库时候obj$自带索引的那个块最大地址就是属于考前的位置,当坏块在这个位置前面时候有一种比较便捷的处理方式,

就是通过创建一个一模一样的数据库,使用bbed copy(相比DD操作,操作起来超级简单又不容易出错)把对应坏块位置的块copy覆盖损坏的块,apply后即可(如果是在obj$表中的块那么记得修改下obj里对应的create time即可使数据库恢复正常,这个在前面有个案例描述过)。

 

1.2如果坏块是在索引靠后的位置或者坏块很多,索引的对象为后面创建的对象,那么这个时候就比较麻烦了,好在obj$的数据是完整的,

我的建议是采用2种方法,

1.2.1  采用屏蔽obj$索引的方法,具体需要修改基表,这个方法在前面文章中有描述。这个情况一般用在数据库比较大,比如数据量有几十个T,上百T或者业务不能接受长时间停机的情况

1.2.2 当数据库的数据量不大的时候,可以采用另外一种办法,使用exp来导出数据进行修复,一般这个时候exp工具是无法直接使用的,需要使用到catexp.sql这个脚本,具体的方法请查阅博客里的文章 Corruptions on OBJ$ indexes

 

Oracle bootstrap$ 说明

What is bootstrap?

Bootstrap is a technique for loading the first few instructions of a computer program into active memory and then using them to bring in the rest of the program.

What is bootstrap in Oracle ?

In Oracle, Bootstrap refers to loading of metadata (data dictionary) before we OPEN the database.Bootstrap objects are classified as the objects (tables / indexes / clusters) with the object_id below 56 as bootstrap objects.  These objects are mandatory to bring up an instance, as this contains the most important metadata of the database.

What happens on database startup?

This shall be explained by setting the SQL_TRACE while opening the database.Connect as sysdba and do the following
SQL> startup mount ;
SQL> alter session set events ‘10046 trace name context forever, level 12 ‘ ;
SQL> alter database open ;
SQL>  alter session set events ‘10046 trace name context off ‘ ;
SQL> ORADEBUG SETMYPID
SQL> ORADEBUG TRACEFILE_NAME
The sql_trace of the above process explains the following operations behind startup. The bootstrap operation happens between MOUNT stage and OPEN stage.
1.)  The first SQL after in the above trace shows the creation of the bootstrap$ table. Something similar to the following:
create table bootstrap$ ( line# number not null, obj# number not null, sql_text varchar2(4000) not null) storage (initial 50K objno 56 extents (file 1 block 377))
This sys.bootstrap$ table contains the DDL’s for other bootstrap tables (object_id below 56). Actually these tables were created internally by the time of database creation (by sql.bsq), The create DDL passed between MOUNT and OPEN stage will be executed through different driver routines. In simple words these are not standard CREATE DDLs.
While starting up the database oracle will load these objects into memory (shared_pool), (ie) it will assign the relevant object number and refer to the datafile and the block associated with that. And such operations happen only while warm startup.
 The internals of the above explained in ‘kqlb.c’.
2.)  Now a query executed against the sys.bootstrap$ table, which holds the create sql’s for other base tables.
select line#, sql_text from bootstrap$ where obj# != :1 (56)
Subsequently it will create those objects by running those queries.
Object number 0 – (System Rollback Segment)
Object number 2 to 55 (Other base tables)
Object number 1 is NOT used by any of the objects.
3.) Performs various operations to keep the bootstrap objects in consistent state.
Upon the successful completion of bootstrap the database will do the other tasks like recovery and will open the database.

Which objects are classified as bootstrap objects in oracle database?

Objects with data_object_id less than 56 are classified as core bootstrap objects.The objects are added to the bootstrap. The objects affected are :

hist_head$
histgrm$
i_hh_obj#_col#
i_hh_obj#_intcol#
i_obj#_intcol#
i_h_obj#_col#
c_obj#_intcol#
From 10.1 the following objects have been added:
fixed_obj$
tab_stats$
ind_stats$
i_fixed_obj$_obj#
i_tab_stats$_obj#
i_ind_stats$_obj#
object_usage
These additional objects shall be re-classified (or) ignored by following methods.
1. Opening the database in migrate mode
2. Using event 38003
Event 38003 affects the bootstrap process of loading the fixed cache in  kqlblfc(). Per default certain objects are marked as bootstrap objects (even though they are not defined as such in sys.bootstrap$) but by setting the event they will be left as non-bootstrapped.

What is bootstrap process failure? or  ORA-00704

This ORA-00704 error SERIOUS if reported at startup. This error refers to some problem during bootstrap operation. Any ORA-00704 error on STARTUP / RECOVER is serious, this error normally rose due to some inconsistency with the bootstrap segments (or) data corruption on bootstrap$ (or) any of the base tables below object_id  56. After this error it might not allow to open that database.

When ORA-00704 shall occur?

1. There is a probable of this error when any unsupported operations are tried to force open the database.
2. This error can also occur when system datafile has corrupted blocks. (ORA-01578)
3. In earlier releases of oracle (prior to 7.3.4 and 8.0.3) this issue shall arise due to Bug 434596
The option is to restore it from a good backup and recover it.
-> If the underlying cause is physical corruption that is due to hardware problems then do complete recovery.
-> If the issue is not relating to any physical corruption, then the problem could be due some unsupported actions on Bootstrap, and a Point In Time Recovery would be an option in such cas.