Exception Recovery – Page 3 – Ludatou's data life

10g undo tablespace was dropped: a case of recovering an accidentally dropped undo tablespace.

Oracle, Oracle Recovery, Exception Recovery, Data Technology | Guang Cai Li | 2015-10-23

11g undo segment header corrupted: a case of recovering a corrupted undo segment header.

Oracle Recovery, Exception Recovery, Data Technology | Guang Cai Li | 2015-10-23

A Brief Discussion of Oracle Database Corrupt Blocks (Database corruption part 2)

How corrupt blocks are structured, how to detect them, and which Oracle files they can occur in.

Oracle Recovery, Exception Recovery, Data Technology, Brief Discussions of Oracle Technology series | Guang Cai Li | 2015-09-28 | corruption, corrupt blocks

A Brief Discussion of Oracle Database Corrupt Blocks (Database corruption part 1)

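I have run into corrupt-block failures quite often over my years in this field. Sometimes the data can be rescued, sometimes it is simply lost, and in the worst cases the database itself is a write-off. These failures are not always hard to deal with, but whenever one turns up I cannot help saying a quiet prayer.

An Oracle corrupt block is, as the name suggests, a case where the contents of the storage medium holding the Oracle database have become garbled or cannot be accessed. Oracle has its own definition of a corrupt block; see [Note 840978.1].

There are many kinds of corrupt blocks. They can appear, for various reasons, in several types of Oracle database files, so the errors reported differ and so do the ways of handling them. I promised long ago to organize this material and publish it on ludatou.com, and Chen Hui and I are currently developing the corrupt-block handling in Mdata, so I am taking the opportunity to bring these articles to the blog. Most of them will be original, and I will also repost good articles by others. This series focuses on methods and principles. Separately, I will post past corrupt-block cases as log-style entries, which are not part of the series.

This series mainly covers the following parts:

This post covers the classification of corrupt blocks and the main reasons they occur.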

————————————————

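1. Main categories of corrupt blocks. The explanation below comes from Oracle's official documentation and makes the distinction between physical and logical corruption clearer (Note 840978.1 has a more detailed description):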

For purposes of the paper we will categorize corruption under three general areas and give best practices for prevention, detection and repair for each:

Physical or structural corruption can be defined as damage to internal data structures which do not allow Oracle software to find user data within the database. Logical corruption involves Oracle being able to find the data, but the data values are incorrect as far as the end user is concerned.
Physical corruption due to hardware or software can occur in two general places: in memory (including various IO buffers and the Oracle buffer cache) or on disk. Operator error such as overwriting a file can also be defined as a physical corruption. Logical corruption on the other hand is usually due to end-user error or non-robust(?) application design. A small physical corruption such as a single bit flip may be mistaken for a logical error.
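
To make the physical/logical distinction concrete, here is a minimal Python sketch. It is a toy model, not Oracle's block format: the 8192-byte block, the CRC32 stored in the first four bytes, and the helper names `make_block`, `is_physically_intact` and `flip_bit` are all assumptions made up for illustration. It shows that a structural check catches a single flipped bit, but has nothing to say about a well-formed block that simply holds the wrong value.

```python
import zlib

BLOCK_SIZE = 8192  # assumed toy block size, not tied to any real database


def make_block(payload: bytes) -> bytes:
    """Build a toy block: 4-byte CRC32 of the body, then the zero-padded body."""
    body = payload.ljust(BLOCK_SIZE - 4, b"\x00")
    return zlib.crc32(body).to_bytes(4, "big") + body


def is_physically_intact(block: bytes) -> bool:
    """Return True when the stored checksum matches the body."""
    stored = int.from_bytes(block[:4], "big")
    return stored == zlib.crc32(block[4:])


def flip_bit(block: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of the block with one bit inverted."""
    damaged = bytearray(block)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


good = make_block(b"balance=100")
flipped = flip_bit(good, 10, 0)            # physical damage: one bit flipped
wrong_value = make_block(b"balance=999")   # logical damage: valid block, wrong data

print(is_physically_intact(good))         # True
print(is_physically_intact(flipped))      # False
print(is_physically_intact(wrong_value))  # True: the check cannot see a wrong value
```

The last line is the point of the sketch: a logically wrong block passes every structural test, so it has to be caught by comparing the data against something else, such as an index or the application's own rules.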

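2. Main causes of corrupt blocks
Corrupt blocks have many causes. Based on reference material and my own past incidents, the main ones are listed here.

2.2 Operating system bugs
Oracle processes read and write data blocks through operating system kernel calls (system calls), so a defect in how the operating system handles those calls can cause an Oracle process to write invalid content.

2.3 Operating system I/O errors or buffering problems
Examples include lost writes, an I/O cache losing power, and I/O operations whose results are truncated.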
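
The three I/O failure modes just mentioned behave quite differently, which a small Python sketch can show. This is a toy model under stated assumptions: the block layout (a 4-byte version stamp repeated at the head and the tail) and the names `build_block` and `is_fractured` are hypothetical and only loosely inspired by the idea of matching a block's header against its tail. It is not how any particular database implements the check.

```python
BLOCK_SIZE = 8192          # assumed toy block size
HALF = BLOCK_SIZE // 2


def build_block(version: int, fill: bytes) -> bytes:
    """Toy block: 4-byte version stamp at the head and the same stamp at the tail."""
    stamp = version.to_bytes(4, "big")
    return stamp + fill * (BLOCK_SIZE - 8) + stamp


def is_fractured(block: bytes) -> bool:
    """A block is fractured if it is short or its head and tail stamps disagree."""
    return len(block) != BLOCK_SIZE or block[:4] != block[-4:]


old = build_block(1, b"a")   # version already on disk
new = build_block(2, b"b")   # version the writer intended to store

torn = new[:HALF] + old[HALF:]   # power lost midway: only the first half landed
truncated = new[:HALF]           # I/O result cut short
stale = old                      # lost write: the new version never reached disk

print(is_fractured(torn))       # True
print(is_fractured(truncated))  # True
print(is_fractured(stale))      # False: a lost write leaves a valid-looking old block
```

A lost write is the awkward case: the block on disk is internally consistent, just out of date, so detecting it needs an outside reference for the version that should be there.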

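2.4 Memory or paging problems
Oracle software has a fair number of bugs that can cause corrupt blocks.

2.5 Non-Oracle processes disturbing Oracle's shared memory region
As noted above, once a data block's contents have been read into the host's physical memory, a non-Oracle process that disturbs the shared memory region used by Oracle can leave the block garbled when it is written back to disk.

2.6 Abnormal shutdown, power loss, or terminated services
An abnormal shutdown, a power loss, or a terminated service ends processes abnormally, which can also produce corrupt blocks.

2.7 Human error in database operations
Running a recovery after operations that used the NOLOGGING option can leave corrupt blocks, as can someone damaging the files by hand.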

Oracle Recovery, Exception Recovery, Data Technology, Brief Discussions of Oracle Technology series | Guang Cai Li | 2015-09-27 | corruption, corrupt blocks

Common Oracle corrupt-block errors

ORA-1578	ORA-1578 is reported when a block is thought to be corrupt on read.
ORA-1410	This error is raised when an operation refers to a ROWID in a table for which there is no such row. The reference to a ROWID may be implicit from a WHERE CURRENT OF clause or directly from a WHERE ROWID=… clause. ORA 1410 indicates the ROWID is for a BLOCK that is not part of this table.
ORA-8103	The object has been deleted by another user since the operation began. If the error is reproducible, the following may be the reasons: a.) The header block has an invalid block type. b.) The data_object_id (seg/obj) stored in the block is different than the data_object_id stored in the segment header. See dba_objects.data_object_id and compare it to the decimal value stored in the block (field seg/obj).
ORA-8102	An ORA-08102 indicates that there is a mismatch between the key(s) stored in the index and the values stored in the table. What typically happens is the index is built and at some future time, some type of corruption occurs, either in the table or index, to cause the mismatch.
ORA-1498	Generally this is a result of an ANALYZE … VALIDATE … command. This error generally manifests itself when there is inconsistency in the data/index block. Some of the block check errors that may be found: a.) Row locked by a non-existent transaction b.) The amount of space used is not equal to block size c.) Transaction header lock count mismatch. While support is processing the tracefile, it may be worth re-running the ANALYZE after restarting the database to help show if the corruption is consistent or if it ‘moves’. Send the tracefile to support for analysis. If the ANALYZE was against an index you should check the whole object. E.g., find the table name and execute: ANALYZE TABLE xxx VALIDATE STRUCTURE CASCADE;
ORA-1499	An error occurred when validating an index or a table using the ANALYZE command. One or more entries do not point to the appropriate cross-reference.
ORA-26040	Trying to access data in a block that was loaded without redo generation using the NOLOGGING/UNRECOVERABLE option. This error is always raised together with ORA-1578.
ORA-600 [12700]	Oracle is trying to access a row using its ROWID, which has been obtained from an index. A mismatch was found between the index rowid and the data block it is pointing to. The rowid points to a non-existent row in the data block. The corruption can be in data and/or index blocks. ORA-600 [12700] can also be reported due to a consistent read (CR) problem.
ORA-600 [3020]	This is called a ‘STUCK RECOVERY’. There is an inconsistency between the information stored in the redo and the information stored in a database block being recovered.
ORA-600 [4194]	A mismatch has been detected between Redo records and rollback (Undo) records. We are validating the Undo record number relating to the change being applied against the maximum undo record number recorded in the undo block. This error is reported when the validation fails.
ORA-600 [4193]	A mismatch has been detected between Redo records and Rollback (Undo) records. We are validating the Undo block sequence number in the undo block against the Redo block sequence number relating to the change being applied. This error is reported when this validation fails.
ORA-600 [4137]	While backing out an undo record (i.e. at the time of rollback) we found a transaction id mis-match indicating either a corruption in the rollback segment or corruption in an object which the rollback segment is trying to apply undo records on. This would indicate a corrupted rollback segment.
ORA-600 [6101]	Not enough free space was found when inserting a row into an index leaf block during the application of undo.
ORA-600 [2130]	Oracle is attempting to read or update a generic entry in the control file. If the entry number is invalid, ORA-600 [2130] is logged.
ORA-600 [4512]	Oracle is checking the status of transaction locks within a block. If the lock number is greater than the number of lock entries, ORA-600 [4512] is reported followed by a stack trace, process state and block dump. This error possibly indicates a block corruption.
ORA-600 [2662]	A data block SCN is ahead of the current SCN. The ORA-600 [2662] occurs when an SCN is compared to the dependent SCN stored in a UGA variable. If the SCN is less than the dependent SCN then we signal the ORA-600 [2662] internal error.
ORA-600 [4097]	We are accessing a rollback segment header to see if a transaction has been committed. However, the xid given is in the future of the transaction table. This could be due to a rollback segment corruption issue OR you might be hitting the following known problem.
ORA-600 [4000]	It means that Oracle has tried to find an undo segment number in the dictionary cache and failed.
ORA-600 [6006]	Oracle is undoing an index leaf key operation. If the key is not found, ORA-00600 [6006] is logged. ORA-600[6006] is usually caused by a media corruption problem related to either a lost write to disk or a corruption on disk.
ORA-600 [4552]	This assertion is raised because we are trying to unlock the rows in a block, but receive an incorrect block type. The second argument is the block type received.
ORA-600 [6856]	Oracle is checking that the row slot we are about to free is not already on the free list. This internal error is raised when this check fails.
ORA-600 [13011]	During a delete operation we are deleting from a view via an instead-of trigger or an Index organized table and have exceeded a 5000 pass count when we raise this exception.
ORA-600 [13013]	During the execution of an UPDATE statement, after several attempts (Arg [a] passcount) we are unable to get a stable set of rows that conform to the WHERE clause.
ORA-600 [13030]
ORA-600 [25012]	We are trying to generate the absolute file number given a tablespace number and relative file number and cannot find a matching file number or the file number is zero.
ORA-600 [25026]	Looking up/checking a tablespace invalid tablespace ID and/or rdba found
ORA-600 [25027]	Invalid tsn and/or rfn found
ORA-600 [kcbz_check_objd_typ_3]	An object block buffer in memory is checked and is found to have the wrong object id. This is most likely due to corruption.
ORA-600 [kddummy_blkchk] & ORA-600 [kdblkcheckerror]	ORA-600 [kddummy_blkchk] is for 10.1/10.2 and ORA-600 [kdblkcheckerror] for 11 onwards.
ORA-600 [ktadrprc-1]
ORA-600 [ktsircinfo_num1]	This exception occurs when there are problems obtaining the row cache information correctly from sys.seg$. In most cases there is no information in sys.seg$.
ORA-600 [qertbfetchbyrowid]
ORA-600 [ktbdchk1-bad dscn]	This exception is raised when we are performing a sanity check on the dependent SCN and fail. The dependent scn is greater than the current scn.
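
One practical use of the list above is to scan log text for these codes. The Python sketch below is a rough illustration only: the function name `corruption_hits`, the two lookup sets (which hold just a subset of the table), and the sample lines are assumptions of mine, and the regular expressions assume messages shaped like `ORA-01578: ...` and `ORA-00600: internal error code, arguments: [4194], ...`. Adjust them to whatever your logs actually contain.

```python
import re

# Subset of the codes from the table above; extend as needed.
CORRUPTION_ORA_CODES = {1578, 1410, 8103, 8102, 1498, 1499, 26040}
CORRUPTION_ORA600_ARGS = {
    "12700", "3020", "4194", "4193", "4137",
    "kddummy_blkchk", "kdblkcheckerror",
}

ORA_PATTERN = re.compile(r"ORA-0*(\d+)")
# First bracketed argument that follows an ORA-600 / ORA-00600 code.
FIRST_ARG_PATTERN = re.compile(r"ORA-0*600\b[^\[]*\[([^\]]+)\]")


def corruption_hits(lines):
    """Return the corruption-related error labels found in the given lines."""
    hits = []
    for line in lines:
        internal = FIRST_ARG_PATTERN.search(line)
        if internal:
            first_arg = internal.group(1).strip()
            if first_arg in CORRUPTION_ORA600_ARGS:
                hits.append(f"ORA-600 [{first_arg}]")
            continue
        for code in ORA_PATTERN.findall(line):
            if int(code) in CORRUPTION_ORA_CODES:
                hits.append(f"ORA-{int(code)}")
    return hits


# Illustrative sample lines, not taken from a real system.
sample = [
    "ORA-01578: ORACLE data block corrupted (file # 4, block # 131)",
    "ORA-00600: internal error code, arguments: [4194], [18], [9], [], [], [], [], []",
    "ORA-00942: table or view does not exist",
]
print(corruption_hits(sample))  # ['ORA-1578', 'ORA-600 [4194]']
```

A match only says that a line mentions one of the listed codes. Several of them, such as ORA-1410 or ORA-8103, can also have causes other than corruption, so each hit still needs to be checked against the descriptions above.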

Oracle Recovery, Exception Recovery, Data Technology | Guang Cai Li | 2015-09-26 | corruption, corrupt blocks
