
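Finding the object being rolled back from its undo records

Either of the following queries shows the transactions currently being rolled back: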

select * from v$fast_start_transactions;

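or

select * from x$ktuxe where KTUXECFL='DEAD' AND KTUXESTA!='INACTIVE';

These queries give us the transaction's undo segment number (USN or KTUXEUSN), undo slot (SLT or KTUXESLT), and undo sequence (SEQ or KTUXESQN).
With the USN we can look up the name of the undo segment:

select * from v$rollname where usn=xxx;

With the segment name and the transaction ID (USN, SLT, SEQ), we can dump the undo blocks: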

alter system dump undo block "<undo segment name>" XID <USN> <SLT> <SEQ>;

alter system dump undo block "_SYSSMU33$" XID 33 56 7463;

The resulting trace file then shows the object number (objn) and the data object number (objd):

grep objn xxx.trc

* Rec #0x45 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)

* Rec #0x44 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)

* Rec #0x43 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)

* Rec #0x42 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)

* Rec #0x41 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)

* Rec #0x40 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)

* Rec #0x3f slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)

...

With the object number, we can now look up the object being rolled back in the database:

select * from dba_objects where object_id=223312;
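The grep step above can also be scripted when a dump contains many records. The following is a minimal Python sketch, not part of the original procedure: it assumes the record header lines look exactly like the sample output above, and the function name `count_objects` is made up for illustration.

```python
import re

# Matches the objn/objd part of an undo record header line, e.g.
# "* Rec #0x45 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)"
OBJ_FIELDS = re.compile(r"objn:(\d+)\(0x[0-9a-fA-F]+\)\s+objd:(\d+)")


def count_objects(trace_lines):
    """Count undo records per (objn, objd) pair found in the dump lines."""
    counts = {}
    for line in trace_lines:
        match = OBJ_FIELDS.search(line)
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            counts[key] = counts.get(key, 0) + 1
    return counts


# Sample lines copied from the trace output shown above.
sample = [
    "* Rec #0x45 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)",
    "* Rec #0x44 slt:0x1f objn:223312(0x00092c2a) objd:223312 tblspc:20(0x00000014)",
    "some unrelated trace line",
]

for (objn, objd), n in count_objects(sample).items():
    print(f"objn={objn} objd={objd} undo records={n}")
```

Each distinct objn printed by the sketch can then be looked up in dba_objects as shown above.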

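Initial considerations for basic auditing in 11g databases

11g has been around for some years now. Most systems have been migrated or upgraded to 11g, and most systems that have gone live since then run on 11g. From 11g onward, basic database auditing differs from earlier releases: it is enabled by default. Basic auditing rarely gets much attention, partly because it is simple to configure, and partly because the limited emphasis on data security in China means many DBAs have done little at the database security layer. As a result, basic auditing is easily overlooked when a database is first built.

With auditing enabled by default, there are two main effects:

    1. Audit records accumulate over time and take up considerable space, and the audit trail table is stored in the SYSTEM tablespace by default.
    2. Once the audit trail has grown large, it can affect the performance of some types of applications.

When first building an 11g instance, auditing should not simply be turned off by reflex. The main considerations are whether it needs to be enabled and, if it is, how the audit records will be managed. I have run into this requirement before, and the issue was managing the audit records once auditing was on. Turning auditing off is easy, since a single command does it; managing the records after enabling it is the harder part. See the following two articles on moving the audit trail table to another tablespace and setting up a regular purge policy.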

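Thoughts on reading data directly from ASM

Reading data directly from ASM splits into two cases:

1. Reading a datafile directly
2. Reading the objects inside a datafile directly

How reading a datafile directly works (both MDATA and AMDU currently implement this)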

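Start with the first case, reading a datafile directly from ASM, which is the simpler one. On a file system, a datafile is organized in 8K blocks, so once an object's block address has been read from the data dictionary, the block can be read straight from the file. Recovering objects from a file system with tools such as mdata or dul is complex to implement, but still much simpler than on ASM.

ASM is also a file system, but an encapsulated one: an ordinary operating system cannot access objects in ASM directly and has to go through ASM. The unit that makes up an ASM file is the AU (allocation unit), so a datafile consists of multiple AUs, and the AUs are distributed according to ASM's stripe size and stripe depth. A datafile lives in a single ASM disk group (DG) and cannot span DGs. Each DG consists of one or more disks, and the AUs of a datafile are spread across the disks of the DG by the striping algorithm. To recover a datafile from ASM, you therefore need to know the AU distribution, that is, the AU addresses that make up each datafile.

This information can be read from the DG's file directory (the file allocation table), so once you find a file's file directory entry, you have the AU distribution of that ASM file. If you are interested, study ASM file number 1: every file has a 4K entry of AU distribution information in file 1, and file 1 is at least 2 AUs long. Haibo wrote an article about reading datafiles from ASM in C. The version I saw only handles the file directory in the first AU (the 30M of AU distribution information); if the file number exceeds 256, the file's AU distribution information has to be read from the next AU. At this point, recovering a datafile is no longer much of a problem.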
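The arithmetic behind the description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 1 MB AU (which is what makes 256 of the 4K directory entries fit in one AU), one AU per extent, no mirroring, and no fine-grained striping. The names `file_directory_slot`, `locate_block`, and the `extent_map` list of (disk number, AU number) pairs are hypothetical, standing in for what would be read from the file directory entry.

```python
AU_SIZE = 1024 * 1024      # assumed 1 MB allocation unit
BLOCK_SIZE = 8 * 1024      # 8K database block, as in the text
DIR_ENTRY_SIZE = 4 * 1024  # 4K file directory entry per file, as in the text


def file_directory_slot(file_number, au_size=AU_SIZE):
    """Return (AU index within file 1, byte offset in that AU) of a file's entry."""
    entries_per_au = au_size // DIR_ENTRY_SIZE  # 256 with a 1 MB AU
    au_index = file_number // entries_per_au
    offset = (file_number % entries_per_au) * DIR_ENTRY_SIZE
    return au_index, offset


def locate_block(block_number, extent_map, au_size=AU_SIZE, block_size=BLOCK_SIZE):
    """Map a datafile block number to (disk, AU on that disk, byte offset in the AU).

    extent_map is a hypothetical list of (disk, au) pairs, one per file extent,
    in file order.
    """
    byte_offset = block_number * block_size
    extent_index = byte_offset // au_size
    disk, au = extent_map[extent_index]
    return disk, au, byte_offset % au_size


# Hypothetical extent map for a small datafile spread over three disks.
extent_map = [(0, 50), (1, 51), (2, 52)]

print(file_directory_slot(300))
print(locate_block(130, extent_map))
```

With these assumptions, file numbers 0 to 255 resolve to the first AU of file 1 and file 256 is the first entry in the next AU, which matches the limit described above. A real implementation would also have to handle mirrored extents and larger AU sizes.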

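The idea behind reading the objects in a datafile directly

This case is more troublesome. As a one-sentence hint: the data dictionary sits at fixed locations, and alongside it you build an AU distribution table similar to X$KFFXP.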

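Differences between the Enterprise, Standard, and Personal editions of Oracle Database

Oracle can actually be quite affordable, and Standard Edition is enough in many cases. The information below compares the Enterprise, Standard, and Personal editions of Oracle Database.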

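The main features that Standard Edition lacks compared with Enterprise Edition are:
1. Oracle Data Guard is not supported. (Customers who want high availability cannot choose Standard Edition.)
2. Some online operations, such as online index maintenance and online table redefinition, are not supported.
3. Some backup and recovery operations are restricted: block-level media recovery, parallel backup and recovery, duplexed backup sets, and so on are not supported.
4. Flashback: Flashback Table, Flashback Database, and Flashback Transaction Query are not supported in Standard Edition.
5. VPD (Virtual Private Database) is not supported.
6. FGA (Fine-grained auditing) is not supported.
7. Partitioning is not supported.
8. Data compression, such as table compression and partition compression, is not supported.
9. Bitmapped indexes and bitmapped join indexes are not supported (so Standard Edition is not an option for a data warehouse system).
10. Export of transportable tablespaces is not supported (note that only the export side is unsupported). Import of transportable tablespaces, including cross-platform import, is supported. In other words, if you choose Standard Edition, simply moving the database to another platform takes considerable effort.
11. Parallel operations are not supported, including parallel query, parallel DML, parallel index build, and parallel Data Pump export and import.
12. Streams is not supported, which removes another high-availability option (the 11.2 table below lists Streams for SE1/SE, but without capture from redo).
13. Advanced Replication in multimaster mode is not supported, which removes yet another high-availability option. Replication based on materialized views is still supported.
14. Connection Manager is not supported.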

Below are Oracle's official notes on the differences between the editions, by version:

Differences between Enterprise, Standard and Personal Edition

Starting point: Note 465455.1

Content of this note:

Server
Version  Note #           Abstract
====================================================================
8.1      NOTE.112591.1    Differences Between Enterprise, Standard and Personal Editions on Oracle 8.1
9.0      NOTE.161556.1    Differences between Oracle9i Personal, Standard and Enterprise on NT/2000
9.2      NOTE.269040.1    Differences Between Enterprise, Standard and Personal Editions on Oracle 9.2
10.1     NOTE.271886.1    Differences Between Different Editions of Oracle Database 10G
10.2     NOTE.465465.1    Differences Between Enterprise, Standard and Personal Editions on Oracle 10.2
11.1     Note.465460.1    Differences Between Enterprise, Standard and Personal Editions on Oracle 11.1
11.2     Note.1084132.1   Differences Between Enterprise, Standard and Personal Editions on Oracle 11.2

The 11.2 differences note is reproduced here:

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.1.0 to 11.2.0.3 [Release 11.2]
Information in this document applies to any platform.
***Checked for relevance on 25-Mar-2015***

DETAILS

Feature/Option | SE1 | SE | EE | Notes
====================================================================

High Availability
Oracle Fail Safe | Y | Y | Y | Windows only
Oracle RAC One Node | N | N | Y | Extra cost option
Oracle Data Guard - Redo Apply | N | N | Y |
Oracle Data Guard - SQL Apply | N | N | Y |
Oracle Data Guard - Snapshot Standby | N | N | Y |
Oracle Active Data Guard | N | N | Y | Extra cost option
Rolling Upgrades - Patch Set, Database, and Operating System | N | N | Y |
Online index rebuild | N | N | Y |
Online index-organized table organization | N | N | Y | ALTER TABLE MOVE ONLINE operations
Online table redefinition | N | N | Y | Using the DBMS_REDEFINITION package
Duplexed backup sets | N | N | Y |
Block change tracking for fast incremental backup | N | N | Y |
Unused block compression in backups | N | N | Y |
Block-level media recovery | N | N | Y |
Lost Write Protection | N | N | Y |
Automatic Block Repair | N | N | Y | Requires Active Data Guard option
Parallel backup and recovery | N | N | Y |
Tablespace point-in-time recovery | N | N | Y |
Trial recovery | N | N | Y |
Fast-start fault recovery | N | N | Y |
Flashback Table | N | N | Y |
Flashback Database | N | N | Y |
Flashback Transaction | N | N | Y |
Flashback Transaction Query | N | N | Y |
Oracle Total Recall | N | N | Y | Extra cost option

Scalability
Oracle Real Application Clusters | N | Y | Y | Extra cost with EE, included with SE
Automatic Workload Management | N | Y | Y | Requires Oracle Real Application Clusters

Performance
Client Side Query Cache | N | N | Y |
Query Results Cache | N | N | Y |
PL/SQL Function Result Cache | N | N | Y |
In-Memory Database Cache | N | N | Y | Extra cost option
Database Smart Flash Cache | N | N | Y | Solaris and Oracle Enterprise Linux only
Support for Oracle Exadata Storage Server Software | N | N | Y |

Security
Advanced Security Option | N | N | Y | Extra cost option
Oracle Label Security | N | N | Y | Extra cost option
Virtual Private Database | N | N | Y |
Fine-grained auditing | N | N | Y |
Oracle Database Vault | N | N | Y | Extra cost option
Secure External Password Store | N | N | Y |

Development Platform
SQLJ | Y | Y | Y | Requires Oracle Programmer
Oracle Developer Tools for Visual Studio .NET | Y | Y | Y | Windows only
Microsoft Distributed Transaction Coordinator support | Y | Y | Y | Windows only
Active Directory integration | Y | Y | Y | Windows only
Native .NET Data Provider - ODP.NET | Y | Y | Y | Windows only
.NET Stored Procedures | Y | Y | Y | Windows only

Manageability
Oracle Change Management Pack | N | N | Y | Extra cost option
Oracle Configuration Management Pack | N | N | Y | Extra cost option
Oracle Diagnostic Pack | N | N | Y | Extra cost option
Oracle Tuning Pack | N | N | Y | Extra cost option, also requires the Diagnostic Pack
Oracle Provisioning and Patch Automation Pack | N | N | Y | Extra cost option
Oracle Real Application Testing | N | N | Y | Extra cost option
Database Resource Manager | N | N | Y |
Instance Caging | N | N | Y |
SQL Plan Management | N | N | Y |

VLDB, Data Warehousing, Business Intelligence
Oracle Partitioning | N | N | Y | Extra cost option
Oracle OLAP | N | N | Y | Extra cost option
Oracle Data Mining | N | N | Y | Extra cost option
Oracle Data Profiling and Quality | N | N | Y | Extra cost option
Oracle Data Watch and Repair Connector | N | N | Y | Extra cost option
Oracle Advanced Compression | N | N | Y | Extra cost option
Basic Table Compression | N | N | Y |
Bitmapped index, bitmapped join index, and bitmap plan conversions | N | N | Y |
Parallel query/DML | N | N | Y |
Parallel statistics gathering | N | N | Y |
Parallel index build/scans | N | N | Y |
Parallel Data Pump Export/Import | N | N | Y |
In-memory Parallel Execution | N | N | Y |
Parallel Statement Queuing | N | N | Y |
Transportable tablespaces, including cross-platform | N | N | Y | Import of transportable tablespaces supported into SE, SE1, and EE
Summary management - Materialized View Query Rewrite | N | N | Y |
Asynchronous Change Data Capture | N | N | Y |

Integration
Basic Replication | Y | Y | Y | SE1/SE: read-only, updateable materialized view
Advanced Replication | N | N | Y | Multi-master replication
Oracle Streams | Y | Y | Y | SE1/SE: no capture from redo
Database Gateways | Y | Y | Y | Separate product license
Messaging Gateway | N | N | Y |

Networking
Oracle Connection Manager | N | N | Y | Available via a custom install of the Oracle Database client, usually installed on a separate machine. See "Oracle Connection Manager" for more information
Infiniband Support | N | N | Y |

Content Management
Oracle Spatial | N | N | Y | Extra cost option
Semantic Technologies (RDF/OWL) | N | N | Y | Requires Oracle Spatial and the Oracle Partitioning option

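Recommendations for setting _datafile_write_errors_crash_instance

Before 11.2.0.2 the database behaved as if this parameter were FALSE; from 11.2.0.2 on, the parameter exists and defaults to TRUE. It controls what happens when a datafile I/O error occurs: either the affected datafile is taken offline, or the instance crashes. When it is TRUE, an I/O error brings the database down outright, with errors such as ORA-63999. I ran into this error a while ago. Errors like this are usually traced along the I/O path, that is, between the storage, the system, and the network. Choosing FALSE or TRUE is only a decision about how widely the business is affected. So once you hit I/O errors, adjusting this parameter treats the symptom, not the cause; the root cause still has to be found in the layers of the I/O path. Setting it to FALSE can be considered, to keep an I/O error from widening its impact.

By the time this parameter comes up, you have usually already hit I/O errors, so it is also worth thinking about how to detect and limit block corruption, and about the following parameters: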

DB_ULTRA_SAFE
DB_BLOCK_CHECKING
DB_LOST_WRITE_PROTECT
DB_BLOCK_CHECKSUM

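I recommend setting db_ultra_safe to DATA_ONLY. It adds a little overhead on the database host in exchange for stronger data verification, so that problems are found promptly. Setting it automatically changes the other three parameters:

DB_BLOCK_CHECKING will be set to MEDIUM. (currently FALSE)
DB_LOST_WRITE_PROTECT will be set to TYPICAL. (currently TYPICAL)
DB_BLOCK_CHECKSUM will be set to FULL. (currently NONE)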
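To see at a glance which of the three parameters would actually change, the mapping can be written down as a small Python sketch. The DATA_ONLY row comes from the list above; the OFF and DATA_AND_INDEX rows are added from my reading of Oracle's reference documentation and should be verified against it. The function name `pending_changes` is hypothetical, and the sketch ignores the case where one of the three parameters has been set explicitly.

```python
# Values each DB_ULTRA_SAFE setting implies for the three related parameters.
# DATA_ONLY is taken from the text above; the other two rows are assumptions
# based on Oracle's reference documentation and should be double-checked.
ULTRA_SAFE_SETTINGS = {
    "OFF": {},
    "DATA_ONLY": {
        "DB_BLOCK_CHECKING": "MEDIUM",
        "DB_LOST_WRITE_PROTECT": "TYPICAL",
        "DB_BLOCK_CHECKSUM": "FULL",
    },
    "DATA_AND_INDEX": {
        "DB_BLOCK_CHECKING": "FULL",
        "DB_LOST_WRITE_PROTECT": "TYPICAL",
        "DB_BLOCK_CHECKSUM": "FULL",
    },
}


def pending_changes(current, ultra_safe):
    """Return {parameter: (current value, new value)} for values that would change."""
    target = ULTRA_SAFE_SETTINGS[ultra_safe]
    return {
        name: (current.get(name), value)
        for name, value in target.items()
        if current.get(name) != value
    }


# Current values as reported in the list above.
current = {
    "DB_BLOCK_CHECKING": "FALSE",
    "DB_LOST_WRITE_PROTECT": "TYPICAL",
    "DB_BLOCK_CHECKSUM": "NONE",
}

for name, (old, new) in pending_changes(current, "DATA_ONLY").items():
    print(f"{name}: {old} -> {new}")
```

For the values listed above, only DB_BLOCK_CHECKING and DB_BLOCK_CHECKSUM would change, since DB_LOST_WRITE_PROTECT is already TYPICAL.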

Below is advice on this parameter from an engineer abroad:

Param ‘_datafile_write_errors_crash_instance’ , TRUE or FALSE?

Since 11.2.0.2 there is a new parameter, "_datafile_write_errors_crash_instance", that controls whether the instance crashes when a write error on a datafile occurs. But should I use it or not? The official text on this parameter:

This fix introduces a notable change in behaviour in that
from 11.2.0.2 onwards an I/O write error to a datafile will
now crash the instance.

Before this fix I/O errors to datafiles not in the system tablespace
offline the respective datafiles when the database is in archivelog mode.
This behavior is not always desirable. Some customers would prefer
that the instance crash due to a datafile write error.

This fix introduces a new hidden parameter to control if the instance
should crash on a write error or not:
_datafile_write_errors_crash_instance

With this fix:
If _datafile_write_errors_crash_instance = TRUE (default) then
any write to a datafile which fails due to an IO error causes
an instance crash.

If _datafile_write_errors_crash_instance = FALSE then the behaviour
reverts to the previous behaviour (before this fix) such that
a write error to a datafile offlines the file (provided the DB is
in archivelog mode and the file is not in SYSTEM tablespace in
which case the instance is aborted)
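The rules in the quoted text can be restated as a small decision function. This is a Python sketch for illustration only, not Oracle code. The quoted text spells out the outcome for TRUE, and for FALSE with archivelog mode and a non-SYSTEM file; treating FALSE in noarchivelog mode as an instance crash is an assumption of this sketch, and the function name `write_error_outcome` is hypothetical.

```python
def write_error_outcome(crash_instance, archivelog, in_system_tablespace):
    """Return what happens on a datafile write error under the quoted rules.

    crash_instance models _datafile_write_errors_crash_instance.
    """
    if crash_instance:
        # TRUE (the default from 11.2.0.2): any failed datafile write crashes.
        return "instance crash"
    if archivelog and not in_system_tablespace:
        # FALSE: previous behaviour, the affected file is taken offline.
        return "datafile offline"
    # SYSTEM tablespace file (per the quoted text) or noarchivelog mode
    # (assumption of this sketch): the instance still goes down.
    return "instance crash"


print(write_error_outcome(True, True, False))
print(write_error_outcome(False, True, False))
print(write_error_outcome(False, True, True))
```

The only combination that keeps the instance up is FALSE with archivelog mode and a file outside the SYSTEM tablespace.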

When you ask Oracle for advice, you get the following answer:

20+ years ago a feature was added to Oracle to offline a datafile when there was an error writing a dirty buffer to it and it was not part of the system tablespace. At that time it made sense to do this since neither RAC nor even OPS was implemented and storage arrays did not exist. Then the most likely cause of an I/O error was a problem with the direct attached disk drive holding the datafile. By offlining the datafile the database might be able to continue running. Customers assumed that a disk failure would require restoring a backup and doing a media recovery so taking the file offline might improve availability. High availability was not expected.

Today almost all customers use highly available storage arrays accessible from multiple hosts. Now most I/O errors are either transient or are local to the host that encounters them. Real disk failures are hidden by the storage array redundancy. Customers expect a disk failure to have no effect on the operation of the database.

Unfortunately the code to offline a datafile on an I/O error is still there. The effect is that an error on one node in a cluster offlines the datafile and usually takes down the entire application on all nodes or even crashes all instances if the problem is with an undo tablespace. For example dismounting a file system on one node in a cluster makes that node get I/O errors for the files on that file system. This makes a mistake on one node take down the entire cluster.

Offlining a datafile on a write error is even a problem with single instance. Most I/O errors today will go away if the database is restarted on another machine or if the current machine is rebooted. However if the I/O error took a datafile offline, then the administrator must do a media recovery to make the application function again. This is an unusual procedure that takes awhile.

If the database instances do not crash it takes longer for the administrator to find out that the application is not working even though the database appears to be up and running. This is a problem with both RAC and single instance.

Question: One concern is that a failed datafile write to a non-critical tablespace will bring down the database when it occurs in the only open instance.

It is true that there may be some situations where taking the file offline would be better. On the other hand there are cases where crashing in single instance is better because rebooting the server or restarting the instance will bring it up sooner with no need for manual intervention. Since we have to choose without knowing much about the system we have to base our choice on the odds of the failure being one case or the other. Twenty years ago a datafile was on one disk, almost all I/O errors were disk failures and a disk failure always meant doing media recovery. In that situation taking the datafile offline was clearly the right thing to do, even if the tablespace was critical to the application – it was going to need media recovery in any case.

Today systems are much different.

- Storage arrays and mirroring mean that disk failures almost never require media recovery. I/O write errors usually stop happening when the system is reinitialized.
- Many customers have mechanisms like CRS to automatically restart the database, possibly on a different node.

Now it is much more likely that restarting the instance will resolve the problem without doing any media recovery, and it will happen automatically. The chance that the application can continue running with the offline datafile has always been slight, but when media recovery was going to be required anyway there was no harm in trying to offline the file. Now there is a lot of harm in offlining the file since it prevents automatic recovery and requires an administrator to perform tasks he is unfamiliar with. Today crashing the instance has a better chance of getting the application running sooner.

So Oracle advises leaving the parameter at its default (TRUE) and using the new behaviour. The system is then better able to recover without DBA intervention. If your experience differs, feel free to comment on this post.