
Redo

A quick script to check recent redo generation and related waits, to help decide whether the redo size needs to be increased.

Usually, to decide whether a database's online redo logs need to be enlarged, we look at how frequently the online redo logs switch, check the alert log for "checkpoint not complete" messages, and check whether the log file wait events in AWR look normal, then decide whether to increase the log size or the number of log groups. Here is a very nice script that analyzes the average redo generated per hour over the past days, together with LGWR wait statistics; it also serves as a useful summary of redo generation.

The output looks like the following screenshot:
(screenshot: redo_adjust)

The last column is the suggested amount of redo to add; the report also includes checkpoint waits, overall CPU usage, physical read and write consumption, log wait events, process counts and other statistics.

The script itself is as follows:

 

set pagesize 50000 linesize 300

col instance_number             format 99                   head 'In|st'
col tim                                                     head 'Period end'
col cpu_sec                     format 999,999,999.9        head 'CPU used|sec'
col phy_reads                   format 999,999,999          head 'Physical|reads'
col phy_writes                  format 999,999,999          head 'Physical|writes'
col cr_served                   format 999,999,999          head 'CR blocks|served'
col current_served              format 999,999,999          head 'CUR blocks|served'
col redo_mb                     format 999,999,999.9        head 'Redo, MB'
col processes                   format 999,999              head 'Proc|esses'
col avg_df_seq                  format 9,999.9              head 'Avg 1|read'
col avg_df_scat                 format 9,999.9              head 'Avg N|read'
col redo_diff_to_md_pct         format 999,999              head 'Redo Diff|to median, %'
col avg_lfpw                    format 999.99               head 'Avg|LFPW'
col avg_log_sync                format 9,999.99             head 'Avg Log|Sync, ms'
col log_ckpt_sec                format 999,999              head 'CKPT|waits, s'
col redo_needed                 format 999,999              head 'Redo to|Add, MB'

compute max of cpu_sec          on instance_number
compute max of phy_reads        on instance_number
compute max of phy_writes       on instance_number
compute max of cr_served        on instance_number
compute max of current_served   on instance_number
compute max of redo_needed      on instance_number
compute max of log_ckpt_sec     on instance_number
compute max of avg_log_sync     on instance_number
compute max of avg_lfpw         on instance_number
compute max of redo_mb          on instance_number
compute max of processes        on instance_number
compute max of avg_df_seq       on instance_number
compute max of avg_df_scat      on instance_number

break on instance_number skip page

with t_interval as
(
 select /*+ inline */ sysdate-30 begin, sysdate as end from dual
)
select
  stats.dbid                                                                dbid,
  stats.instance_number                                                     instance_number,
  to_char(stats.begin_interval_time, 'YYYYMMDD HH24MI')                     tim,
  stats.cpu_used / 100                                                      cpu_sec,
  stats.phy_reads                                                           phy_reads,
  stats.phy_writes                                                          phy_writes,
  stats.cr_served                                                           cr_served,
  stats.current_served                                                      current_served,
  stats.redo_size / 1024 / 1024                                             redo_mb,
  procs.current_utilization                                                 processes,
--
  waits.df_seq_micro / 1000 / nullif(waits.df_seq_waits,0)                  avg_df_seq,
  waits.df_scat_micro / 1000 / nullif(waits.df_scat_waits,0)                avg_df_scat,
  (stats.redo_size - stats.md_redo_size) * 100 / stats.md_redo_size         redo_diff_to_md_pct,
  stats.redo_write_time*10/stats.redo_writes                                avg_lfpw,
  waits.log_sync_micro/nullif(waits.log_sync_waits, 0) / 1000               avg_log_sync,
  waits.log_ckpt_micro/1e6                                                  log_ckpt_sec,
  ( stats.redo_size /
     ( waits.snap_interval * 86400 ) ) *
   ( waits.log_ckpt_micro/1e6 ) / 1024 / 1024                               redo_needed,
  stats.is_restart
from
  (
   select
     snap_id,
     begin_interval_time,
     snap_interval,
     instance_number,
     dbid,
     log_sync_micro,
     log_sync_waits,
     log_ckpt_micro,
     log_ckpt_waits,
     df_seq_micro,
     df_seq_waits,
     df_scat_micro,
     df_scat_waits,
     direct_micro,
     direct_waits,
     median(log_sync_micro/nullif(log_sync_waits, 0)) over (partition by dbid, instance_number) md_log_sync_micro
   from
   (
      select
        snap_id,
        begin_interval_time,
        instance_number,
        dbid,
        max(snap_interval) snap_interval,
        max(decode(event_name, 'log file sync',                            wait_micro))    log_sync_micro,
        max(decode(event_name, 'log file sync',                            total_waits))   log_sync_waits,
        max(decode(event_name, 'log file switch (checkpoint incomplete)',  wait_micro))    log_ckpt_micro,
        max(decode(event_name, 'log file switch (checkpoint incomplete)',  total_waits))   log_ckpt_waits,
        max(decode(event_name, 'db file sequential read',                  wait_micro))    df_seq_micro,
        max(decode(event_name, 'db file sequential read',                  total_waits))   df_seq_waits,
        max(decode(event_name, 'db file scattered read',                   wait_micro))    df_scat_micro,
        max(decode(event_name, 'db file scattered read',                   total_waits))   df_scat_waits,
        max(decode(event_name, 'direct path read',                         wait_micro))    direct_micro,
        max(decode(event_name, 'direct path read',                         total_waits))   direct_waits
      from
      (
        select
          e.snap_id,
          e.instance_number,
          e.dbid,
          sn.begin_interval_time,
          cast(begin_interval_time as date) - cast(lag(begin_interval_time) over (partition by e.dbid, e.instance_number, e.event_name order by sn.begin_interval_time) as date) snap_interval,
          sn.startup_time,
          e.event_name,
          case when (sn.begin_interval_time >= sn.startup_time and lag(sn.begin_interval_time) over (partition by e.dbid, e.instance_number, e.event_name order by sn.begin_interval_time) < sn.startup_time)
            then e.time_waited_micro
            else e.time_waited_micro - lag(e.time_waited_micro) over (partition by e.dbid, e.instance_number, e.event_name order by sn.begin_interval_time)
          end wait_micro,
          case when (sn.begin_interval_time >= sn.startup_time and lag(sn.begin_interval_time) over (partition by e.dbid, e.instance_number, e.event_name order by sn.begin_interval_time) < sn.startup_time)
            then e.total_waits
            else e.total_waits - lag(e.total_waits) over (partition by e.dbid, e.instance_number, e.event_name order by sn.begin_interval_time)
          end total_waits
        from
          dba_hist_system_event e,
          dba_hist_snapshot     sn,
          t_interval            t
        where
          sn.snap_id = e.snap_id and
          sn.dbid = e.dbid and
          sn.instance_number = e.instance_number and
          sn.begin_interval_time between t.begin and t.end and
          e.event_name in (
            'log file sync',
            'log file switch (checkpoint incomplete)',
            'db file sequential read',
            'db file scattered read',
            'direct path read'
          )
      )
      group by dbid, instance_number, begin_interval_time, snap_id
    )
  ) waits,
  (
    select
      snap_id,
      begin_interval_time,
      instance_number,
      dbid,
      redo_size,
      redo_write_time,
      redo_writes,
      is_restart,
      cpu_used,
      phy_reads,
      phy_reads_cache,
      phy_writes,
      phy_writes_cache,
      cr_served,
      current_served,
      median(redo_size) over (partition by dbid, instance_number) md_redo_size
    from
    (
      select
        snap_id,
        begin_interval_time,
        instance_number,
        dbid,
        max(is_restart) is_restart,
        max(decode(stat_name, 'redo size',                       stat_diff)) redo_size,
        max(decode(stat_name, 'redo write time',                 stat_diff)) redo_write_time,
        max(decode(stat_name, 'redo writes',                     stat_diff)) redo_writes,
        max(decode(stat_name, 'CPU used by this session',        stat_diff)) cpu_used,
        max(decode(stat_name, 'physical read total IO requests', stat_diff)) phy_reads,
        max(decode(stat_name, 'physical reads cache',            stat_diff)) phy_reads_cache,
        max(decode(stat_name, 'physical write total IO requests',stat_diff)) phy_writes,
        max(decode(stat_name, 'physical writes from cache',      stat_diff)) phy_writes_cache,
        max(decode(stat_name, 'gc cr blocks served',             stat_diff)) cr_served,
        max(decode(stat_name, 'gc current blocks served',        stat_diff)) current_served
      from
      (
        select
          stats.snap_id,
          stats.instance_number,
          stats.dbid,
          sn.begin_interval_time,
          sn.startup_time,
          stats.stat_name,
          case when (sn.begin_interval_time >= sn.startup_time and lag(sn.begin_interval_time) over (partition by stats.dbid, stats.instance_number, stats.stat_id order by sn.begin_interval_time) < sn.startup_time)
            then stats.value
            else stats.value - lag(stats.value) over (partition by stats.dbid, stats.instance_number, stats.stat_id order by stats.snap_id)
          end stat_diff,
          case when (sn.begin_interval_time >= sn.startup_time and lag(sn.begin_interval_time) over (partition by stats.dbid, stats.instance_number, stats.stat_id order by sn.begin_interval_time) < sn.startup_time)
            then 'Yes'
          end is_restart
        from
          dba_hist_sysstat   stats,
          dba_hist_snapshot  sn,
          t_interval         t
        where
          sn.snap_id = stats.snap_id and
          sn.dbid = stats.dbid and
          sn.instance_number = stats.instance_number and
          sn.begin_interval_time between t.begin and t.end and
          stats.stat_name in (
            'redo size',
            'redo write time',
            'redo writes',
            'CPU used by this session',
            'physical read total IO requests',
            'physical reads cache',
            'physical write total IO requests',
            'physical writes from cache',
            'gc cr blocks served',
            'gc current blocks served'
          )
      )
      group by dbid, instance_number, begin_interval_time, snap_id
    )
  ) stats,
  (
    select
      stats.snap_id,
      stats.instance_number,
      stats.dbid,
      stats.resource_name,
      stats.current_utilization
    from
      dba_hist_resource_limit stats,
      dba_hist_snapshot       sn,
      t_interval              t
    where
      sn.snap_id = stats.snap_id and
      sn.dbid = stats.dbid and
      sn.instance_number = stats.instance_number and
      sn.begin_interval_time between t.begin and t.end and
      stats.resource_name = 'processes'
  ) procs
where
  waits.dbid = stats.dbid and
  waits.instance_number = stats.instance_number and
  waits.snap_id = stats.snap_id and
  waits.dbid = procs.dbid and
  waits.instance_number = procs.instance_number and
  waits.snap_id = procs.snap_id
order by
 stats.dbid, stats.instance_number, stats.begin_interval_time
;

The script is adapted from Pythian.
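As a lighter-weight companion check (not part of the original script), the log-switch frequency mentioned at the top can be eyeballed straight from v$log_history; a minimal sketch:

```sql
-- Hourly count of log switches over the last 7 days.
-- Frequent switches under steady load are the classic hint
-- that the online logs are too small.
select to_char(first_time, 'YYYY-MM-DD') day,
       to_char(first_time, 'HH24')       hr,
       count(*)                          switches
from   v$log_history
where  first_time > sysdate - 7
group  by to_char(first_time, 'YYYY-MM-DD'), to_char(first_time, 'HH24')
order  by 1, 2;
```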

3. Understanding the redo log file header, redo block header and redo record header

Yesterday someone asked me about the difference between the redo log file header, the redo block header and the redo record header. The three are indeed easy to confuse, so let's briefly go over these concepts; they matter.
The analysis below is mostly in hexadecimal format, with the occasional decimal dump format mixed in.
I. The redo log file header

The redo log file header, i.e. the head of a redo log file, actually contains two parts: a file header block and a redo log header block. If that still sounds vague, look at the figure below:

(figure: the first two blocks of a redo log file)

As shown above, the first two blocks of a redo log file are headers: the first is the file header block, the second is the redo log header block. Below is a rough analysis of what each block stores.
The content of the first block looks like this (opened as hex in UltraEdit):

(screenshot: log1, hex view of the file header block)

The decimal dump format:

DUMP OF REDO FROM FILE '/oradata/ora10g/lu10g/redo03.log'
 Opcodes *.*
 RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
 SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
 Times: creation thru eternity
 FILE HEADER:
	Compatibility Vsn = 169870080=0xa200300
	Db ID=2561616225=0x98af2961, Db Name='LU10G'
	Activation ID=2561579617=0x98ae9a61
	Control Seq=564=0x234, File size=102400=0x19000
	File Number=3, Blksiz=512, File Type=2 LOG
 descrip:"Thread 0001, Seq# 0000000009, SCN 0x00000099c24d-0x0000009a40c4"
 thread: 1 nab: 0x16910 seq: 0x00000009 hws: 0x4 eot: 0 dis: 0
 resetlogs count: 0x31bca923 scn: 0x0000.000716f7 (464631)
 resetlogs terminal rcv count: 0x0 scn: 0x0000.00000000
 prev resetlogs count: 0x268ea86b scn: 0x0000.00000001 (1)
 prev resetlogs terminal rcv count: 0x0 scn: 0x0000.00000000
 Low  scn: 0x0000.0099c24d (10076749) 04/15/2014 12:08:04
 Next scn: 0x0000.009a40c4 (10109124) 04/16/2014 06:56:19
 Enabled scn: 0x0000.000716f7 (464631) 12/17/2013 23:00:51
 Thread closed scn: 0x0000.0099c24d (10076749) 04/15/2014 12:08:04
 Disk cksum: 0xeb7d Calc cksum: 0xeb7d
 Terminal recovery stop scn: 0x0000.00000000
 Terminal recovery  01/01/1988 00:00:00
 Most recent redo scn: 0x0000.00000000
 Largest LWN: 2134 blocks
 End-of-redo stream : No
 Unprotected mode
 Miscellaneous flags: 0x0
 Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000

In the UltraEdit hex view each row is 16 bytes and there are 32 rows in total; this is the first block, the file header block. It contains limited information, so let's go through it:
The first two bytes are 0x22; these two bytes are Oracle's file-type signature. A data file, for example, starts with 0xA2, while 0x22 here identifies a redo log file. The rest of the first row carries little of interest, though the 0xFFC0 there appears to be related to the base part of the SCN. In the second row of 16 bytes, bytes 21-22 hold 0x0200, which is 512 in decimal: the block size. Bytes 25-28 hold 0x00019000, which is 102400 in decimal: the number of redo blocks, i.e. the size of this file expressed in blocks. The 0x7A7B7C7D that follows is the file magic number, a marker Oracle uses to quickly recognize its files. And that is all the information in the first block.

Next comes the second block, the redo log header block, whose content starts at row 33 of the file:
(screenshot: log2, hex view of the redo log header block)

To be continued later; this takes quite some time.

How to Dump Redo Log File Information

This article is excerpted from MOS note 28989.1, a must-read for anyone parsing redo; it is reproduced below in its original English.
====================================================================
You are working with Oracle Technical Support. As part of the diagnostic
process, you have been asked to take a dump of the redo log files. The
information in the logs is often used to help diagnose corruption issues.

The following commands will be used in this process:

1. The 'alter session' command is used to dump redo headers.

2. Use the 'alter system dump logfile' to dump log file contents.

This command requires the 'ALTER SYSTEM' system privilege. The database can be in
mount, nomount or open state when the command is issued. An online log file
or an archived log file can be dumped. It is even possible to dump a
file from another database, as long as the operating systems are the same.

Output from the command is put into the session's trace file.

The following ways of dumping a redo log file are covered:

1. To dump records based in DBA (Data Block Address)
2. To dump records based on RBA (Redo Block Address)
3. To dump records based on SCN
4. To dump records based on time
5. To dump records based on layer and opcode
6. Dump the file header information
7. Dump an entire log file:

1. To dump records based on DBA (Data Block Address)
————————————————–

This will dump all redo records for the range of data
blocks specified for a given file # and block # range.

From sqlplus (sqldba or svrmgr for older versions), issue the following command:

ALTER SYSTEM DUMP LOGFILE 'filename'
DBA MIN fileno . blockno
DBA MAX fileno . blockno;

Example:
========
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf'
DBA MIN 5 . 31125
DBA MAX 5 . 31150;

This will cause all the changes to the specified range of data blocks to be
dumped to the trace file. In the example given, all redo records for file #5,
blocks 31125 thru 31150 are dumped.

Note
====
For 10g:
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf'
DBA MIN 5 . 31125 DBA MAX 5 . 31150;

will raise:
ORA-01963: Must specify a block number

In 10g we need to skip the dot '.' while doing the redo dumps:
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf'
DBA MIN 5 31125 DBA MAX 5 31150;

@ As per Bug 3536989

2. To dump records based on RBA (Redo Block Address)
————————————————-

This will dump all redo records for the range of redo
addresses specified for the given sequence number and block number.

Syntax:
ALTER SYSTEM DUMP LOGFILE 'filename'
RBA MIN seqno . blockno
RBA MAX seqno . blockno;

Example:
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf'
RBA MIN 2050 . 13255
RBA MAX 2255 . 15555;

3. To dump records based on SCN
—————————-

Using this option will cause redo records owning changes within the SCN range
specified to be dumped to the trace file.

ALTER SYSTEM DUMP LOGFILE 'filename'
SCN MIN minscn
SCN MAX maxscn;

Example:
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf'
SCN MIN 103243
SCN MAX 103294;

If the purpose is to check the dumpfile you can rather do the following:
SQL> ALTER SYSTEM DUMP LOGFILE 'filename' SCN MIN 1 SCN MAX 1;

If the above completes successfully it ensures no issues with the archivelog.
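Building on the SCN MIN 1 SCN MAX 1 validation trick, here is a hedged sketch (not from the note) that spools one such command per recent archived log; the file name validate_arch.sql is arbitrary:

```sql
set heading off feedback off trimspool on
spool validate_arch.sql
-- One validation command per archivelog completed in the last day
select 'ALTER SYSTEM DUMP LOGFILE ''' || name || ''' SCN MIN 1 SCN MAX 1;'
from   v$archived_log
where  completion_time > sysdate - 1
and    name is not null;
spool off
-- then run: @validate_arch.sql
-- each command that completes cleanly indicates that archivelog is readable
```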

4. To dump records based on time.
——————————

Using this option will cause redo records created within the time range
specified to be dumped to the trace file.

From sqlplus (sqldba or svrmgr for older versions), issue the following command:

ALTER SYSTEM DUMP LOGFILE 'filename'
TIME MIN value
TIME MAX value;

Example:
========
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf'
TIME MIN 299425687
TIME MAX 299458800;

Please Note: the time value is given in REDO DUMP TIME

@See Note 34026.1 for SQL script on converting date/time to REDO DUMP time.

5. To dump records based on layer and opcode.
——————————————

LAYER and OPCODE are used to dump all log records for a particular type of
redo record, such as all dropped row pieces.

From sqlplus (sqldba or svrmgr for older versions), issue the following command:

ALTER SYSTEM DUMP LOGFILE 'filename'
LAYER value
OPCODE value;

Example:
========
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf'
LAYER 11
OPCODE 3;

@Note: See Note 29733.1 for the list of LAYER and OPCODE values and their
@meaning.

6. Dump the file header information:
———————————

This will dump file header information for every
online redo log file.

From sqlplus (sqldba or svrmgr for older versions), issue the following command:

alter session set events 'immediate trace name redohdr level 10';

For dumping an archivelog header, issue the following command:

ALTER SYSTEM DUMP LOGFILE ‘filename’ RBA MIN 1 1 RBA MAX 1 1;

7. Dump an entire log file:
————————

From sqlplus (sqldba or svrmgr for older versions), issue the following command:

ALTER SYSTEM DUMP LOGFILE 'filename';

Please note:
Fully qualify the filename, and include the single quotes.

Example:
========
ALTER SYSTEM DUMP LOGFILE 'u01/oracle/V7323/dbs/arch1_76.dbf';

The dump of the logfile will be written into a trace file in the udump destination.
Use the command 'show parameters dump' within an sqlplus session.
The output will show the location for the udump destination where
the trace file exists.
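On 11g and later, v$diag_info is a more direct way than 'show parameters dump' to find the trace file; a sketch (not part of the note):

```sql
-- 10g and earlier: the dump lands under user_dump_dest
show parameter user_dump_dest

-- 11g+: the exact trace file for the current session
select value
from   v$diag_info
where  name = 'Default Trace File';
```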

@ References:
@ ===========
@ Note 28989.1

4. Parsing redo: change vectors

Disclaimer: I write this series purely out of personal interest and it represents only my own views; corrections are welcome.
Before getting into change vectors, I first need to distinguish between the redo log file header, the redo block header and the redo record header.
A change vector is, as the name says, a vector of change: apart from its header, what a redo record stores is change vectors.

2. Parsing redo: records

As mentioned in the previous section, a redo log file contains two parts: the redo file header and the redo records. Both are analyzed here.
1.2.1 Redo logfile header
The redo log file header is 512 bytes; in a dump of a log file you can find Blksiz=512 in the file header. The log file header mainly records the current log's sequence number, RBA, SCN, log size, log number, DBID, low RBA, high RBA and similar information (see below).

Opcodes *.*
RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
Times: creation thru eternity
FILE HEADER:
Compatibility Vsn = 185597952=0xb100000
Db ID=3093517514=0xb86354ca, Db Name='11GOCMDB'
Activation ID=3093532362=0xb8638eca
Control Seq=1033=0x409, File size=102400=0x19000
File Number=2, Blksiz=512, File Type=2 LOG
descrip:"Thread 0001, Seq# 0000000014, SCN 0x0000000c76be-0xffffffffffff"
thread: 1 nab: 0xffffffff seq: 0x0000000e hws: 0x2 eot: 1 dis: 0
resetlogs count: 0x2f85bc4c scn: 0x0000.00081bca (531402)
prev resetlogs count: 0x27a409d9 scn: 0x0000.00000001 (1)
Low scn: 0x0000.000c76be (816830) 11/20/2013 23:37:48
Next scn: 0xffff.ffffffff 01/01/1988 00:00:00
Enabled scn: 0x0000.00081bca (531402) 10/21/2012 22:27:24
Thread closed scn: 0x0000.000c76be (816830) 11/20/2013 23:37:48
Disk cksum: 0xd4a3 Calc cksum: 0xd4a3
Terminal recovery stop scn: 0x0000.00000000
Terminal recovery 01/01/1988 00:00:00
Most recent redo scn: 0x0000.00000000
Largest LWN: 0 blocks
End-of-redo stream : No
Unprotected mode
Miscellaneous flags: 0x800000
Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000

1.2.2 Redo record header
A redo record consists of two parts: the redo record header and the change vectors. I split the change vectors out into their own chapter, because Oracle changed this area significantly around 10g (IMU) and the underlying mechanism differs (described in 1.1.5); this part covers only the redo record header.

The redo record header looks roughly like this:

REDO RECORD - Thread:1 RBA: 0x00000e.00000003.0010 LEN: 0x02a4 VLD: 0x05
SCN: 0x0000.000c76c2 SUBSCN: 1 11/20/2013 23:37:49

As you can see, the redo record header mainly contains the RBA, LEN, SCN, SUBSCN and THREAD fields. Since the RBA and the SCN are the most important identifying parts of a redo record, both are analyzed in detail below.

1. RBA (redo byte address)
In a redo log file, every redo record has an RBA recording the record's position. The RBA is 10 bytes and consists of three parts:

the log file sequence number (4 bytes)
the log file block number (4 bytes)
the starting byte offset of the redo record within the log block (2 bytes)

RBA: 0x00000e.00000003.0010

Log file sequence number: 0x00000e
Log file block number: 00000003
Byte offset: 0010
The commonly used forms of RBA are:
LRBA (LOW RBA)

The LRBA (Low Redo Block Address) of a dirty block is the address, in the redo log file, of the redo entry generated when the block was first modified in the buffer cache (Low refers to that first modification).
[LRBA0][LRBA1][LRBA2][LRBA3][LRBA4][LRBA5][LRBA6]

HRBA (HIGH RBA)

The HRBA is the position, in the redo log file, of the redo record generated the most recent time a dirty block in the buffer cache was modified.

checkpoint RBA

When a checkpoint event occurs, the checkpoint process records the RBA of the redo being written at that moment; this RBA is called the checkpoint RBA. The dirty buffers protected by the redo between the previous checkpoint RBA and the current checkpoint RBA are the ones that will next be written to the data files.
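The distance between the checkpoint RBA and the tail of the redo stream can be watched through v$instance_recovery; a minimal sketch (the column interpretations follow the standard view, not the original article):

```sql
-- actual_redo_blks: redo blocks between the checkpoint RBA and the current
--                   RBA, i.e. what instance recovery would replay right now
-- log_file_size_redo_blks: blocks needed so a log switch never waits on
--                   "checkpoint not complete"
select actual_redo_blks,
       target_redo_blks,
       log_file_size_redo_blks,
       estimated_mttr
from   v$instance_recovery;
```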

2. SCN (system change number)
Also known as the system commit number. It is 6 bytes long, in the following format:

SCN: 0x0000.000c76c2

It consists of two parts:

WRAP: 0x0000 (2 bytes)
BASE: 000c76c2 (4 bytes)

The base part records the database's latest SCN and is usually kept in the record header; in RAC it is the global SCN, and for distributed transactions it is the high SCN.
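The current SCN, whose base part is what shows up in redo record headers, can be read directly; a quick sketch:

```sql
-- 10g+: current SCN straight from v$database
select current_scn from v$database;

-- or via the flashback package
select dbms_flashback.get_system_change_number from dual;
```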

More detail on the SCN is available at
http://czmmiao.iteye.com/blog/1010267

1. My understanding of redo

As the opening of this series, I want to say a few things drawn from the experience I accumulated while getting this far into learning Oracle.
Any technical study has a purpose, and studying Oracle is no different. What is your goal when you reach redo? Let me pose a few questions here as directions to think about:

1. What is the redo log?
2. What is redo for?
3. Which processes are directly related to redo?
4. Which area of memory is redo related to?
5. Which dictionary objects or views relate to redo, and what kinds of information does redo record?
6. What is the structure of a redo log?
7. How do you measure the redo generated by a transaction in the database?

For the concept of the redo log, see the introduction in section 1.1.3.3 of the Concepts guide. Here I will set out my understanding of redo's main characteristics and how it works.
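Question 7 above can be answered with the session-level 'redo size' statistic; a hedged sketch (the DML statement and table are placeholders, not from the original text):

```sql
-- Snapshot 'redo size' for this session before the transaction...
select b.value
from   v$mystat b, v$statname n
where  b.statistic# = n.statistic#
and    n.name = 'redo size';

-- ...run the transaction being measured (placeholder DML)...
update scott.emp set sal = sal * 1.1;
commit;

-- ...and snapshot again: the difference between the two values is
-- the redo the transaction generated.
select b.value
from   v$mystat b, v$statname n
where  b.statistic# = n.statistic#
and    n.name = 'redo size';
```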

I. Introduction to redo

The redo log records every change operation made to the database; these change records are called redo entries. Redo is used mainly for recovery (instance recovery and media recovery) and in recovery and data synchronization environments such as Data Guard, GoldenGate, and Streams.

II. Three key properties of redo entries:

2.1. Within an instance, the redo entry for a change is generated in the log buffer before the block is loaded into the buffer cache or the actual block change is made.
2.2. Before a changed block is written to a datafile, the redo entries of the related transaction must first be written to the redo log file by the LGWR process.
2.3. When a transaction commits, its redo entries in the log buffer are written to the redo log file.

(The mechanism here involves checkpoints and incremental checkpoints; the checkpoint installment of this series will explain in detail how Oracle's redo design protects against crashes.)
III. How redo entries are generated:
3.1: The log buffer write path in 10g (using the 10g feature IMU, in-memory undo)

1. The transaction starts by acquiring a matched pair of memory structures in private memory: one from x$ktifp (the in-memory undo pool, holding the undo change vectors) and one from x$kcrfstrand (the private redo strands, holding the forward change vectors).
2. Each affected block is marked "has private redo" (but the block itself is not changed).
3. Each undo change vector is written into the chosen in-memory undo pool.
4. Each redo change vector is written into the chosen private redo strand.
5. The transaction completes by concatenating the two memory structures into a single redo change record (in the shared pool).
6. The redo change record is copied into the redo stream and the changes are applied to the blocks.

(figure: the 10g IMU redo generation path)
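Whether the IMU path is actually being used can be inferred from v$sysstat; a sketch (statistic names as they appear in 10g+, not from the original article):

```sql
-- 'IMU commits' counts transactions that stayed on the in-memory undo
-- path; 'IMU Flushes' counts transactions forced back to the pre-10g path.
select name, value
from   v$sysstat
where  name in ('IMU commits', 'IMU Flushes', 'IMU undo allocation size');
```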

3.2: How a redo record was generated before 10g:
  The data block allocates an ITL in its block header.
  The undo change vector is generated (in the PGA).
  The data change vector is generated (in the PGA).
  The undo change vector and the data change vector are combined into a single change record.
  The redo copy latch is acquired.
  The redo allocation latch is acquired.
  The change record is written into the log buffer.
  The undo change vector is applied.
  The data change vector is applied.
(figure: the pre-10g redo generation path)

IV. Redo log storage
1. Redo log files are stored on the system; the redo block size they use depends on the operating system block size, usually 512 bytes, and the storage format depends on the operating system and the database version.
2. Every redo log file contains a log file header followed by redo entries.
3. Redo entries are written to the redo log file sequentially.
The basic structure of a redo log file is shown below:
(figure: basic layout of a redo log file)
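The per-group layout described above can be checked from v$log and v$logfile; a sketch (the blocksize column exists from 11gR2 onward, so drop it on older releases):

```sql
select l.group#, l.thread#, l.sequence#,
       l.bytes/1024/1024 size_mb,
       l.blocksize,   -- redo block size, matching Blksiz in the header dump
       f.member
from   v$log l, v$logfile f
where  l.group# = f.group#
order  by l.group#;
```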

A brief look at Oracle's recovery mechanism and its recovery techniques (work in progress)

I have long meant to write down how redo and undo cooperate during recovery, and kept putting it off; now that I finally have a little time, I am pulling this series together.
For the log dump operations used in these articles, refer to the article
How to Dump Redo Log File Information

There will be roughly three major series, divided as follows:

1. On redo

1.1. Understanding redo and IMU
1.2. What are redo records
1.3. Understanding the redo log file header, redo block header and redo record header
1.4. What are change vectors
1.5. Understanding row and index operations
1.6. Understanding how direct loads and NOLOGGING appear in redo
1.7. Understanding the relationship between auditing and redo
1.8. Understanding supplemental logging internals
(www.ludatou.com)

2. On the undo mechanism

2.1. Understanding the purpose of undo
2.2. Understanding the relationship between transactions and undo
2.3. Understanding the XID
2.4. Understanding the UBA
2.5. Understanding how the undo chain is implemented
2.6. Understanding delayed block cleanout
2.7. Handling ORA-01555
(www.ludatou.com)

 

3. On the checkpoint mechanism

3.1.

4. On the control file

4.1 Understanding the purpose of the control file
4.2 Understanding bootstrap

5. On recovery principles and special recovery techniques

5.1. Concepts: instance recovery, media recovery, crash recovery
Understanding the recovery implementation in depth: page fix, write-ahead logging, checkpoint
(www.ludatou.com)
5.1.1 A case of locating the root cause of a startup failure
5.2. Handling ORA-00600 [2662]
5.3. An ORA-00600 [4097] case
5.4. Handling ORA-00600 [4193]
---- handled with BBED
5.5. Recovering from a corrupted datafile header
5.6. Recovering from a corrupted control file
5.7. Recovering from a corrupted SYSTEM tablespace

 

 

Monitoring and testing redo generation

Demonstrated with experiments.
I. In NOARCHIVELOG mode:
SQL> archive log list
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     2491
Current log sequence           2493

As SYS, create a view over the session-level redo size statistic (for convenient querying):
SQL> create or replace view redo_size
2 as
3 select value
4 from v$mystat, v$statname
5 where v$mystat.statistic# = v$statname.statistic#
6 and v$statname.name = 'redo size';

View created.

As SYS, create a public synonym:
SQL> create public synonym redo_size for redo_size;

Synonym created.

The following operations are done as SCOTT.
Create a test table:
SQL> create table test_redos as select * from dba_objects where 1=2;

Table created.

Check the current redo amount:
SQL> select * from redo_size;

VALUE
----------
736

Insert data and check the result:
SQL> insert into test_redos select * from dba_objects;

73104 rows created.

SQL> select * from redo_size;

VALUE
----------
8473536

SQL> insert /*+ append */ into test_redos select * from dba_objects;

73100 rows created.

SQL> select * from redo_size;

VALUE
----------
8504856

SQL> select (8473536-736) normal_insert, (8504856-8473536) append_insert from dual;

NORMAL_INSERT APPEND_INSERT
------------- -------------
      8472800         31320

The results above show that in NOARCHIVELOG mode, an APPEND insert generates far less redo.
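Whether APPEND can actually skip redo depends on the LOGGING attribute at three levels; the following hedged sketch checks all of them (object names taken from the test above):

```sql
-- table level
select logging from user_tables where table_name = 'TEST_REDOS';

-- tablespace level
select logging, force_logging
from   dba_tablespaces
where  tablespace_name = 'USERS';

-- database level
select log_mode, force_logging from v$database;
```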

II. In ARCHIVELOG mode (with neither the database nor the tablespace set to FORCE LOGGING; no force logging is the default):
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /archive1
Oldest online log sequence 114
Next log sequence to archive 116
Current log sequence 116

Create the test table as in the NOARCHIVELOG case.
(1): With the table in LOGGING mode
SQL> create table test_redos as select * from dba_objects where 1=2;

Table created.

SQL> select * from redo_size;

VALUE
----------
26812

SQL> insert into test_redos select * from dba_objects;

71971 rows created.

SQL> select * from redo_size;

VALUE
----------
8490044

SQL> insert /*+ append */ into test_redos select * from dba_objects;

71971 rows created.

SQL> select * from redo_size;

VALUE
----------
17001396

SQL> select (8490044-26812) normal_insert, (17001396-8490044) append_insert from dual;

NORMAL_INSERT APPEND_INSERT
------------- -------------
      8463232       8511352

As you can see, in ARCHIVELOG mode with the table in LOGGING mode (the default), an APPEND insert does not reduce the amount of redo.
(2): With the table in NOLOGGING mode
Set the table to NOLOGGING:
SQL> alter table test_redos nologging;

Table altered.
Continue the test:
SQL> select * from redo_size;

VALUE
----------
8397184

SQL> insert into test_redos select * from dba_objects;

71971 rows created.

SQL> select * from redo_size;

VALUE
----------
16801072

SQL> insert /*+ append */ into test_redos select * from dba_objects;

71971 rows created.

SQL> select * from redo_size;

VALUE
----------
16836516

SQL> select (16801072-8397184) normal_insert, (16836516-16801072) append_insert from dual;

NORMAL_INSERT APPEND_INSERT
------------- -------------
      8403888         35444

As you can see, with the table in NOLOGGING mode, APPEND drastically reduces the amount of redo generated.
III. In ARCHIVELOG mode with FORCE LOGGING:
Change the SCOTT user's default tablespace to FORCE LOGGING mode:
SQL> select username,default_tablespace from dba_users where username='SCOTT';

USERNAME                       DEFAULT_TABLESPACE
------------------------------ ------------------------------
SCOTT                          USERS
-- To set force logging at the database level, use: alter database force logging;
SQL> alter tablespace users force logging;

Tablespace altered.
Continue the test:
SQL> select * from redo_size;

VALUE
----------
25488368

SQL> insert into test_redos select * from dba_objects;

72010 rows created.

SQL> select * from redo_size;

VALUE
----------
33973556

SQL> insert /*+ append */ into test_redos select * from dba_objects;

72010 rows created.

SQL> select * from redo_size;

VALUE
----------
42492396

SQL> select (33973556-25488368) normal_insert, (42492396-33973556) append_insert from dual;

NORMAL_INSERT APPEND_INSERT
------------- -------------
      8485188       8518840

As you can see, with the tablespace in FORCE LOGGING mode, APPEND cannot reduce the amount of redo.

The impact of _disable_logging on the database

Let's test the impact the parameter _disable_logging has on the database. Since it is a hidden parameter, its description has to be fetched as follows:

SQL> select ksppinm,ksppdesc from x$ksppi where ksppinm like '%logging';

KSPPINM              KSPPDESC
------------------------------------
_disable_logging     Disable logging
 

Set it to true, which enables the no-logging mode:

SQL> alter system set "_disable_logging"=true scope=both;
System altered.
 

Create a table and run some transactions against it to generate a good amount of redo:

SQL> create table mm tablespace marvener as select * from dba_objects;
Table created.

SQL> insert into mm  select * from dba_objects;
45167 rows created.

SQL> /
 45167 rows created.

Now simulate a power failure by closing the database with shutdown abort:

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 1610612736 bytes
Fixed Size                  2084296 bytes
Variable Size             385876536 bytes
Database Buffers         1207959552 bytes
Redo Buffers               14692352 bytes
Database mounted.
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 81435 change 856029 time 01/30/2012
15:50:39
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/marven/redo01.log'
 

As shown above, the database cannot open normally and reports a corrupt redo log block header; the alert log is full of errors.
Even when the database is opened with RESETLOGS:

alter system set "_allow_resetlogs_corruption"=true scope=spfile;
shutdown abort
startup
 

the database still raises the following errors and forcibly terminates the instance:

SMON: enabling cache recovery
Mon Jan 30 16:15:41 2012
Errors in file /u01/app/oracle/admin/marven/udump/marven_ora_2900.trc:
ORA-00600: internal error code, arguments: [2662], [0], [855728], [0], [855937], [8388649], [], []
Mon Jan 30 16:15:42 2012
Errors in file /u01/app/oracle/admin/marven/udump/marven_ora_2900.trc:
ORA-00600: internal error code, arguments: [2662], [0], [855728], [0], [855937], [8388649], [], []
Mon Jan 30 16:15:42 2012
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 2900
ORA-1092 signalled during: alter database open resetlogs...
Mon Jan 30 16:16:34 2012
  

However, this kind of problem can be worked around by advancing Oracle's SCN:

There are several common ways to advance the SCN; two of them are:
1. Via an immediate trace event (with the database open):
alter session set events 'IMMEDIATE trace name ADJUST_SCN level x';
2. Via event 10015 (when the database cannot open, in mount state):
alter session set events '10015 trace name adjust_scn level x';
Note: the 10015 level follows a calculation formula; see the official notes on ADJUST_SCN for details.

alter session set events '10015 trace name adjust_scn level 10';

SQL> alter database open;

Database altered.