Skip to content

All posts by Guang Cai Li - 13. page

最近在某出版客户升级数据库到 19c 时,源库 pfile / spfile 中设置的 event 被删除。

查询metalink匹配bug为 BUG 30193505 – SOURCE EVENTS REMOVED BY DBUA IN UPGRADING TO 19C.

 

解决办法如下:

从以下链接下载并应用 Patch 30193505:

https://updates.oracle.com/download/30193505.html

在打完 Patch 后,可以使用 DBUA 的 –keepEvents 参数来保留 event

 

参考文章:DBUA does not retain database events after Oracle 19c upgrade (Doc ID 2618457.1)

19c 数据库使用DBCA升级小版本时候遭遇bug未保留之前设置的 events 的解决办法

归档日志间隙(Archive Gap)

 

归档日志间隙是在 Standby 端一系列丢失的重做日志,导致日志应用服务无法继续运行。这通常发生在 Standby 端无法从 Primary Database 接收重做日志或重做日志在 Standby Database 上不可用时。常见原因有:

  • 网络连接断开或者日志传输服务停止
  • Standby Database 不可用
  • 日志传输服务的配置错误
  • Standby 端的 IO 问题
  • 归档日志在应用到 Standby 前被手工删除
  • Primary 和 Standby 之间的网络带宽不足

一旦在 Standby Database 上存在归档间隙,Log Apply Services 就会卡住,直到日志间隙(Gap)被解决,例如。丢失的 Redo 被重新获取并且在 Standby 端可用。然后,日志应用服务可以选中它并继续处理。

 

解决日志间隙的方式

 

有4种方案来解决 Standby Database 上的日志间隙。这些方案在下面讨论。

自动日志间隙解决方案

自动日志间隙解决方案是由日志传输服务自动进行的。它会把当前正在传输的日志和最近收到的日志进行对比,如果有不匹配的情况出现,Standby 端的 RFS 进程就会检测到并自动发送 ARCH-RFS 心跳 Ping 请求来要求发送丢失的日志。这种类型的日志间隙解决方法使用了主数据库上的 log_archive_dest_n 中定义的 Service。另外 ARCH-RFS 心跳 Ping 可以通过对当前的日子序列号进行查询来检测日志间隙。如果存在日志间隙则仍通过ARCH-RFS 心跳 Ping 请求来解决它。在问题得到解决后,会通知日志传输进程(ARCH 或者 LGWR)。对于自动日志间隙解决方案,不需要额外的设置或者监控。

 

FAL (Fetch Archive Log)日志间隙解决方案

当一个归档日志在 Standby 数据库上被收到或者归档,它就会被注册到 Standby 的控制文件中。(您可以在物理 Standby 数据库上查询 v$archived_log 或在逻辑 Standby 数据库上查询 dba_logstdby_log 来获取这些注册信息)。如果这个文件因为某些原因丢失或者损坏(比如,它被意外删除),FAL 就会被调用来解决日志间隙问题。因为这些缺失的日志文件通常由 Standby 数据库上的日志应用服务检测到。它独立于日志传输服务,并且没有和主库之间的直接链接。要使用 FAL,必须在 Standby 数据库设置一个或者两个(11.2.0 之前的版本)初始化参数:

FAL_SERVER:设置 Oracle Net Service Name(TNS-Alias 或者 Connect Descriptor)指向用来获取丢失的归档日志的数据库。它可以是一个 Data Guard 环境的主库,或者是另一个备库,ArchiveLog Repository- 或者 Far Sync Standby (> Oracle 12.1.0) Database。可以指定多个 Service Names(逗号分隔)。FAL 会顺序的尝试这些数据库来解决日志间隙问题。

FAL_CLIENT (< Oracle 11.2.0):在 FAL_SERVER 数据库上设置 Oracle Net Service Name(TNS-Alias 或 Connect Descriptor)指向接收 REDO 的 Standby 数据库(比如,它是 FAL_SERVER 数据库需要发送 REDO 到的目标数据库)。确保这个 TNS-Alias 存在于 FAL_SERVER 数据库的 TNSNAMES.ORA 文件中。从 Oracle 11.2.0 开始,这个参数不再需要。但是需要确保 FAL_SERVER 数据库存在一个log_archive_dest_n 指向要解决日志间隙问题的 Standby 数据库。

当 Log ApplyServices 检测到日志间隙问题,它会发送一个 FAL 请求把 FAL_CLIENT 信息(Version > 11.1.0 则为 db_unique_name)给 FAL_SERVER。一个 FAL_SERVER 数据库上的 ARCH-Process 会尝试获取那个日志并发送回 FAL_CLIENT(或者 db_unique_name 对应的 Destination)。如果第一个 FAL_SERVER 无法解决日志间隙,会尝试列表中的下一个 FAL_SERVER。如果所有的 FAL_SERVERs 都无法解决,那么 FAL 请求会失败,并且一个对应的错误信息会写入对应 Standby 数据库的 ALERT.LOG。

为了成功解决日志间隙问题,需要的归档日志应当存在于 FAL_SERVER 数据库(存在于磁盘并且对应的信息同时存在于控制文件中)。

FAL 从 Oracle 9.2.0 开始可用于物理 Standby 数据库,从 Oracle 10.1.0 开始可用于逻辑 Standby 数据库。

 

手工解决日志间隙

如果日志间隙问题不能被上面提到方式解决,那么可以尝试手工解决。

可以通过查询物理 Standby 数据库的 $archive_gap 或者逻辑 Standby 数据库的 dba_logstdby_log 来确定当前的归档日志间隙,例如:

 

物理 Standby 数据库

SQL> select * from v$archive_gap;

逻辑 Standby 数据库

SQL> select thread#, sequence# from dba_logstdby_log l where next_change# not in

(select first_change# from dba_logstdby_log where l.thread#=thread#)

order by thread#, sequence#;

 

现在复制缺失的特定编号的 redo 日志到 Standby 数据库的对应位置。如缺失的日志尚未注册到 Standby 数据库,需要先注册它们才能让 Log Apply Services 处理这些日志文件。可以使用下面的命令注册:

物理 Standby:

SQL> alter database register logfile ‘<File-Specification>’;

逻辑 Standby:

SQL> alter database register logical logfile ‘<File-Specification>’;

 

在它们被注册后,Log Apply Services 就可以处理了。

 

使用增量备份前滚(仅适用于物理 Standby)

如果日志间隙无法被上面提到的方式解决,间隙太大需要太久时间才能解决或者丢失的日志无法被找到,您仍然可以通过使用 SCN 增量备份来前滚物理 Standby 数据库。这个功能从 Oracle 10.2.0 开始可以使用。这个功能通过记录最后应用到 Standby 数据库的SCN,然后使用 RMAN 以及当前控制文件的备份来对主库创建一个从那个 SCN 开始的增量备份。

之后首先用增量备份中的控制文件替换 Standby 的控制文件,之后应用增量备份到 Standby 数据库。这是一个把 Standby 数据库同步到最新的主库的状态的最快最简单的方式。因为采取的步骤各个版本都不同

Oracle dataguard 日志缺失的解决办法

问题时间点,数据库主要为enq: TM – contention等待事件;

节点1:

节点2:

TOP SQL语句的分析

数据库有一条SQL执行时间长,经沟通确认为OGG中添加表附加日志的操作产生的:BEGIN DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(table_name => ‘”CBS”.”NEXF_APPROPRIATE”‘, supplemental_logging => ‘none’); END;

 ASH报告中异常发生的时间段

从ASH性能报告中,可以看到enq: TM – contention等待事件发生在17:20-17:45分左右:

 等待事件对应的对象信息

查看ASH报告,可以发现异常的TM等待对应的表是:CBS.NEXF_APPROPRIATE

对应的SQL语句是:INSERT INTO tran_jrnl VALUES(:id, :fiscal_date, :request_id, :respond_id, :tran_date, :tran_code, :channel_id, :teller_id, :branch_id, :resp_code, :resp_mesg, :status, :original_jrnl_id, :actual_teller_id, :service_name)

CBS.NEXF_APPROPRIATE表与 tran_jrnl 表存在主外键关联关系。

阻塞问题的进一步分析

 enq: TM – contention阻塞原理的分析

通过前面分析,可以发现从ASH报告中找出的TOP SQL对象之间存在主外键约束关系,具体为:

  • CBS用户的NEXF_APPROPRIATE表的REG_JRNL列,使用外键约束,对应的是tran_jrnl表的主键ID列。
  • 外键约束列上未创建索引。

DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION与TM等待的关系

OGG执行的操作,在数据库中对应的SQL是BEGIN DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(table_name => ‘”CBS”.”NEXF_APPROPRIATE”‘, supplemental_logging => ‘none’); END;

ORACLE官方文档中(High ‘enq: TM – contention’ Wait Event Causing Session Hangs in an Oracle Streams Environment (Doc ID 740728.1)),对于OGG附加日志时执行的语句:BEGIN DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(table_name => ‘”CBS”.”NEXF_APPROPRIATE”‘, supplemental_logging => ‘none’); END; 的说明,此语句语句本身会引起TM锁;

总结

通过对数据库性能数据的分析,可以发现问题时段数据库出现严重的enq: TM – contention锁等待;结合ASH性能报告中可以对应找出出现异常等待事件的时对象为CBS.NEXF_APPROPRIATE表与 tran_jrnl 表,这两张表存在主外键关联关系;

因此结合ORACLE官方文档中(High ‘enq: TM – contention’ Wait Event Causing Session Hangs in an Oracle Streams Environment (Doc ID 740728.1)),对于OGG附加日志时执行的语句:BEGIN DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(table_name => ‘”CBS”.”NEXF_APPROPRIATE”‘, supplemental_logging => ‘none’); END; 的说明,此语句语句本身会引起TM锁;同时NEXF_APPROPRIATE表与 tran_jrnl 表存在主外键关联关系且外建列上无索引,tran_jrnl 表做为业务流水表会存在较多的INSERT操作,加剧了问题发生时的TM锁争用。

某行业务Oracle数据库卡顿简要分析

概述

 

某客户出现应用程序无法连接,前台业务无法办理等情况,登入后台发现数据库异常关闭

问题及相关日志分析

1. 数据库告警日志分析分析

通过对数据库告警日志查看,显示数据库1节点和2节点实例分别在14:33:59和14:40:41相继关闭。

1节点:

Thu Jan 28 14:40:41 2022

LMS1 (ospid: 80131) received an instance eviction notification from instance 2 [2]

Thu Jan 28 14:40:42 2022

Instance Critical Process (pid: 14, ospid: 80113, LMON) died unexpectedly

PMON (ospid: 80061): terminating the instance due to error 481

Thu Jan 28 14:40:42 2022

System state dump requested by (instance=1, osid=80061 (PMON)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_diag_22810_20220128144042.trc

Thu Jan 28 14:40:44 2022

License high water mark = 2257

Thu Jan 28 14:40:47 2022

Instance terminated by PMON, pid = 80061

Thu Jan 28 14:40:48 2022

USER (ospid: 12966): terminating the instance

 

2节点:

Thu Jan 28 14:33:59 2022

Instance Critical Process (pid: 16, ospid: 91944, LMD1) died unexpectedly

PMON (ospid: 91885): terminating the instance due to error 482

Thu Jan 28 14:33:59 2022

ORA-1092 : opitsk aborting process

Thu Jan 28 14:33:59 2022

System state dump requested by (instance=2, osid=91885 (PMON)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_diag_96171_20220128143359.trc

Thu Jan 28 14:34:01 2022

License high water mark = 1949

Thu Jan 28 14:34:04 2022

Instance terminated by PMON, pid = 91885

查看更多日志,发现在关闭之前,出现了较多的ORA-04031错误。

1节点:

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_ora_8372.trc  (incident=656535) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^2051″,”kglseshtTable”)

Thu Jan 28 14:38:23 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_ora_8392.trc  (incident=656543) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^2147″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:38:23 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p033_97352.trc  (incident=652943) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^3061″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:38:23 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p02w_90573.trc  (incident=651431) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^3538″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p033_97352.trc  (incident=652944) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate  bytes of shared memory (“”,””,””,””)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p02w_90573.trc  (incident=651432) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate  bytes of shared memory (“”,””,””,””)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=668994) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=668995) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=669000) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=669001) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:38:31 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p032_97348.trc  (incident=652071) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^1124″,”kglseshtTable”)

 

2节点:

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd0_91940.trc  (incident=521915) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=521926) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=521927) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=541065) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:31:40 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_ora_77682.trc  (incident=524416) (PDBNAME=YDHL):

ORA-04031: 脦脼路篓路脰脜盲 8504 脳脰陆脷碌脛鹿虏脧铆脛脷麓忙 (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=541066) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_rmv1_92030.trc  (incident=522082) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:31:48 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lck1_91968.trc  (incident=521960) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:31:49 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_ora_90698.trc  (incident=524824) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^2953″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

2. AWR报告信息分析

 

通过分析得知,数据库服务器物理内存256GB,SGA分配了40G,由于AMM,shared pool已经达到30GB,Buffer Cache只有9G,该数据库内存分配存在严重问题。

 

总结与后续处理建议

1. 问题总结

移动护理数据库内存参数配置存在较严重的问题,物理内存256G,SGA分配40G,shared pool、buffer cache均为自动配置,没有手工分配,因此,需要对内存参数进行调整,增加SGA参数值,手工固定shared pool和buffer cache等值。

3. 处理操作

由于部分内存参数为静态参数,因此修改参数就要停机。我方于2022年1月29日晚上22:00:00开始对移动护理进行内存参数优化的操作,为保证对业务的影响达到最小,此次采用轮流优化和重启1节点和2节点实例的方式。

检查操作系统大页配置:

1节点:

[oracle@node0 ~]$ cat /proc/meminfo |grep HugePage

HugePages_Total:   67200

HugePages_Free:     1378

HugePages_Rsvd:       93

HugePages_Surp:        0

2节点:

[oracle@node1 ~]$ cat /proc/meminfo |grep HugePage

HugePages_Total:   67200

HugePages_Free:     1378

HugePages_Rsvd:       93

HugePages_Surp:        0

 

大页在此之前已经有过预配置,约131G,此次计划配置内存参数为:

SGA 128GB

Shared_pool 20GB

Buffer_Cache 40GB

1节点:

alter system set sga_max_size=128G scope=spfile sid=’cdb1′;

alter system set sga_target=128G scope=spfile sid=’cdb1′;

alter system set shared_pool_size=20G scope=spfile sid=’cdb1′;

alter system set db_cache_size=40G scope=spfile sid=’cdb1′;

 

alter system switch logfile;

alter system checkpoint;

shutdown immediate;

startup;

 

alter pluggable database HEALTH open read write;

alter pluggable database YDHL open read write;

alter pluggable database MANDALA open read write;

alter pluggable database TUTEORCL open read write;

alter pluggable database DOCARE open read write;

alter pluggable database DOCARE5 open read write;

alter pluggable database BAGL open read write;

alter pluggable database EMPI open read write;

alter pluggable database LIS_OLD open read write;

alter pluggable database NCSYS open read write;

alter pluggable database ICU open read write;

alter pluggable database ORCL open read write;

 

2节点:

1节点:

alter system set sga_max_size=128G scope=spfile sid=’cdb2′;

alter system set sga_target=128G scope=spfile sid=’cdb2′;

alter system set shared_pool_size=20G scope=spfile sid=’cdb2′;

alter system set db_cache_size=40G scope=spfile sid=’cdb2′;

 

alter system switch logfile;

alter system checkpoint;

shutdown immediate;

startup;

 

alter pluggable database HEALTH open read write;

alter pluggable database YDHL open read write;

alter pluggable database MANDALA open read write;

alter pluggable database TUTEORCL open read write;

alter pluggable database DOCARE open read write;

alter pluggable database DOCARE5 open read write;

alter pluggable database BAGL open read write;

alter pluggable database EMPI open read write;

alter pluggable database LIS_OLD open read write;

alter pluggable database NCSYS open read write;

alter pluggable database ICU open read write;

alter pluggable database ORCL open read write;

在启动过程中,观察数据库告警日志,确认大页已经生效:

**********************************************************************

Fri Jan 29 22:04:03 2022

Dump of system resources acquired for SHARED GLOBAL AREA (SGA)

 

Fri Jan 29 22:04:03 2022

Per process system memlock (soft) limit = 227G

Fri Jan 29 22:04:03 2022

Expected per process system memlock (soft) limit to lock

SHARED GLOBAL AREA (SGA) into memory: 128G

Fri Jan 29 22:04:03 2022

Available system pagesizes:

4K, 2048K

Fri Jan 29 22:04:03 2022

Supported system pagesize(s):

Fri Jan 29 22:04:03 2022

PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)

Fri Jan 29 22:04:03 2022

2048K            66823           65538           65538        NONE

Fri Jan 29 22:04:03 2022

Reason for not supporting certain system pagesizes:

Fri Jan 29 22:04:03 2022

4K – Large pagesizes only

Fri Jan 29 22:04:03 2022

**********************************************************************

为保证后续更好的故障定位和问题跟踪,此次内存参数优化调整过程中,我方还对AWR相关配置进行了优化,调整AWR信息保留至30天,采集频率调整至30分钟。

execute dbms_workload_repository.modify_snapshot_settings(interval=>30,retention=>43200);

 

PL/SQL procedure successfully completed.

 

SQL> select snap_interval,retention from dba_hist_wr_control;

 

SNAP_INTERVAL                       RETENTION

———————————– ———————————–

+00000 00:30:00.0                   +00030 00:00:00.0

Oracle 内存配置不合理导致数据库异常关闭分析一例

概述

 

某客户出现应用程序响应慢,前台业务办理困难等情况。通过远程的方式接入数据库进行故障排除

问题及相关日志分析

1. 数据库等待事件分析

进入数据库中查询得到数据库中存在大量的db file sequential read、gc buffer busy acquire、read by other session等待事件和资源争用

EVENT

—————————————————————-

db file sequential read

gc buffer busy acquire

read by other session

2. AWR报告信息分析

由上图得知,在10:00~11:00,节点2总体负载较高。

 

 

 

数据库中出现了较高的IO类等等待事件,在TOP SQL中,SQL ID为a5ab3mmfmppj6的SQL语句占用了较多的资源,

通过对该SQL的有史以来执行计划查看,该SQL共有3个执行计划:

其中1785204928执行计划为最优执行计划。

 

总结与后续处理建议

1. 问题总结

该SQL语句最近发生了执行计划改变,改变的执行计划效率较差,导致SQL语句执行时间、占用资源等都较高。

3. 处理操作

运行coe_xfr_sql_profile.SQL,将SQL ID为a5ab3mmfmppj6的SQL语句绑定1785204928执行计划

4. 后续建议

通过此次的故障,建议对数据库进行深度的全方位优化,涉及内容包括但不限于如下:

序号 内容 描述
1 业务梳理 对各业务的运行时间、持续时长等做相关梳理
2 内核参数优化 对数据库所在服务器内核参数进行优化
3 数据库参数优化 对数据库PFILE/SPFILE/PROFILE等参数进行优化
4 数据库segment优化 对数据库中表、索引等对象进行优化,如碎片整理、HWM处理等。
5 数据库segment结构优化 对数据库中表、索引等结构进行优化,如表结构优化(heap table、partition table)、索引优化(global、local)等
Oracle SQL 执行计划变更导致性能慢快速分析一例