Skip to content

Oracle 内存配置不合理导致数据库异常关闭分析一例

概述

 

某客户出现应用程序无法连接,前台业务无法办理等情况,登入后台发现数据库异常关闭

问题及相关日志分析

1. 数据库告警日志分析分析

通过对数据库告警日志查看,显示数据库1节点和2节点实例分别在14:33:59和14:40:41相继关闭。

1节点:

Thu Jan 28 14:40:41 2022

LMS1 (ospid: 80131) received an instance eviction notification from instance 2 [2]

Thu Jan 28 14:40:42 2022

Instance Critical Process (pid: 14, ospid: 80113, LMON) died unexpectedly

PMON (ospid: 80061): terminating the instance due to error 481

Thu Jan 28 14:40:42 2022

System state dump requested by (instance=1, osid=80061 (PMON)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_diag_22810_20220128144042.trc

Thu Jan 28 14:40:44 2022

License high water mark = 2257

Thu Jan 28 14:40:47 2022

Instance terminated by PMON, pid = 80061

Thu Jan 28 14:40:48 2022

USER (ospid: 12966): terminating the instance

 

2节点:

Thu Jan 28 14:33:59 2022

Instance Critical Process (pid: 16, ospid: 91944, LMD1) died unexpectedly

PMON (ospid: 91885): terminating the instance due to error 482

Thu Jan 28 14:33:59 2022

ORA-1092 : opitsk aborting process

Thu Jan 28 14:33:59 2022

System state dump requested by (instance=2, osid=91885 (PMON)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_diag_96171_20220128143359.trc

Thu Jan 28 14:34:01 2022

License high water mark = 1949

Thu Jan 28 14:34:04 2022

Instance terminated by PMON, pid = 91885

查看更多日志,发现在关闭之前,出现了较多的ORA-04031错误。

1节点:

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_ora_8372.trc  (incident=656535) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^2051″,”kglseshtTable”)

Thu Jan 28 14:38:23 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_ora_8392.trc  (incident=656543) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^2147″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:38:23 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p033_97352.trc  (incident=652943) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^3061″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:38:23 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p02w_90573.trc  (incident=651431) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^3538″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p033_97352.trc  (incident=652944) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate  bytes of shared memory (“”,””,””,””)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p02w_90573.trc  (incident=651432) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate  bytes of shared memory (“”,””,””,””)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=668994) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=668995) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=669000) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_lmon_80113.trc  (incident=669001) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:38:31 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb1/trace/cdb1_p032_97348.trc  (incident=652071) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^1124″,”kglseshtTable”)

 

2节点:

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd0_91940.trc  (incident=521915) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=521926) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=521927) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=541065) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:31:40 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_ora_77682.trc  (incident=524416) (PDBNAME=YDHL):

ORA-04031: 脦脼路篓路脰脜盲 8504 脳脰陆脷碌脛鹿虏脧铆脛脷麓忙 (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lmd1_91944.trc  (incident=541066) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_rmv1_92030.trc  (incident=522082) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:31:48 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_lck1_91968.trc  (incident=521960) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 8504 bytes of shared memory (“shared pool”,”unknown object”,”sga heap(5,0)”,”ges big msg pool”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Thu Jan 28 14:31:49 2022

Errors in file /u01/app/oracle/diag/rdbms/cdb/cdb2/trace/cdb2_ora_90698.trc  (incident=524824) (PDBNAME=CDB$ROOT):

ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^2953″,”kglseshtTable”)

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

2. AWR报告信息分析

 

通过分析得知,数据库服务器物理内存256GB,SGA分配了40G,由于AMM,shared pool已经达到30GB,Buffer Cache只有9G,该数据库内存分配存在严重问题。

 

总结与后续处理建议

1. 问题总结

移动护理数据库内存参数配置存在较严重的问题,物理内存256G,SGA分配40G,shared pool、buffer cache均为自动配置,没有手工分配,因此,需要对内存参数进行调整,增加SGA参数值,手工固定shared pool和buffer cache等值。

3. 处理操作

由于部分内存参数为静态参数,因此修改参数就要停机。我方于2022年1月29日晚上22:00:00开始对移动护理进行内存参数优化的操作,为保证对业务的影响达到最小,此次采用轮流优化和重启1节点和2节点实例的方式。

检查操作系统大页配置:

1节点:

[oracle@node0 ~]$ cat /proc/meminfo |grep HugePage

HugePages_Total:   67200

HugePages_Free:     1378

HugePages_Rsvd:       93

HugePages_Surp:        0

2节点:

[oracle@node1 ~]$ cat /proc/meminfo |grep HugePage

HugePages_Total:   67200

HugePages_Free:     1378

HugePages_Rsvd:       93

HugePages_Surp:        0

 

大页在此之前已经有过预配置,约131G,此次计划配置内存参数为:

SGA 128GB

Shared_pool 20GB

Buffer_Cache 40GB

1节点:

alter system set sga_max_size=128G scope=spfile sid=’cdb1′;

alter system set sga_target=128G scope=spfile sid=’cdb1′;

alter system set shared_pool_size=20G scope=spfile sid=’cdb1′;

alter system set db_cache_size=40G scope=spfile sid=’cdb1′;

 

alter system switch logfile;

alter system checkpoint;

shutdown immediate;

startup;

 

alter pluggable database HEALTH open read write;

alter pluggable database YDHL open read write;

alter pluggable database MANDALA open read write;

alter pluggable database TUTEORCL open read write;

alter pluggable database DOCARE open read write;

alter pluggable database DOCARE5 open read write;

alter pluggable database BAGL open read write;

alter pluggable database EMPI open read write;

alter pluggable database LIS_OLD open read write;

alter pluggable database NCSYS open read write;

alter pluggable database ICU open read write;

alter pluggable database ORCL open read write;

 

2节点:

1节点:

alter system set sga_max_size=128G scope=spfile sid=’cdb2′;

alter system set sga_target=128G scope=spfile sid=’cdb2′;

alter system set shared_pool_size=20G scope=spfile sid=’cdb2′;

alter system set db_cache_size=40G scope=spfile sid=’cdb2′;

 

alter system switch logfile;

alter system checkpoint;

shutdown immediate;

startup;

 

alter pluggable database HEALTH open read write;

alter pluggable database YDHL open read write;

alter pluggable database MANDALA open read write;

alter pluggable database TUTEORCL open read write;

alter pluggable database DOCARE open read write;

alter pluggable database DOCARE5 open read write;

alter pluggable database BAGL open read write;

alter pluggable database EMPI open read write;

alter pluggable database LIS_OLD open read write;

alter pluggable database NCSYS open read write;

alter pluggable database ICU open read write;

alter pluggable database ORCL open read write;

在启动过程中,观察数据库告警日志,确认大页已经生效:

**********************************************************************

Fri Jan 29 22:04:03 2022

Dump of system resources acquired for SHARED GLOBAL AREA (SGA)

 

Fri Jan 29 22:04:03 2022

Per process system memlock (soft) limit = 227G

Fri Jan 29 22:04:03 2022

Expected per process system memlock (soft) limit to lock

SHARED GLOBAL AREA (SGA) into memory: 128G

Fri Jan 29 22:04:03 2022

Available system pagesizes:

4K, 2048K

Fri Jan 29 22:04:03 2022

Supported system pagesize(s):

Fri Jan 29 22:04:03 2022

PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)

Fri Jan 29 22:04:03 2022

2048K            66823           65538           65538        NONE

Fri Jan 29 22:04:03 2022

Reason for not supporting certain system pagesizes:

Fri Jan 29 22:04:03 2022

4K – Large pagesizes only

Fri Jan 29 22:04:03 2022

**********************************************************************

为保证后续更好的故障定位和问题跟踪,此次内存参数优化调整过程中,我方还对AWR相关配置进行了优化,调整AWR信息保留至30天,采集频率调整至30分钟。

execute dbms_workload_repository.modify_snapshot_settings(interval=>30,retention=>43200);

 

PL/SQL procedure successfully completed.

 

SQL> select snap_interval,retention from dba_hist_wr_control;

 

SNAP_INTERVAL                       RETENTION

———————————– ———————————–

+00000 00:30:00.0                   +00030 00:00:00.0