Skip to content

Oracle

近期遭遇了多次客户数据库无法关闭的问题了,数据库无法关闭,一般是关闭时有事务需要回滚(可能是客户端或者JOB等)或者是关闭时使用错误关闭参数(如忘加参数默认是NORMAL)或者关闭时有其它原因。

因此分享一下关闭数据库步骤的脚本,按照此脚本进行关闭数据库,基本上可以回避无法关闭问题;即使遇到无法关闭,使用kill关闭进程来关闭实例,也很少会造成数据损坏或数据丢失。

近期遇到的几次关闭问题及相应MOS链接如下:

http://blog.csdn.net/haibusuanyun/article/details/50285527
关闭数据库遇到SHUTDOWN: Active processes prevent shutdown operation
http://blog.csdn.net/haibusuanyun/article/details/47073459
关闭数据库时SHUTDOWN: waiting for active calls to complete.处理
供参考MOS文档:
Shutdown Immediate Hangs / Active Processes Prevent Shutdown (文档 ID 416658.1)
Alert Log: Shutdown Waiting for Active Calls to Complete (文档 ID 1039389.6)
What Is The Fastest Way To Cleanly Shutdown An Oracle Database? (文档 ID 386408.1)
—————————————–参考步骤如下:
一、应用停掉

二、停监听器
–对应的LOCAL=NO的进程确认是否需要KILL。
ps -ef|grep LOCAL=NO|wc -l
ps -ef|grep LOCAL=NO
lsnrctl stop

三、查看有没有JOB–如有需要确认是否可停掉
ps -ef|grep ora_j

四、查看有没有事务还未提交的–如有事务未提交确认是否需要等待事务完成
sqlplus ‘/ as sysdba’
select status,instance_name,to_char(startup_time,’yyyy/mm/dd hh24:mi:ss’) starttime from gv$instance;
select open_mode,name from v$database;
是否有大的查询在进行
select count(*) from v$transaction;
select count(*) from v$session_longops where time_remaining>0;
是否有大的事务在进行
select sum(used_ublk) from v$transaction; —如果有大量返回,参考本文最后
select START_TIME,STATUS from v$transaction;
是否有大事务正在回滚/需要在数据库OPEN时进行实例恢复
select * from v$fast_start_transactions;
select * from v$fast_start_servers;

五、手工做检查点与手动多次切换log
–确认以上问题解决完后继续:
先做一次检查点,目的是把脏数据写入数据文件,避免造成数据丢失,减小关闭启动不起来这种现象发生
alter system checkpoint;
切换一次日志:
SQL> alter system switch logfile;

再做一次检查点
alter system checkpoint;
再切换一次日志:
SQL> alter system switch logfile;

六、关机-此时关机速度会正常,有效减少HANG住之类问题。

建议执行关闭数据库命令时,重新打开一个会话窗口,重新登陆数据库进行关闭:
SQL> shutdown immediate;

关闭ORACLE数据库步骤参考

近期遇到一客户数据库关闭时遇到无法关闭的情况。
经检查是关闭前使用了sqlplus窗口直接使用host命名回到SHELL操作界面,然后又sqlplus / as sysdba登陆了数据库(具体切换了三、四次);
在之后的发出关闭命令shutdown immediate;后,命令一直HANG住;此时查看ALERT日志,有“SHUTDOWN: Active processes prevent shutdown operation”提示;
一开始以为是有活动会话没有关闭,查询ps -ef|grep LOCAL 和ps -ef|grep ora_j的进程,均无相应进程;也未配置EM–DB CONSOLE之类。

刚开始以为是遇到有活动会话导致的,数据库版本是AIX6.1+11.2.0.1单实例;
因此根据以往经验会提示出哪个进程导致SHUTDOWN无法完成,根据提示进程号进行KILL进程,就会正常关闭。
——————-
==》根据观察,在ORACLE10G及以上版本,会是如下提示:
Active call for process 12345 user ‘oracle’ program ‘oracle@abcd’
SHUTDOWN: waiting for active calls to complete.
也就是指出哪个进程引起的等待,此时kill -9 12345 进程即可。
==》但是在ORACLE 8I版本中,是不会提示哪个进程ACTIVE引起关闭进程HANG的。

在等待的过程中,耽误了10多分钟。后根据alert提示查询,结合之前工程师的操作;
问题可能是sqlplus时host到os下操作,后又sqlplus / as sysdba登陆,如此反复多次。
——-因为在发出shutdown immediate;的命令行窗口命令仍在执行状态,事实上此时可以使用CTRL+C来取消关闭命令。
本次就使用了CTRL+C来取消关闭命令,后一路exit退出,并重新使用TELNET登陆进行关闭操作,正常完成。
############
这里也是一个重要提醒,在发出shutdown immediate;的命令行窗口,最好新建的会话来执行,避免此类问题。
关库前最好将ps -ef|grep LOCAL 和ps -ef|grep ora_j的进程,是否有大事务在进行及大事务回滚在进行等问题进行确认;
确认之后切换日志、执行检查点,再关闭数据库,此时关库风险会小很多,即使遇到无法关闭去KILL进程,也相应更安全。

#################################故障时
1.无法关闭时的alert日志
Sun Dec 13 00:25:15 2015
Shutting down instance (immediate)
Stopping background process SMCO
Shutting down instance: further logons disabled
Sun Dec 13 00:25:17 2015
Stopping background process CJQ0
Stopping background process QMNC
Stopping background process MMNL
Stopping background process MMON
License high water mark = 78
Stopping Job queue slave processes, flags = 7
Job queue slave processes stopped
All dispatchers and shared servers shutdown
Sun Dec 13 00:30:21 2015
SHUTDOWN: Active processes prevent shutdown operation
Sun Dec 13 00:35:23 2015
SHUTDOWN: Active processes prevent shutdown operation
Sun Dec 13 00:40:24 2015
SHUTDOWN: Active processes prevent shutdown operation
Sun Dec 13 00:45:25 2015
SHUTDOWN: Active processes prevent shutdown operation
Sun Dec 13 00:48:31 2015

Instance shutdown cancelled

 

关闭数据库遇到SHUTDOWN: Active processes prevent shutdown operation

先写结果:

1.验证ifcfg-eth2与ifcfg-eth2bak这种格式的,service network restart 时会读取这种配置文件
如果改为mv ifcfg-eth2bak ifcfg-eth3.bak,则不会被读取。
2.验证了文件名的使用顺序,两个配置文件对应同一网卡时,以最早使用的配置文件的为准。

##########################################测试过程:
1.验证ifcfg-eth2与ifcfg-eth2bak这种格式的,service network restart 时会读取这种配置文件
如果改为mv ifcfg-eth2bak ifcfg-eth3.bak,则不会被读取。
[root@bys1 network-scripts]# mv ifcfg-eth3 ifcfg-eth3bak
[root@bys1 network-scripts]# mv ifcfg-eth2bak ifcfg-eth2.bak
[root@bys1 network-scripts]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Bringing up interface eth3bak:
Determining IP information for eth3…^C
—改为ifcfg-eth3.bak,不会使用
[root@bys1 network-scripts]# mv ifcfg-eth3bak ifcfg-eth3.bak
[root@bys1 network-scripts]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
—-改回去 ifcfg-eth3bak验证会使用
[root@bys1 network-scripts]# mv ifcfg-eth3.bak ifcfg-eth3bak
[root@bys1 network-scripts]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down interface eth3bak: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Bringing up interface eth3bak:
Determining IP information for eth3…^C

####################################
2.测试同一网卡,不同配置文件名时的使用顺序及对应IP生效情况
验证了文件名的使用顺序,两个配置文件对应同一网卡时,以最早使用的配置文件的为准。

配置文件情况
[root@bys1 network-scripts]# cat ifcfg-eth1
# Intel Corporation 82540EM Gigabit Ethernet Controller
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
HWADDR=08:00:27:A0:5F:E7
NETMASK=255.255.255.0
IPADDR=192.168.10.1
[root@bys1 network-scripts]# cat ifcfg-eth1bak
# Intel Corporation 82540EM Gigabit Ethernet Controller
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
HWADDR=08:00:27:A0:5F:E7
NETMASK=255.255.255.0
IPADDR=192.168.11.1

—实验1;将IP192.168.10.1对应的配置文件先读取,后续eth1bak配置文件不会影响。
[root@bys1 network-scripts]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down interface eth1: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Bringing up interface eth1: [ OK ]
Bringing up interface eth1bak: [ OK ]
[root@bys1 network-scripts]# ifconfig

eth1 Link encap:Ethernet HWaddr 08:00:27:A0:5F:E7
inet addr:192.168.10.1 Bcast:192.168.10.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:104 errors:0 dropped:0 overruns:0 frame:0
TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15754 (15.3 KiB) TX bytes:5698 (5.5 KiB)

–实验2:将IP192.168.11.1对应的配置文件先读取,会使用此配置文件的信息。
[root@bys1 network-scripts]# mv ifcfg-eth1 ifcfg-eth1b
[root@bys1 network-scripts]# mv ifcfg-eth1bak ifcfg-eth1a
[root@bys1 network-scripts]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down interface eth1a: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Bringing up interface eth1a: [ OK ]
Bringing up interface eth1b: [ OK ]
[root@bys1 network-scripts]# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 08:00:27:A0:5F:E7
inet addr:192.168.11.1 Bcast:192.168.11.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:136 errors:0 dropped:0 overruns:0 frame:0
TX packets:62 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:19100 (18.6 KiB) TX bytes:9287 (9.0 KiB)

 

LINUX6.5 同一网卡多个配置文件时的使用顺序及格式要求

LINUX上安装11.2.0.4 RAC时,安装grid运行root.sh报错ORA-15020 discovered duplicate ASM disk排查。
[root@lnx67 rules.d]# ls -al /dev/asm*
brw-rw—- 1 grid asmadmin 250, 0 Jan 5 14:47 /dev/asm-diskb
brw-rw—- 1 grid asmadmin 250, 16 Jan 5 14:47 /dev/asm-diskc
brw-rw—- 1 grid asmadmin 250, 32 Jan 5 14:47 /dev/asm-diskd
修改UDEV规则,将ASM使用的盘名字改为:
[root@lnx67 Packages]# ls -al /dev/asm*
brw-rw—- 1 grid asmadmin 250, 16 Jan 6 08:24 /dev/asm-diskarch
brw-rw—- 1 grid asmadmin 250, 0 Jan 5 14:47 /dev/asm-diskb
brw-rw—- 1 grid asmadmin 250, 16 Jan 5 14:47 /dev/asm-diskc
brw-rw—- 1 grid asmadmin 250, 32 Jan 5 14:47 /dev/asm-diskd
brw-rw—- 1 grid asmadmin 250, 32 Jan 6 08:24 /dev/asm-diskdata
brw-rw—- 1 grid asmadmin 250, 0 Jan 6 08:45 /dev/asm-diskocr
此时原来的绑定规划生成的盘符没有消失,新的盘符产生了。
之后安装GRID运行root.sh时报错:
ORA-15018: diskgroup cannot be created
ORA-15020: discovered duplicate ASM disk “DGOCR_0000”

——-2016/2/29帮人查问题发现最新信息: 12cR1安装时同样问题,只报错ORA-15018:这一行。。。

原因就是使用ASM_DISKSTRING参数是/dev/asm*,此时比如同一磁盘B对应了/dev/asm-diskb及/dev/asm-diskocr。
从操作系统层面使用start_udev,旧的没产生,最终是重启解决盘符问题。
GRID安装出错后,新安装环境就使用了重新安装的方式。
————root.sh报错信息
[root@lnx67 Packages]# sh /u01/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /u01/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
Copying dbhome to /usr/local/bin …
Copying oraenv to /usr/local/bin …
Copying coraenv to /usr/local/bin …

Creating /etc/oratab file…
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization – successful
root wallet
root wallet cert
root cert export
peer wallet
profile reader wallet
pa wallet
peer wallet keys
pa wallet keys
peer cert request
pa cert request
peer cert
pa cert
peer root cert TP
profile reader root cert TP
pa root cert TP
peer pa cert TP
pa peer cert TP
profile reader pa cert TP
profile reader peer cert TP
peer user cert
pa user cert
Adding Clusterware entries to upstart
CRS-2672: Attempting to start ‘ora.mdnsd’ on ‘lnx67’
CRS-2676: Start of ‘ora.mdnsd’ on ‘lnx67’ succeeded
CRS-2672: Attempting to start ‘ora.gpnpd’ on ‘lnx67’
CRS-2676: Start of ‘ora.gpnpd’ on ‘lnx67’ succeeded
CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘lnx67’
CRS-2672: Attempting to start ‘ora.gipcd’ on ‘lnx67’
CRS-2676: Start of ‘ora.cssdmonitor’ on ‘lnx67’ succeeded
CRS-2676: Start of ‘ora.gipcd’ on ‘lnx67’ succeeded
CRS-2672: Attempting to start ‘ora.cssd’ on ‘lnx67’
CRS-2672: Attempting to start ‘ora.diskmon’ on ‘lnx67’
CRS-2676: Start of ‘ora.diskmon’ on ‘lnx67’ succeeded
CRS-2676: Start of ‘ora.cssd’ on ‘lnx67’ succeeded

Disk Group dgocr creation failed with the following message:
ORA-15018: diskgroup cannot be created
ORA-15020: discovered duplicate ASM disk “DGOCR_0000”

Configuration of ASM … failed
see asmca logs at /u01/app/oracle/cfgtoollogs/asmca for details
Did not succssfully configure and start ASM at /u01/11.2.0/grid/crs/install/crsconfig_lib.pm line 6912.
/u01/11.2.0/grid/perl/bin/perl -I/u01/11.2.0/grid/perl/lib -I/u01/11.2.0/grid/crs/install /u01/11.2.0/grid/crs/install/rootcrs.pl execution failed
You have new mail in /var/spool/mail/root

 

安装grid运行root.sh报错ORA-15020 discovered duplicate ASM disk排查

客户反映归档的位置不太多,再仔细看归档的名字也不对;检查下发现指定 归档目录类似
/arch/test 而不是/arch/test/
如下测试验证log_archive_dest_1与log_archive_format对生成的归档文件名的影响。
1.log_archive_dest_1指定的目录后面还有字段

1.1. 字段对应的无同名目录,生成归档文件名=log_archive_dest_1+log_archive_format

1.2 字段对应的有同名目录,生成归档文件在同名目录下,文件名为log_archive_format
2.log_archive_dest_1指定的目录不存在–报错
3.log_archive_dest_1指定的目录
归档文件名=log_archive_format

———————————–
1 log_archive_dest_1指定的目录后面还有字段

1.1. 字段对应的无同名目录,生成归档文件名=log_archive_dest_1+log_archive_format

归档文件名=log_archive_dest_1+log_archive_format
SQL> alter system set log_archive_dest_1=’LOCATION=/u01/oracle/base/oracle/oradata/arch/test’;
System altered.
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/oracle/base/oracle/oradata/arch/test
Oldest online log sequence 24
Next log sequence to archive 26
Current log sequence 26
SQL> show parameter log_archive_format
NAME TYPE VALUE
———————————— ———– ——————————
log_archive_format string %t_%s_%r.dbf
SQL> alter system switch logfile;
System altered.
——–生成的文件名
[oracle@bys1 arch]$ pwd
/u01/oracle/base/oracle/oradata/arch
[oracle@bys1 arch]$ ls
test1_26_900253622.dbf

1.2 字段对应的有同名目录,生成归档文件在同名目录下,文件名为log_archive_format

文件在同名目录下,文件名为log_archive_format

SQL> alter system switch logfile;
System altered.

[oracle@bys1 test]$ pwd
/u01/oracle/base/oracle/oradata/arch/test
[oracle@bys1 test]$ ls
1_40_900253622.dbf

2.log_archive_dest_1指定的目录不存在–直接报错
SQL> alter system set log_archive_dest_1=’LOCATION=/u01/oracle/base/oracle/oradata/arch/test/’;
alter system set log_archive_dest_1=’LOCATION=/u01/oracle/base/oracle/oradata/arch/test/’
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-16032: parameter LOG_ARCHIVE_DEST_1 destination string cannot be translated
ORA-07286: sksagdi: cannot obtain device information.
Linux Error: 2: No such file or directory

3.log_archive_dest_1指定目录
归档文件名=log_archive_format
[oracle@bys1 arch]$ mkdir test
[oracle@bys1 arch]$ ls
test test1_26_900253622.dbf
SQL> alter system set log_archive_dest_1=’LOCATION=/u01/oracle/base/oracle/oradata/arch/test/’;
System altered.
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/oracle/base/oracle/oradata/arch/test/
Oldest online log sequence 25
Next log sequence to archive 27
Current log sequence 27
SQL> show parameter log_archive_format
NAME TYPE VALUE
———————————— ———– ——————————
log_archive_format string %t_%s_%r.dbf
SQL> alter system switch logfile;
System altered.
———
[oracle@bys1 arch]$ cd test
[oracle@bys1 test]$ ls -al
total 12
drwxr-xr-x 2 oracle oinstall 4096 Feb 1 16:13 .
drwxr-xr-x 3 oracle oinstall 4096 Feb 1 16:13 ..
-rw-r—– 1 oracle asmadmin 2560 Feb 1 16:13 1_27_900253622.dbf
[oracle@bys1 test]$ pwd
/u01/oracle/base/oracle/oradata/arch/test

 

测试log_archive_dest_1与log_archive_format对生成的归档文件名的影响