————————–
Change the private (interconnect) IP addresses to a different subnet.
----------------- Before:
192.168.58.1 bys1-priv.bys.com bys1-priv
192.168.58.2 bys2-priv.bys.com bys2-priv
----------------- After:
192.168.59.1 bys1-priv.bys.com bys1-priv
192.168.59.2 bys2-priv.bys.com bys2-priv
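Reference MOS note: How to Modify Private Network Information in Oracle Clusterware (Doc ID 283684.1)
Starting with 11.2 Grid Infrastructure, the private network configuration is stored in both the OCR and the gpnp profile. If the private network is unavailable or defined incorrectly, the CRSD process will not start, and no further changes can be made to the OCR. Note that editing the gpnp profile by hand is not supported.
-------- Contents
1. Back up the gpnp profile
2. Check the current cluster status and the OS-level NIC information
3. View and modify the private network configuration (run on one node only)
4. Stop CRS (on both nodes)
5. Change the IP at the OS level, update the /etc/hosts entries (on both nodes), and test connectivity
6. Restart the cluster
7. Delete the old private network entry and verify
8. Check the cluster status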
—————————
###########################
—————————
Detailed steps (only node 1 is shown; the steps on node 2 are the same).

1. Back up the gpnp profile
$ cd $GRID_HOME/gpnp/<hostname>/profiles/peer/

[grid@bys1 ~]$ cd /u01/11.2.0/grid/gpnp/bys1/profiles/peer/
[grid@bys1 peer]$ ls
pending.xml profile.old profile_orig.xml profile.xml
[grid@bys1 peer]$ cp profile.xml profile.xmlbak

[grid@bys2 ~]$ cd /u01/11.2.0/grid/gpnp/bys2/profiles/peer/
[grid@bys2 peer]$ cp profile.xml profile.xmlbak
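
The same backup can be scripted with a timestamp in the file name so that repeated runs do not overwrite each other. This is only a sketch: the function name `backup_gpnp_profile` and the `.bak_<timestamp>` suffix are my own choices, not anything Oracle prescribes.

```shell
# Copy profile.xml to a timestamped backup in the same directory.
# $1: the gpnp peer profile directory (hypothetical helper, not an Oracle tool).
backup_gpnp_profile() {
    profile_dir=$1
    stamp=$(date +%Y%m%d_%H%M%S)
    # -p keeps the original owner, mode, and timestamps where permitted.
    cp -p "$profile_dir/profile.xml" "$profile_dir/profile.xml.bak_$stamp" &&
        printf '%s\n' "$profile_dir/profile.xml.bak_$stamp"
}
```

For example: `backup_gpnp_profile /u01/11.2.0/grid/gpnp/bys1/profiles/peer` prints the path of the copy it created.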

2. Check the current cluster status and the OS-level NIC information
[grid@bys2 peer]$ crsctl stat res -t
——————————————————————————–
NAME TARGET STATE SERVER STATE_DETAILS
——————————————————————————–
Local Resources
——————————————————————————–
ora.DATA.dg
ONLINE ONLINE bys1
ONLINE ONLINE bys2
ora.LISTENER.lsnr
ONLINE ONLINE bys1
ONLINE ONLINE bys2
ora.asm
ONLINE ONLINE bys1 Started
ONLINE ONLINE bys2 Started
ora.gsd
OFFLINE OFFLINE bys1
OFFLINE OFFLINE bys2
ora.net1.network
ONLINE ONLINE bys1
ONLINE ONLINE bys2
ora.ons
ONLINE ONLINE bys1
ONLINE ONLINE bys2
——————————————————————————–
Cluster Resources
——————————————————————————–
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE bys1
ora.bys1.vip
1 ONLINE ONLINE bys1
ora.bys2.vip
1 ONLINE ONLINE bys2
ora.bysrac.db
1 ONLINE ONLINE bys1 Open
2 OFFLINE OFFLINE Instance Shutdown
ora.cvu
1 ONLINE ONLINE bys2
ora.oc4j
1 ONLINE ONLINE bys2
ora.scan1.vip
1 ONLINE ONLINE bys1

###############
3. View and modify the private network configuration (run on one node only)

[grid@bys1 peer]$ oifcfg getif
eth0 192.168.57.0 global public
eth1 192.168.58.0 global cluster_interconnect
[grid@bys1 peer]$ oifcfg iflist
eth0 192.168.57.0
eth1 192.168.58.0
eth1 169.254.0.0

[grid@bys1 peer]$ oifcfg setif -global eth1/192.168.59.0:cluster_interconnect
[grid@bys1 peer]$ oifcfg getif
eth0 192.168.57.0 global public
eth1 192.168.58.0 global cluster_interconnect
eth1 192.168.59.0 global cluster_interconnect
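
After `setif`, both subnets are registered as cluster_interconnect until the old one is deleted in step 7. A small sketch for pulling just the interconnect entries out of `oifcfg getif` output, printed in the `interface/subnet` form that `oifcfg setif` and `oifcfg delif` take. The function name is hypothetical, and it assumes the four-column output shown above.

```shell
# Read `oifcfg getif` output on stdin and print interface/subnet
# for every line whose last column is cluster_interconnect.
list_interconnect_subnets() {
    awk '$NF == "cluster_interconnect" { print $1 "/" $2 }'
}
```

Usage: `oifcfg getif | list_interconnect_subnets`.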

##############
4. Stop CRS (on both nodes)
[root@bys1 ~]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'bys1'
………………
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'bys1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@bys1 ~]# crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.

###############
5. Change the IP at the OS level, update the /etc/hosts entries (on both nodes), and test connectivity

----------------- Before:
192.168.58.1 bys1-priv.bys.com bys1-priv
192.168.58.2 bys2-priv.bys.com bys2-priv
----------------- After:
192.168.59.1 bys1-priv.bys.com bys1-priv
192.168.59.2 bys2-priv.bys.com bys2-priv
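
The /etc/hosts change can be previewed with a filter before touching the real file. A minimal sketch, assuming the private host names contain `-priv` as they do here; the function name is made up for illustration, and it only writes to stdout.

```shell
# Rewrite only the -priv entries from the old subnet prefix to the new one.
# Reads hosts-format text on stdin, writes the result to stdout.
# $1: old prefix (e.g. 192.168.58)   $2: new prefix (e.g. 192.168.59)
rewrite_priv_hosts() {
    old_prefix=$1
    new_prefix=$2
    # Escape the dots so they match literally in the sed pattern.
    old_re=$(printf '%s' "$old_prefix" | sed 's/\./\\./g')
    sed "/-priv/ s/^${old_re}\./${new_prefix}./"
}
```

Usage: `rewrite_priv_hosts 192.168.58 192.168.59 < /etc/hosts`, then compare the result with the original before replacing the file on each node.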

[root@bys2 network-scripts]# ifconfig

eth1 Link encap:Ethernet HWaddr 08:00:27:4B:EE:4E
inet addr:192.168.59.2 Bcast:192.168.59.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe4b:ee4e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:13462 errors:0 dropped:0 overruns:0 frame:0
TX packets:15323 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:6890589 (6.5 MiB) TX bytes:11488864 (10.9 MiB)
[root@bys2 network-scripts]# ping bys1-priv
PING bys1-priv.bys.com (192.168.59.1) 56(84) bytes of data.
64 bytes from bys1-priv.bys.com (192.168.59.1): icmp_seq=1 ttl=64 time=2.85 ms
###############
6. Restart the cluster

[root@bys2 ~]# crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@bys2 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

###############
7. Delete the old private network entry and verify
[grid@bys1 ~]$ oifcfg getif
eth0 192.168.57.0 global public
eth1 192.168.58.0 global cluster_interconnect
eth1 192.168.59.0 global cluster_interconnect
[grid@bys1 ~]$ oifcfg delif -global eth1/192.168.58.0:cluster_interconnect
[grid@bys1 ~]$ oifcfg getif
eth0 192.168.57.0 global public
eth1 192.168.59.0 global cluster_interconnect

###############
8. Check the cluster status
[grid@bys1 ~]$ gpnptool get
Warning: some command line parameters were defaulted. Resulting command line:
/u01/11.2.0/grid/bin/gpnptool.bin get -o-

<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="8" ClusterUId="405460e2b8c24fd8bf9acebf33654b8b" ClusterName="bysrac" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.57.0" Adapter="eth0" Use="public"/><gpnp:Network id="net4" Adapter="eth1" IP="192.168.59.0" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/bysrac/asmparameterfile/registry.253.927488691"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>TQTgFHyy0z8ROUI6tfleTxsQwdY=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>F3N8CS7gBNJXMfMHZuP/FckLtynkybtNFq3TbHuU9yrZEuDuzMA1EtMmId7W7YPDS1wBZ6Qrh4hKMDuXfWTSR6xHZQ9iFM6mC6vHTa13+7AGYopoat5iXnGd050jj/w/VMhiYUQuP5g5O28SUu6lHRlhzpnPZLUkyhKvmhRdpIM=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.
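
The profile comes back as a single long line, so the interconnect subnet is easy to miss. A rough sketch that extracts it with standard text tools; this is a text filter rather than a real XML parser, the function name is hypothetical, and it assumes the attribute layout shown in the output above (straight double quotes, one `IP` attribute per `gpnp:Network` element).

```shell
# Read a gpnp profile on stdin and print the IP attribute of every
# gpnp:Network element marked Use="cluster_interconnect".
gpnp_interconnect_ips() {
    tr '>' '\n' |                                   # one element per line
        grep 'Use="cluster_interconnect"' |         # keep interconnect networks
        sed -n 's/.*IP="\([^"]*\)".*/\1/p'          # print the IP attribute value
}
```

Usage: `gpnptool get | gpnp_interconnect_ips`. After step 7 it should print only the new subnet.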

[grid@bys1 ~]$ ifconfig
eth1 Link encap:Ethernet HWaddr 08:00:27:64:B0:1C
inet addr:192.168.59.1 Bcast:192.168.59.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe64:b01c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:21137 errors:0 dropped:0 overruns:0 frame:0
TX packets:18486 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14069238 (13.4 MiB) TX bytes:11326935 (10.8 MiB)

eth1:1 Link encap:Ethernet HWaddr 08:00:27:64:B0:1C
inet addr:169.254.167.252 Bcast:169.254.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

 

11gR2 RAC IP change series, part 4: changing the private network IP to a different IP

1. NTP not configured, and the /etc/ntp.conf configuration file removed
2016-12-10 22:39:17.459: [ CTSS][156329728]ctss_main: The Cluster Time Synchronization Service is started with option [reboot].
2016-12-10 22:39:17.459: [ CTSS][156329728]ctss_scls_init: SCLs Context is 0x19bd3b0
[ clsdmt][149878528]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=bys1DBG_CTSSD))
2016-12-10 22:39:17.469: [ clsdmt][149878528]PID for the Process [2460], connkey 11
2016-12-10 22:39:17.470: [ clsdmt][149878528]Creating PID [2460] file for home /u01/11.2.0/grid host bys1 bin ctss to /u01/11.2.0/grid/ctss/init/
2016-12-10 22:39:17.470: [ clsdmt][149878528]Writing PID [2460] to the file [/u01/11.2.0/grid/ctss/init/bys1.pid]
2016-12-10 22:39:18.193: [ CTSS][149878528]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x40], offset[0 ms]}, length=[8].
2016-12-10 22:39:18.206: [ CTSS][156329728]ctss_css_init: CSS Context is 0x19ce210
2016-12-10 22:39:18.206: [ CTSS][156329728]ctss_init: CTSS production mode
2016-12-10 22:39:18.206: [ CTSS][156329728]ctss_init: Env var CTSS_REBOOT is undefined or contains non-boolean value. Ignoring CTSS_REBOOT.
---- The lines above show CTSSD starting during cluster startup, along with its PID information
2016-12-10 22:39:18.206: [ CTSS][156329728]sclsctss_gvss2: NTP default pid file not found
2016-12-10 22:39:18.206: [ CTSS][156329728]sclsctss_gvss8: Return [0] and NTP status [1].
2016-12-10 22:39:18.206: [ CTSS][156329728]ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
2016-12-10 22:39:18.206: [ GIPC][156329728] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 694], original from [clsss.c : 5358]
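------- No NTP or other time synchronization software was detected, so CTSSD runs in active mode (inferred by comparison with the log in case 2, where ntp.conf is present)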
2016-12-10 22:39:18.210: [ default][156329728]clsvactversion:4: Retrieving Active Version from local storage.
2016-12-10 22:39:18.211: [ CTSS][156329728]clsctss_r_av4: Current active version [11.2.0.4.0] [186647552].
2016-12-10 22:39:18.213: [ CRSCCL][156329728]clsCclInit called by process 2460: groupname=CTSSGROUP commOptions=0 clusterType=0
2016-12-10 22:39:18.213: [ CRSCCL][156329728]Software version: 11.2.0.4.0.
2016-12-10 22:39:18.214: [ OCRMSG][156329728]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)

##########################################
2. NTP not configured, but the /etc/ntp.conf configuration file exists

2016-12-10 23:03:38.020: [ CTSS][3066242816]ctss_main: The Cluster Time Synchronization Service is started with option [reboot].
2016-12-10 23:03:38.020: [ CTSS][3066242816]ctss_scls_init: SCLs Context is 0x11363b0
[ clsdmt][3059791616]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=bys1DBG_CTSSD))
2016-12-10 23:03:38.026: [ clsdmt][3059791616]PID for the Process [10797], connkey 11
2016-12-10 23:03:38.026: [ clsdmt][3059791616]Creating PID [10797] file for home /u01/11.2.0/grid host bys1 bin ctss to /u01/11.2.0/grid/ctss/init/
2016-12-10 23:03:38.027: [ clsdmt][3059791616]Writing PID [10797] to the file [/u01/11.2.0/grid/ctss/init/bys1.pid]
2016-12-10 23:03:38.915: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x40], offset[0 ms]}, length=[8].
2016-12-10 23:03:38.918: [ CTSS][3066242816]ctss_css_init: CSS Context is 0x1147210
2016-12-10 23:03:38.918: [ CTSS][3066242816]ctss_init: CTSS production mode
2016-12-10 23:03:38.919: [ CTSS][3066242816]ctss_init: Env var CTSS_REBOOT is undefined or contains non-boolean value. Ignoring CTSS_REBOOT.

---- The lines above show CTSSD starting during cluster startup, along with its PID information

2016-12-10 23:03:38.919: [ CTSS][3066242816]sclsctss_gvss1: NTP default config file found ---- the NTP configuration file was found
2016-12-10 23:03:38.919: [ CTSS][3066242816]sclsctss_gvss8: Return [0] and NTP status [2].
2016-12-10 23:03:38.919: [ CTSS][3066242816]ctss_check_vendor_sw: Vendor time sync software is detected. status [2].
2016-12-10 23:03:38.919: [ CTSS][3066242816]ctss_check_vendor_sw: Ctssd is switching to observer role ---- CTSSD switches to observer mode

2016-12-10 23:03:38.920: [ GIPC][3066242816] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 694], original from [clsss.c : 5358]
2016-12-10 23:03:38.921: [ default][3066242816]clsvactversion:4: Retrieving Active Version from local storage.
2016-12-10 23:03:38.923: [ CTSS][3066242816]clsctss_r_av4: Current active version [11.2.0.4.0] [186647552].
2016-12-10 23:03:38.924: [ CRSCCL][3066242816]clsCclInit called by process 10797: groupname=CTSSGROUP commOptions=0 clusterType=0
2016-12-10 23:03:38.925: [ CRSCCL][3066242816]Software version: 11.2.0.4.0.
2016-12-10 23:03:38.925: [ OCRMSG][3066242816]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2016-12-10 23:03:38.926: [ OCRMSG][3066242816]GIPC error [29] msg [gipcretConnectionRefused]
2016-12-10 23:03:38.926: [ OCRMSG][3066242816]prom_connect: error while waiting for connection complete [24]
2016-12-10 23:03:38.926: [ CRSCCL][3066242816]Failed to init OCR to get active version. PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]Checking active version in OLR.
2016-12-10 23:03:38.928: [ default][3066242816]clsvactversion:4: Retrieving Active Version from local storage.
2016-12-10 23:03:38.930: [ CRSCCL][3066242816]Active version: 11.2.0.4.0.
2016-12-10 23:03:38.931: [ CRSCCL][3066242816]USING GIPC ============
2016-12-10 23:03:38.931: [ CRSCCL][3066242816]clsCclGipcListen: Attempting to listen on gipcha://bys1:CTSSGROUP_1.
2016-12-10 23:03:38.932: [GIPCHGEN][3066242816] gipchaInternalRegister: Initializing HA GIPC
2016-12-10 23:03:38.932: [GIPCHGEN][3066242816] gipchaNodeCreate: adding new node 0x1271b40 { host '', haName '981d-9f80-3574-cb76', srcLuid 2b8bd76d-00000000, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [0 : 0], createTime 1227934, sentRegister 0, localMonitor 0, flags 0x1 }
2016-12-10 23:03:38.932: [GIPCHTHR][3051460352] gipchaWorkerThread: starting worker thread hctx 0x125d7e0 [0000000000000010] { gipchaContext : host 'bys1', name '981d-9f80-3574-cb76', luid '2b8bd76d-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0xc000 }
2016-12-10 23:03:38.933: [GIPCHDEM][3049359104] gipchaDaemonThread: starting daemon thread hctx 0x125d7e0 [0000000000000010] { gipchaContext : host 'bys1', name '981d-9f80-3574-cb76', luid '2b8bd76d-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0xc000 }
2016-12-10 23:03:38.958: [GIPCHGEN][3049359104] gipchaNodeAddInterface: adding interface information for inf 0x7faaa400c0c0 { host '', haName '981d-9f80-3574-cb76', local (nil), ip '192.168.59.1', subnet '192.168.59.0', mask '255.255.255.0', mac '08-00-27-64-b0-1c', ifname 'eth1', numRef 0, numFail 0, idxBoot 0, flags 0x1 }
2016-12-10 23:03:39.160: [GIPCXCPT][3066242816] gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36), host 'bys1', port 'CTSSGROUP_1', hctx 0x125d7e0 [0000000000000010] { gipchaContext : host 'bys1', name '981d-9f80-3574-cb76', luid '2b8bd76d-00000000', numNode 0, numInf 1, usrFlags 0x0, flags 0x5 }, ret gipcretKeyNotFound (36)
2016-12-10 23:03:39.160: [GIPCHGEN][3066242816] gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 809]: EXCEPTION[ ret gipcretKeyNotFound (36) ] failed to resolve ctx 0x125d7e0 [0000000000000010] { gipchaContext : host 'bys1', name '981d-9f80-3574-cb76', luid '2b8bd76d-00000000', numNode 0, numInf 1, usrFlags 0x0, flags 0x5 }, host 'bys1', port 'CTSSGROUP_1', flags 0x0
2016-12-10 23:03:39.161: [GIPCHTHR][3051460352] gipchaWorkerCreateInterface: created local interface for node 'bys1', haName '981d-9f80-3574-cb76', inf 'udp://192.168.59.1:63850'
2016-12-10 23:03:39.162: [ CRSCCL][3066242816]gipcListen() Listening on gipcha://bys1:CTSSGROUP_1
2016-12-10 23:03:39.164: [ CRSCCL][3047257856]CSS Group Registration complete.

2016-12-10 23:03:39.164: [ CRSCCL][3047257856]cclGetMemberData called
2016-12-10 23:03:39.165: [ CRSCCL][3047257856]Member (1, 1228164, bys1:11.2.0.4.0) @ found.

2016-12-10 23:03:39.166: [ CRSCCL][3047257856]Obtained first membership map.

2016-12-10 23:03:39.166: [ CRSCCL][3047257856]Dumping member data ——————
2016-12-10 23:03:39.166: [ CRSCCL][3047257856]Member (1, 1228164) on node bys1 port=.
2016-12-10 23:03:39.167: [ CRSCCL][3047257856]Done ——————
2016-12-10 23:03:39.167: [ CTSS][3066242816]ctss_ccl_init4: clsCclInitWithCtx() finished. The local nodenum is [1].
2016-12-10 23:03:39.167: [ CTSS][3066242816]ctss_ccl_init6: Retrieved grpmap.
2016-12-10 23:03:39.168: [ CTSS][3066242816]ctss_ccl_init99: Successfully initialize CCL
2016-12-10 23:03:39.168: [ CTSS][3066242816]ctss_init: Spawn completed. Waiting for threads to join
2016-12-10 23:03:39.168: [ CTSS][3047257856]ctsselect_msccb1: Receive membership change event. Inc num[1] New master [1] members count[1]
2016-12-10 23:03:39.168: [ CTSS][3047257856]ctsselect_msccb9: The local node [1] is the CTSS master
2016-12-10 23:03:39.168: [ CRSCCL][3047257856]clsCclGetPriMemData: memDataSize[16] is too small. Requires [256]. Returns [14]
2016-12-10 23:03:39.168: [ CTSS][3047257856]ctsselect_gpd1: Size of pridata for node [1] is [256]. Passed [16].
2016-12-10 23:03:39.169: [ CTSS][3047257856](:ctss_e_rmmsr_2_1:: Failed to retrieve peer member data [14]. Need to alloc bigger buffer [16].
2016-12-10 23:03:39.169: [ CTSS][3047257856](:ctss_e_rmmsr_4:): Pri data for member [1]. {Version [1] Node [-1] SW version [186647552] Mode [0x62]}
2016-12-10 23:03:39.169: [ CTSS][3047257856](:ctss_e_rmmsr_5:): Detected vendor time sync sw on peer [1].
2016-12-10 23:03:39.169: [ CTSS][3047257856](:ctss_e_rmmsr_9:): Return [0]
2016-12-10 23:03:39.169: [ CRSCCL][3047257856]Waiting for reconfigs
2016-12-10 23:03:39.170: [ CRSCCL][3047257856]clsCclGetPriMemberData: Detected pridata change for node[1]. Retrieving it to the cache.
[ CTSS][3038471936]ctsselect_msm: Slave Monitor thread started
2016-12-10 23:03:39.170: [ CTSS][3038471936]ctsselect_msm: CTSS mode is [0xee]
[ CTSS][3040573184]clsctsselect_mm: Master Monitor thread started
2016-12-10 23:03:39.172: [ CTSS][3036370688]ctsselect_vermon5: Successfully registered with [crs_version]
2016-12-10 23:03:39.172: [ CTSS][3036370688]ctsselect_vermon7: Expecting clssgsevGRPPRIV event. Ignoring 1 event.
2016-12-10 23:03:39.172: [ CRSCCL][3044775680]cclCommunicationHandler started.
2016-12-10 23:03:39.923: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
2016-12-10 23:03:39.927: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
2016-12-10 23:03:40.921: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
2016-12-10 23:03:40.924: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
2016-12-10 23:03:40.931: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
2016-12-10 23:03:40.934: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
2016-12-10 23:04:04.174: [ CTSS][3036370688]ctsselect_vermon7: Expecting clssgsevGRPPRIV event. Ignoring 1 event.
2016-12-10 23:04:04.174: [ CTSS][3036370688]ctsselect_vermon8: Received clssgsevGRPPRIV event.
2016-12-10 23:04:04.175: [ CTSS][3036370688]ctsselect_vermon10_1: Retrieved av_data from grp pridata. Upgrade state [11].
2016-12-10 23:04:04.175: [ CTSS][3036370688]ctsselect_vermon11: Retrieved Active Version [186647552].
2016-12-10 23:04:04.175: [ CTSS][3036370688]ctsselect_vermon12: Active version[186647552] didn't change.
2016-12-10 23:04:09.171: [ CTSS][3040573184]sclsctss_gvss1: NTP default config file found
2016-12-10 23:04:09.171: [ CTSS][3040573184]sclsctss_gvss8: Return [0] and NTP status [2].
2016-12-10 23:04:09.171: [ CTSS][3040573184]ctss_check_vendor_sw: Vendor time sync software is detected. status [2].
2016-12-10 23:04:09.926: [ CTSS][3059791616]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xee], offset[0 ms]}, length=[8].
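
The two cases differ in a single `ctss_check_vendor_sw` message, so the mode can be read off ctssd.log with a small filter. A sketch only: the function name is hypothetical, and the mapping (vendor time sync software detected means observer, not detected means active) is the interpretation drawn from the two logs above.

```shell
# Read ctssd.log on stdin and report the mode implied by the most recent
# ctss_check_vendor_sw detection message: active, observer, or unknown.
ctss_mode_from_log() {
    awk '
        /ctss_check_vendor_sw: Vendor time sync software is not detected/ { mode = "active" }
        /ctss_check_vendor_sw: Vendor time sync software is detected/     { mode = "observer" }
        END { print (mode == "" ? "unknown" : mode) }
    '
}
```

Usage: `ctss_mode_from_log < ctssd.log`. On a running cluster, `crsctl check ctss` should report the mode directly, so this is mainly useful for reading old logs.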

11gR2: using ntp.conf to observe in ctssd.log whether CTSSD runs in active or observer mode

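This test changes the owner and group of every file under GRID_HOME to the oracle user, then shows how to repair the cluster once it breaks.
Reference MOS note: Script to capture and restore file permission in a directory (for eg. ORACLE_HOME) (Doc ID 1515018.1)
Test environment: Linux x64, Oracle 11gR2 two-node RAC
1. Test: change the ownership of all files in GRID_HOME on node 1 to oracle:oinstall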
[root@bys1 app]# cd /u01/11.2.0/grid/
[root@bys1 grid]# chown -R oracle:oinstall ./*

2. On the healthy node 2, capture the correct ownership and permissions of the directories and files
[root@bys2 ~]# ls
anaconda-ks.cfg Documents install.log Music Pictures Templates
Desktop Downloads install.log.syslog permission.pl Public Videos
[root@bys2 ~]# chmod a+x permission.pl
[root@bys2 ~]# ./permission.pl /u01/11.2.0/grid/
Following log files are generated
logfile : permission-Wed-Dec-14-16-16-50-2016
Command file : restore-perm-Wed-Dec-14-16-16-50-2016.cmd
Linecount : 17253
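
The idea behind the capture step can be sketched in a few lines of shell. This is not the MOS permission.pl script, just an illustration of the technique, assuming GNU find and stat (Linux); the function name is my own. It emits `chown` before `chmod` for each entry because changing ownership can clear the setuid/setgid bits, which the following `chmod` then puts back.

```shell
# Print one "chown ...; chmod ..." line per file and directory under $1,
# using numeric uid:gid and octal modes. Redirect the output to a file to
# get a replayable restore script.
# Limitation of this sketch: paths containing single quotes are not handled.
capture_perms() {
    find "$1" -exec stat -c "chown %u:%g '%n'; chmod %a '%n'" {} +
}
```

Usage: `capture_perms /u01/11.2.0/grid > restore-perm.cmd` on the healthy node, then run the generated file with `sh` on the damaged node. Numeric ids only line up if both nodes use the same uid/gid mapping, which is an assumption worth checking first.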

3. Use the generated script (copied from node 2 to node 1) to restore the permissions
[root@bys1 ~]# chmod a+x restore-perm-Wed-Dec-14-16-16-50-2016.cmd
[root@bys1 ~]# ./restore-perm-Wed-Dec-14-16-16-50-2016.cmd >/tmp/chown.log
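--- Judging from the output below, the errors come mainly from chown/chmod statements for log files that do not exist on this node.
--- Note 1: the OLR backup file created automatically at the end of installation needs its permissions set by hand.
--- Note 2: the OCR automatic backup directory /u01/11.2.0/grid/cdata/bysrac needs its permissions set by hand; with wrong permissions the backups cannot be overwritten.
--- Note 3: check the permissions (6751) and the owner/group of the oracle executable in the bin directory of both the GRID and ORACLE homes.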
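
Note 3 is easy to check mechanically. A minimal sketch, assuming GNU stat; the function name and the OK/WRONG output format are invented for illustration, and the expected mode 6751 is the value mentioned in the note.

```shell
# Check that the file at $1 has mode 6751 and print its owner and group.
# Returns 0 when the mode matches, 1 when it does not, 2 if stat fails.
check_oracle_binary() {
    mode=$(stat -c %a "$1") || return 2
    if [ "$mode" = "6751" ]; then
        echo "OK $1 mode=$mode owner=$(stat -c %U:%G "$1")"
    else
        echo "WRONG $1 mode=$mode (expected 6751)"
        return 1
    fi
}
```

Usage: `check_oracle_binary /u01/11.2.0/grid/bin/oracle`, and the same for the oracle executable in the database home.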

-- Some repeated output has been removed --
chown: cannot access `/u01/11.2.0/grid/cdata/bys1/backup_20161109_205004.olr': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/cdata/bys1/backup_20161109_205004.olr': No such file or directory
chown: cannot access `/u01/11.2.0/grid/cfgtoollogs/opatch/lsinv/lsinventory2016-11-24_15-50-32PM.txt': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/cfgtoollogs/oui/oraInstall2016-11-09_09-01-09PM.out.bys1': No such file or directory
chown: cannot access `/u01/11.2.0/grid/log/bys1/client/clsc_7.log': No such file or directory
chown: cannot access `/u01/11.2.0/grid/log/bys1/client/gpnptool_13875.log': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/log/bys1/client/ocrcheck_18006.log': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/log/bys1/client/clsc_9.log': No such file or directory
chown: cannot access `/u01/11.2.0/grid/log/bys1/client/gpnptool_12955.log': No such file or directory
chown: cannot access `/u01/11.2.0/grid/log/bys1/client/clsc_14.log': No such file or directory
chown: cannot access `/u01/11.2.0/grid/log/bys1/client/ocrconfig_14761.log': No such file or directory
chown: cannot access `/u01/11.2.0/grid/cv/init/start.txt': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/install/root_bys1.bys.com_2016-11-09_20-46-04.log': No such file or directory
chown: cannot access `/u01/11.2.0/grid/crf/db/bys1/26-NOV-2016-09:36:58.txt': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/crf/db/bys1/10-NOV-2016-13:30:24.txt': No such file or directory
chown: cannot access `/u01/11.2.0/grid/crf/db/bys1/log.0000000018': No such file or directory
chown: cannot access `/u01/11.2.0/grid/crf/db/bys1/26-NOV-2016-09:33:56.txt': No such file or directory
chown: cannot access `/u01/11.2.0/grid/crf/db/bys1/log.0000000017': No such file or directory
chown: cannot access `/u01/11.2.0/grid/crf/db/bys1/10-NOV-2016-12:07:04.txt': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/crf/db/bys1/29-NOV-2016-18:27:55.txt': No such file or directory
chown: cannot access `/u01/11.2.0/grid/oc4j/j2ee/home/log/oc4j_2016_11_29_18_05_27.err': No such file or directory
chown: cannot access `/u01/11.2.0/grid/oc4j/j2ee/home/log/oc4j_2016_11_27_21_25_38.out': No such file or directory
chown: cannot access `/u01/11.2.0/grid/oc4j/j2ee/home/persistence/MBeanServerEjb.ser': No such file or directory
chown: cannot access `/u01/11.2.0/grid/.patch_storage/NApply/2016-11-10_12-54-39PM': No such file or directory
chown: cannot access `/u01/11.2.0/grid/rdbms/log/+asm2_ora_2644.trc': No such file or directory
(many more files like this)
chmod: cannot access `/u01/11.2.0/grid/rdbms/log/+asm2_ora_2478.trc': No such file or directory
chmod: cannot access `/u01/11.2.0/grid/rdbms/audit/+ASM2_ora_10622_20161109214606892302143795.aud': No such file or directory
(many more files like this)
chmod: cannot access `/u01/11.2.0/grid/rdbms/audit/+ASM2_ora_15864_20161109210021310719143795.aud': No such file or directory
chown: cannot access `/u01/11.2.0/grid/dbs/ab_+ASM2.dat': No such file or directory

[root@bys1 ~]#

4. Reboot the host or restart the cluster, then check the cluster status; the cluster returns to normal.
[root@bys1 ~]# crsctl stat res -t
——————————————————————————–
NAME TARGET STATE SERVER STATE_DETAILS
——————————————————————————–
Local Resources
——————————————————————————–
ora.DATA.dg
ONLINE ONLINE bys1
ONLINE ONLINE bys2
ora.LISTENER.lsnr
ONLINE ONLINE bys1
ONLINE ONLINE bys2
ora.asm
ONLINE ONLINE bys1 Started
ONLINE ONLINE bys2 Started
ora.gsd
OFFLINE OFFLINE bys1
OFFLINE OFFLINE bys2
ora.net1.network
ONLINE ONLINE bys1
ONLINE ONLINE bys2
ora.ons
ONLINE ONLINE bys1
ONLINE ONLINE bys2
——————————————————————————–
Cluster Resources
——————————————————————————–
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE bys2
ora.bys1.vip
1 ONLINE ONLINE bys1
ora.bys2.vip
1 ONLINE ONLINE bys2
ora.bysrac.db
1 ONLINE ONLINE bys1 Open
2 ONLINE ONLINE bys2 Open
ora.cvu
1 ONLINE ONLINE bys2
ora.oc4j
1 OFFLINE OFFLINE
ora.scan1.vip
1 ONLINE ONLINE bys2

Testing a repair with permission.pl after the ownership of all files under the RAC GRID_HOME was changed

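The RAC version is 12.1.0.2.161018, using ASM; the restore test targeted a single-instance, non-ASM environment.
While restore database was running, the alert log kept reporting: WARNING: failed to start ASMB (connection failed) state=0x1 sid=''. In RMAN, restore database got as far as allocating channels and then made no further progress.
Querying the v$session_longops view at that point returned no session information either.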

The command was similar to the following:
run {
set newname for datafile 56 to '/oradata/**db/data/indx03.dbf';
restore database;
switch datafile all;
}
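
With many datafiles to relocate, the run block is tedious to type by hand. A sketch of generating it from a list of "file number, new path" pairs; the function name is hypothetical and the path in the usage example is a placeholder, not the real one from this restore.

```shell
# Read "file_number new_path" pairs on stdin and print an RMAN run block
# with one SET NEWNAME per pair, followed by the restore and switch commands.
gen_restore_block() {
    echo "run {"
    while read -r file_no new_path; do
        [ -n "$file_no" ] || continue          # skip blank lines
        printf "  set newname for datafile %s to '%s';\n" "$file_no" "$new_path"
    done
    echo "  restore database;"
    echo "  switch datafile all;"
    echo "}"
}
```

Usage: `echo "56 /oradata/testdb/data/indx03.dbf" | gen_restore_block > restore.rman`, then review the file before passing it to RMAN.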
The output stopped at:
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=188 device type=DISK

Messages in the alert log (the ASMB warning quoted above):

————–
Searching MOS for the alert log message turned up the following matching notes and bug:
WARNING: failed to start ASMB after RAC Database on ASM converted to Single Instance Non-ASM Database (Doc ID 2138520.1)
12c RMAN Operations from ASM To Non-ASM Slow (Doc ID 2081537.1)
BUG 19503821: RMAN CATALOG EXTREMELY SLOW WHEN MIGRATING DATABASE FROM ASM TO FILE SYSTEM
———
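Resolution:
Following the MOS recommendation, I applied patch 19503821, after which restore database ran to completion normally.
-- The alert log still showed a few other warnings at the end; they did not affect this restore, so I did not dig further. 12cR1 still seems to have quite a few bugs, so use it with caution.
------ Problem encountered on 2016/12/30; written up on 2016/12/31 at 22:50, while working overtime on bank settlement support.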

A bug hit on RAC 12.1.0.2.161018 PSU when restoring from RAC+ASM to a single-instance non-ASM database

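A colleague pointed out that my backup script lacked a manual log switch command. In fact, from 10g onward this statement is no longer needed in the script. By consulting the official documentation and running experiments, I verified whether the backup archivelog command triggers archiving of the current logfile in 10g/11g/12cR1.

Result:
If the database is open when BACKUP ARCHIVELOG runs, and neither the UNTIL clause nor the SEQUENCE parameter is used, a log switch is performed automatically.

From the official documentation: http://docs.oracle.com/cd/E11882_01/backup.112/e10643/rcmsynta007.htm#CHDCFGEI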
http://docs.oracle.com/database/121/RCMRF/rcmsynta006.htm

If the database is open when you run BACKUP ARCHIVELOG, and if the UNTIL clause or SEQUENCE parameter is not specified, then RMAN runs ALTER SYSTEM ARCHIVE LOG CURRENT. ---- This is the sentence that confirms the result stated above.

Note: If you run BACKUP ARCHIVELOG ALL, or if the specified log range includes logs from prior incarnations, then RMAN backs up logs from prior incarnations to ensure availability of all logs that may be required for recovery through an OPEN RESETLOGS.
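
When auditing existing backup scripts against this rule, a crude text check can flag the commands where the automatic archiving may not apply. This is a rough heuristic and not an RMAN parser: the function name and its two result strings are invented, it treats any standalone UNTIL or SEQUENCE word as the clause or parameter in question, and it cannot tell whether the database is open.

```shell
# $1: a BACKUP ARCHIVELOG command as text.
# Prints "no-auto-switch" if the text contains UNTIL or SEQUENCE as a
# separate word (case-insensitive), otherwise "auto-switch".
archives_current_log() {
    if printf '%s\n' "$1" | grep -Eiq '(^|[[:space:]])(until|sequence)([[:space:]]|$)'; then
        echo "no-auto-switch"
    else
        echo "auto-switch"
    fi
}
```

Commands reported as `no-auto-switch` are the ones where an explicit log switch before the backup may still be worth keeping.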
———————————-

Verification on version 11.2.0.4:
Start the backup:
[oracle@bys1 ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Sat Jan 21 18:04:53 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

connected to target database: BYS1 (DBID=4052277609)

RMAN> backup archivelog from time 'sysdate-1' format '/home/orcale/arch_%d_%t_%s.bak';

Starting backup at 2017/01/21 18:04:55 ------------> backup start time; it matches the alert log entry.
current log archived ------------> this line shows that the current redo log was archived
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=58 device type=DISK

Alert log:
Sat Jan 21 18:02:20 2017
ALTER SYSTEM ARCHIVE LOG
Sat Jan 21 18:02:20 2017
Thread 1 advanced to log sequence 99 (LGWR switch)
Current log# 3 seq# 99 mem# 0: /u01/app/oradata/bys1/redo03.log
Sat Jan 21 18:02:20 2017
Archived Log entry 131 added for thread 1 sequence 98 ID 0xf1898b69 dest 1:
Sat Jan 21 18:04:55 2017 ------------> time the ALTER SYSTEM ARCHIVE LOG command ran; it matches the backup output.
ALTER SYSTEM ARCHIVE LOG
Sat Jan 21 18:04:55 2017
Thread 1 advanced to log sequence 100 (LGWR switch)
Current log# 1 seq# 100 mem# 0: /u01/app/oradata/bys1/redo01.log
Sat Jan 21 18:04:56 2017
Archived Log entry 132 added for thread 1 sequence 99 ID 0xf1898b69 dest 1:
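
Matching the backup start time against the alert log by eye gets tedious on a busy system. A small sketch that prints the timestamp preceding each ALTER SYSTEM ARCHIVE LOG entry; the function name is hypothetical, and it assumes the classic text alert log layout shown above, where each entry is preceded by a line such as `Sat Jan 21 18:04:55 2017`.

```shell
# Read an alert log on stdin; remember the most recent timestamp line and
# print it whenever an ALTER SYSTEM ARCHIVE LOG entry follows.
archive_log_times() {
    awk '
        /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) [A-Z][a-z][a-z] / { ts = $0; next }
        /^ALTER SYSTEM ARCHIVE LOG/ { print ts }
    '
}
```

Usage: `archive_log_times < alert_bys1.log`, where the file name stands in for the instance's actual alert log.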

--------------------- Verification on 11gR2 RAC: running the command on either node switches the log on both nodes

Run backup archivelog on RAC node 1
[oracle@bys1 ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Tue Feb 7 12:06:02 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

connected to target database: BYSRAC (DBID=2682487210)

RMAN> backup archivelog from time 'sysdate-1/12' format '/home/oracle/arch_%d_%t_%s.bak';

Starting backup at 20170207 12:06:29
current log archived
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=32 instance=bysrac1 device type=DISK
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=49 RECID=79 STAMP=935323590
input archived log thread=2 sequence=34 RECID=80 STAMP=935323591
channel ORA_DISK_1: starting piece 1 at 20170207 12:06:39
channel ORA_DISK_1: finished piece 1 at 20170207 12:06:40
piece handle=/home/oracle/arch_BYSRAC_935323599_3.bak tag=TAG20170207T120639 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 20170207 12:06:40

RMAN> backup archivelog from time 'sysdate-1/12' format '/home/oracle/arch_%d_%t_%s.bak';

Starting backup at 20170207 12:08:32
current log archived
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=49 RECID=79 STAMP=935323590
input archived log thread=2 sequence=34 RECID=80 STAMP=935323591
input archived log thread=1 sequence=50 RECID=81 STAMP=935323713
input archived log thread=2 sequence=35 RECID=82 STAMP=935323713
channel ORA_DISK_1: starting piece 1 at 20170207 12:08:36
channel ORA_DISK_1: finished piece 1 at 20170207 12:08:37
piece handle=/home/oracle/arch_BYSRAC_935323716_4.bak tag=TAG20170207T120835 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 20170207 12:08:37
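
In the RAC case the point to confirm is that every thread advanced, not just the thread of the node running RMAN. A sketch that summarizes the highest archived sequence per thread from RMAN output; the function name is hypothetical, and it relies on the `input archived log thread=N sequence=M` line format shown above.

```shell
# Read RMAN backup output on stdin and print the highest sequence number
# backed up for each redo thread, sorted by thread.
max_seq_per_thread() {
    awk '
        /input archived log thread=/ {
            split($4, t, "=")          # $4 is thread=N
            split($5, s, "=")          # $5 is sequence=M
            if (!(t[2] in max) || s[2] + 0 > max[t[2]]) max[t[2]] = s[2] + 0
        }
        END { for (k in max) print "thread " k ": " max[k] }
    ' | sort
}
```

Running it on the output of two consecutive backups and comparing the results shows whether both threads moved forward by one sequence.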

Alert logs on the two nodes:
Node 1
Tue Feb 07 12:06:29 2017
ALTER SYSTEM ARCHIVE LOG
Tue Feb 07 12:06:30 2017
Thread 1 advanced to log sequence 50 (LGWR switch)
Current log# 2 seq# 50 mem# 0: +DATA/bysrac/onlinelog/group_2.258.927541487
Current log# 2 seq# 50 mem# 1: +DATA/bysrac/onlinelog/group_2.257.927541487
Tue Feb 07 12:06:30 2017
Archived Log entry 79 added for thread 1 sequence 49 ID 0x9fe402a6 dest 1:
Tue Feb 07 12:08:32 2017
ALTER SYSTEM ARCHIVE LOG
Tue Feb 07 12:08:33 2017
Thread 1 advanced to log sequence 51 (LGWR switch)
Current log# 1 seq# 51 mem# 0: +DATA/bysrac/onlinelog/group_1.267.927541485
Current log# 1 seq# 51 mem# 1: +DATA/bysrac/onlinelog/group_1.259.927541485
Tue Feb 07 12:08:33 2017
Archived Log entry 81 added for thread 1 sequence 50 ID 0x9fe402a6 dest 1:

Node 2
Tue Feb 07 12:06:30 2017
Thread 2 advanced to log sequence 35 (LGWR switch)
Current log# 3 seq# 35 mem# 0: +DATA/bysrac/onlinelog/group_3.261.927541697
Current log# 3 seq# 35 mem# 1: +DATA/bysrac/onlinelog/group_3.269.927541699
Tue Feb 07 12:06:31 2017
Archived Log entry 80 added for thread 2 sequence 34 ID 0x9fe402a6 dest 1:
Tue Feb 07 12:08:33 2017
Thread 2 advanced to log sequence 36 (LGWR switch)
Current log# 4 seq# 36 mem# 0: +DATA/bysrac/onlinelog/group_4.270.927541701
Current log# 4 seq# 36 mem# 1: +DATA/bysrac/onlinelog/group_4.271.927541701
Tue Feb 07 12:08:33 2017
Archived Log entry 82 added for thread 2 sequence 35 ID 0x9fe402a6 dest 1:

Backup set information
RMAN> list backup of archivelog all;

List of Backup Sets
===================

BS Key Size Device Type Elapsed Time Completion Time
——- ———- ———– ———— —————–
3 3.00K DISK 00:00:00 20170207 12:06:39
BP Key: 3 Status: AVAILABLE Compressed: NO Tag: TAG20170207T120639
Piece Name: /home/oracle/arch_BYSRAC_935323599_3.bak

List of Archived Logs in backup set 3
Thrd Seq Low SCN Low Time Next SCN Next Time
—- ——- ———- —————– ———- ———
1 49 2344159 20170207 12:04:32 2344293 20170207 12:06:30
2 34 2344163 20170207 12:04:33 2344297 20170207 12:06:30

BS Key Size Device Type Elapsed Time Completion Time
——- ———- ———– ———— —————–
4 4.50K DISK 00:00:00 20170207 12:08:36
BP Key: 4 Status: AVAILABLE Compressed: NO Tag: TAG20170207T120835
Piece Name: /home/oracle/arch_BYSRAC_935323716_4.bak

List of Archived Logs in backup set 4
Thrd Seq Low SCN Low Time Next SCN Next Time
—- ——- ———- —————– ———- ———
1 49 2344159 20170207 12:04:32 2344293 20170207 12:06:30
1 50 2344293 20170207 12:06:30 2344410 20170207 12:08:33
2 34 2344163 20170207 12:04:33 2344297 20170207 12:06:30
2 35 2344297 20170207 12:06:30 2344414 20170207 12:08:33

Does the backup archivelog command trigger archiving of the current logfile in 10g/11g/12cR1?