Grid and Cluster - 6. page

Steps for adding a node to RAC

 

 

References: http://space.itpub.net/35489/viewspace-563077

            Metalink: [ID 1279891.1]

            http://www.itpub.net/thread-1361850-1-1.html

A detailed write-up will follow shortly.

 

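1. Install the operating system on the new node, matching the nodes that are already running.

2. Configure the system parameters and the Oracle installation environment.

3. Copy $ORACLE_HOME, $ORACLE_BASE and /etc/ora* from a running node to the corresponding directories on the new machine; the paths must be identical to the source.

4. Run root.sh under $ORACLE_HOME on the new machine.

5. Update the RAC configuration under $ORACLE_HOME/oracm/admin and the /etc/hosts configuration on all machines.

6. Confirm that the database's MAXINSTANCES is greater than or equal to the node count after the new machine is added; otherwise the control file has to be recreated (it is usually large enough; the default is, as far as I recall, 16 or 32).

7. Configure the spfile. You can use alter system set <parameter>=<value> scope=spfile; the change takes effect after a restart.

Alternatively, generate a pfile with create pfile=… from spfile; and edit that, which is a little more convenient. The entries to change are shown below.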

<SID3>.instance_name=RAC3

<SID3>.instance_number=3

<SID3>.local_listener=LISTENER_RAC3

<SID3>.thread=3

<SID3>.undo_tablespace=UNDOTBS3

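Afterwards, rebuild the spfile with create spfile from pfile=…..; the configuration only takes effect once that is done.

8. Add the following to $ORACLE_HOME/network/admin/tnsnames.ora and copy the file to every node: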

LISTENER_RAC3 = (ADDRESS = (PROTOCOL = TCP)(HOST = <node3>)(PORT = 1521))

9. Add new redo log files to the database:

alter database add logfile thread 3

group 5 ('/dev/RAC/redo3_01_100.dbf') size 100M,

group 6 ('/dev/RAC/redo3_02_100.dbf') size 100M;

alter database enable public thread 3;

10. Add a new undo tablespace to the database:

CREATE UNDO TABLESPACE UNDOTBS3 DATAFILE '/dev/RAC/undotbs_03_210.dbf' SIZE 200M AUTOEXTEND ON NEXT  5120K MAXSIZE UNLIMITED;

11. Check the environment variables on the new node (ORACLE_HOME, ORACLE_SID and so on), then start the third instance.

12. The new node can be brought under srvctl management through the srvctl configuration. See the srvctl help for details, e.g.: srvctl -h   srvctl config -h
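
The per-instance entries from step 7 follow a regular pattern, so they can be generated rather than typed. The helper below is only a sketch: gen_instance_params is a hypothetical name, and it assumes that the SID equals the instance name and that the RAC / LISTENER_ / UNDOTBS naming from the example values applies to your cluster.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): print the per-instance pfile
# entries from step 7 for instance number N.
# Assumption: SID == instance name, built as <prefix><N>, prefix "RAC" by default.
gen_instance_params() {
    n="$1"                  # instance number, e.g. 3
    prefix="${2:-RAC}"      # SID prefix; "RAC" mirrors the example values
    sid="${prefix}${n}"
    printf '%s.instance_name=%s\n'           "$sid" "$sid"
    printf '%s.instance_number=%s\n'         "$sid" "$n"
    printf '%s.local_listener=LISTENER_%s\n' "$sid" "$sid"
    printf '%s.thread=%s\n'                  "$sid" "$n"
    printf '%s.undo_tablespace=UNDOTBS%s\n'  "$sid" "$n"
}

# Print the five entries for a third instance.
gen_instance_params 3
```

The output can be appended to the pfile created from the spfile before the spfile is rebuilt from it.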

ksvcreate: Process(m000) creation failed

10. 7. 2009, 12.09
Alert.log:
==========
Fri Jul 10 09:24:19 2009
WARNING: inbound connection timed out (ORA-3136)
Fri Jul 10 09:30:11 2009
Process m000 died, see its trace file
Fri Jul 10 09:30:11 2009
ksvcreate: Process(m000) creation failed
Fri Jul 10 09:31:12 2009
Process m000 died, see its trace file

Trace file:
===========
*** SERVICE NAME:(SYS$BACKGROUND) 2009-07-09 20:35:37.278
*** SESSION ID:(3295.1) 2009-07-09 20:35:37.278
ktsmgtur(): TUR was not tuned for 1644 secs
ktsmg_advance_slot(): MMNL advances slot after 1786 seconds
*** 2009-07-10 09:30:11.004
Process m000 is dead (pid=32562, state=3):
Unable to schedule a MMON slave at: Auto Flush Main 1
Attempt to create slave process failed.
Can happen for several reasons:
– No process state objects
– Reached OS set limits
– A shutdown was going on
Check alert log for more details.

CHECKS:
=======
SQL> show parameter process

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
processes                            integer     3000

SQL> show parameter session

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
sessions                             integer     3305

SQL> select count(*) from v$session;

  COUNT(*)
----------
      2994

SQL> SELECT COUNT(*) FROM v$process;

  COUNT(*)
----------
      2990
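
To make the checks above easier to read, the counts can be turned into a percentage of the configured limit. This is a minimal sketch with hypothetical helper names (pct_used, near_limit); the 95% default threshold is an arbitrary assumption, not an Oracle recommendation.

```shell
#!/bin/sh
# pct_used CURRENT LIMIT -> integer percentage, rounded down
pct_used() {
    echo $(( $1 * 100 / $2 ))
}

# near_limit CURRENT LIMIT [THRESHOLD] -> succeeds when usage >= THRESHOLD percent
# (the default threshold of 95 is an assumption chosen for illustration)
near_limit() {
    [ "$(pct_used "$1" "$2")" -ge "${3:-95}" ]
}

pct_used 2990 3000    # v$process count vs. processes -> 99
pct_used 2994 3305    # v$session count vs. sessions  -> 90
```

With the numbers from this incident, the process table is the resource that is nearly exhausted, not the session table.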

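Solution
========
The checks show the instance at its process limit (2990 of 3000), which matches the "No process state objects" cause listed in the trace file.
- kill the processes that keep spawning new connections (usually coming from HTTP)
- restart the database
- this error corresponds with ORA-3136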

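RAC 10g: srvctl / vipca fails with a libpthread.so.0 error

Running srvctl fails with:

error while loading shared libraries: libpthread.so.0
Both nodes behave the same way; this is a fairly common error.

The cause lies in the srvctl script itself.

The problem was confirmed on AIX 5.3 and on Oracle Linux 5.5 / RHEL 5.5.

If you hit this error while running root.sh during installation, the fix is:

vi srvctl

Find the lines
LD_ASSUME_KERNEL=2.4.19
export LD_ASSUME_KERNEL
and add right after them:
unset LD_ASSUME_KERNEL

PS:
This change has to be made on every node.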
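
Since the same edit has to be repeated on every node, it can be scripted instead of done by hand in vi. The function below is a sketch under assumptions: patch_ld_assume is a hypothetical name, and it expects the script to contain a line starting with "export LD_ASSUME_KERNEL" as described above. It keeps a copy of the original as <file>.orig and does nothing if the script is already patched.

```shell
#!/bin/sh
# patch_ld_assume FILE
# Insert "unset LD_ASSUME_KERNEL" right after the "export LD_ASSUME_KERNEL"
# line of FILE. Hypothetical helper; run it against a copy first.
patch_ld_assume() {
    f="$1"
    # Already patched: leave the file alone.
    if grep -q '^[[:space:]]*unset LD_ASSUME_KERNEL' "$f"; then
        return 0
    fi
    # Keep the original next to the patched script.
    cp -p "$f" "$f.orig" || return 1
    # Copy every line through; add the unset line after the export line.
    awk '{ print }
         /^[ \t]*export LD_ASSUME_KERNEL/ { print "unset LD_ASSUME_KERNEL" }' \
        "$f.orig" > "$f"
}

# Example call (the path is an assumption; use the srvctl location on your node):
# patch_ld_assume /path/to/crs/bin/srvctl
```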

Error 2:
Running srvctl as the oracle user fails with error while loading shared libraries: libpthread.so.0

Here, which srvctl shows that the command resolves to $ORACLE_HOME/bin, whereas the CRS copy of srvctl is in $ORACLE_BASE/product/10.2/crs/bin; changing PATH alone does not fix this.
The approach I used was to make a soft link.

Steps:
cd $ORACLE_HOME/bin
ln -s $ORACLE_BASE/product/10.2/crs/bin/srvctl srvctl

Solved.

For vipca, add unset LD_ASSUME_KERNEL before the ARGUMENTS="" line in the vipca file.

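RAC backup concepts

RAC backup concepts, written up for the company's developers:

1. OCR backup
ocrconfig -help | grep port

-export [-s online] - Export cluster register contents to a file
-import - Import cluster registry contents from a file
This shows that the OCR is backed up by export/import, much like exp/imp (dd also works, but Oracle's own backup method is generally preferred).

A test: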

ORACLE@node1:/oracle/product/10.2.0/crs/log/node1/racg>ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 838552
Used space (kbytes) : 3812
Available space (kbytes) : 834740
ID : 1008840117
Device/File Name : /dev/raw/raw3
Device/File integrity check succeeded

Device/File not configured

Cluster registry integrity check succeeded

The output above shows that the OCR here is stored on /dev/raw/raw3.
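
When this has to be checked on several clusters, the device can be pulled out of the ocrcheck output with a small filter. This is only a sketch: ocr_devices is a hypothetical helper, and it assumes the "Device/File Name : ..." line format shown in the output above.

```shell
#!/bin/sh
# ocr_devices: read ocrcheck output on stdin and print every configured
# OCR device or file, one per line. Assumes "Device/File Name : <path>" lines.
ocr_devices() {
    awk '/Device\/File Name/ {
             sub(/^[^:]*:[ \t]*/, "")   # drop the label up to the first colon
             sub(/[ \t]+$/, "")         # drop trailing blanks
             print
         }'
}

# Demo on a fragment of the output shown above; prints /dev/raw/raw3
ocr_devices <<'EOF'
Used space (kbytes) : 3812
Device/File Name : /dev/raw/raw3
Device/File integrity check succeeded
EOF
```

On a real node the call would be ocrcheck | ocr_devices.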

Backup method:

ocrconfig -export /oracle/ocr_20110420.bak

[root@node1 oracle]# ocrconfig -showbackup

node1 2011/04/19 14:42:20 /oracle/product/10.2.0/crs/cdata/crs

node1 2011/04/19 10:42:20 /oracle/product/10.2.0/crs/cdata/crs

node1 2011/04/19 06:42:20 /oracle/product/10.2.0/crs/cdata/crs

node2 2011/04/18 14:06:02 /oracle/product/10.2.0/crs/cdata/crs

node2 2011/04/15 04:09:45 /oracle/product/10.2.0/crs/cdata/crs

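Restoring is simple: ocrconfig -restore restores the physical backups listed here. A logical export made with ocrconfig -export is loaded back with ocrconfig -import instead.
strings gives a rough impression of what the export file contains:
strings /oracle/ocr_20110420.bak |sort -u

2. Voting disk backup

The voting disk is backed up mainly with dd.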

[root@node1 oracle]# crsctl query css votedisk
0. 0 /dev/raw/raw2

located 1 votedisk(s).

crsctl query css votedisk shows which raw device holds the voting disk.

Then back it up with dd:
[root@node1 oracle]# dd if=/dev/raw/raw2 of=/oracle/voting_20110420.bak
417760+0 records in
417760+0 records out
213893120 bytes (214 MB) copied, 126.633 seconds, 1.7 MB/s
When a restore is needed, dd is used the same way in the opposite direction.
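
The dd summary can be cross-checked with simple arithmetic: without a bs= option dd copies 512-byte records, so 417760 records should be 213893120 bytes. The sketch below uses hypothetical helpers (dd_bytes, same_size) to do that arithmetic and to confirm that a backup file has the expected size. It is only a size check and does not prove that the content is intact.

```shell
#!/bin/sh
# dd_bytes RECORDS [BLOCK_SIZE] -> bytes copied for that many full records
# (block size defaults to 512, the dd default when bs= is not given)
dd_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# same_size FILE EXPECTED_BYTES -> succeeds when FILE is exactly that large
same_size() {
    [ "$(wc -c < "$1" | tr -d ' ')" -eq "$2" ]
}

dd_bytes 417760    # -> 213893120, matching the dd output above
```

A possible use after the backup: same_size /oracle/voting_20110420.bak "$(dd_bytes 417760)" && echo "size ok".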

3. RAC data backup

Case 1: archived logs on shared storage
In this case the following commands are enough for a successful full backup:

run{
allocate channel ch1 device type disk;
backup
#incremental level 1 cumulative
database;
release channel ch1;

allocate channel ch1 device type disk;
backup archivelog all delete input;
release channel ch1;

allocate channel ch1 device type disk;
backup format '/oracle/ctl_%U_%T_%D' current controlfile;
release channel ch1;
}

Case 2: archived logs stored locally on each of the two nodes

run{
allocate channel ch1 device type disk connect sys/oracle@ldrac1;
allocate channel ch2 device type disk connect sys/oracle@ldrac2;
backup
#incremental level 1 cumulative
database;
release channel ch1;
release channel ch2;

allocate channel ch1 device type disk connect sys/oracle@ldrac1;
allocate channel ch2 device type disk connect sys/oracle@ldrac2;
backup archivelog all delete input;
release channel ch1;
release channel ch2;

allocate channel ch1 device type disk connect sys/oracle@ldrac1;
allocate channel ch2 device type disk connect sys/oracle@ldrac2;
backup format '/oracle/ctl_%U_%T_%D' current controlfile;
release channel ch1;
release channel ch2;
}

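Points to note:
How a cluster's data is backed up depends on the cluster's own environment,
so when assessing a cluster, first look at its storage layout and the type of storage it uses.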