Skip to content

snlinFreeAddrInfo

浅谈ORA-12545 / TNS-12545故障诊断思路

前一个刚经历rac安装完后遭遇的ORA-12545错误,这里就顺便把这个错误的诊断思路理出来,毕竟这个错误在10g后还是比较常见,本文只是对这个故障的处理诊断思路做一些经验上的讨论,纯为有趣。

12545的报错提示为如下:

ORA-12545 / TNS-12545 Connect failed because target host or object does not exist

先往下聊吧,这类错误通常是和hostname相关配置不正确有关,举个例子

[ora10g@ludatou ~]$ tnsping lu10g
TNS Ping Utility for Linux: Version 10.2.0.4.0 - Production on 15-APR-2014 12:08:27
Copyright (c) 1997,  2007, Oracle.  All rights reserved.
Used parameter files:

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = ludatou)(PORT = 1523)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = lu10g)))
OK (10 msec)

[ora10g@ludatou ~]$ sqlplus luda/luda@lu10g
SQL*Plus: Release 10.2.0.4.0 - Production on Tue Apr 15 12:12:15 2014
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

=====这里我的tnsnames配置的解析主机名是ludatou,而且采用网络验证方式登录成功
[ora10g@ludatou ~]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
192.168.102.128         ludatou   luda
=====在这边我把/etc/hosts文件的主机名ludatou变更为ludatouxx,变更结果如下
[ora10g@ludatou ~]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
192.168.102.128         ludatouxx   luda
 

再次在客户端(另外一个主机)远程登录,在响应了很久之后报错

[root@ludatou ~]# su - oracle
[oracle@ludatou ~]$ sqlplus luda/luda@lu10g
SQL*Plus: Release 11.1.0.7.0 - Production on Tue Apr 15 12:13:40 2014
Copyright (c) 1982, 2008, Oracle.  All rights reserved.

ERROR:
ORA-12545 Connect failed because target host or object does not exist
 

下来以分类场景的方式来描述这类故障的诊断:
场景1:
当连接数据库不通过listener的时候,而是使用BEQ协议来连接数据库,这个时候如果$ORACLE_HOME/bin目录下oracle文件丢失或者损坏,用户执行权限不足,以及Oracle_home(多个安装版本)的路径设置错误的情况下,使用beq协议连接oracle的用户就会遭遇ORA/TNS-12545的错误。
关于BEQ协议可以参考如下解释:

This is the bequeth oracle process. The bequeth process starts first. Later all the processes related to particular database are controlled by its bequeth process. 

场景2:
这个场景就是上面我所做的测试场景,由客户端通过listner连接服务端oracle数据库,这个时候如果客户端的tnsnames解析文件中对应service name部分的hostname和服务端的hostname不匹配,就会报错ora-12545.
在这个场景中可以使用nslookup命令来搜索对应的主机,按照上面的例子,这里使用nslookup命令搜索ludatou主机时候则会出现以下情况:

$ nslookup ludatou
Server: 192.168.102.128
Address: 192.168.1.102#53
** server can't find ludatou: ==>Indicates ludatou is not resolvable on the machine

这个场景的解决办法就是把客户端的tnsnames中的hostname部分写为ip,避免服务端主机名变更导致配置不符的情况出现。

场景3:
参考 RAC实施完遭遇ORA-12545

Cause: One of the hostname (which corresponds to public IP or VIP) is not reachable from this client machine.
When the server side load balancing is enabled in the RAC setup, the listener will redirect the connection to the least loaded node.While doing so, the server sends the packet NSPTRD containing the hostname of the corresponding machine.

The Remote Service Handler value registered with the remote Listener process via the REMOTE_LISTENER parameter is built by the LOCAL_LISTENER value on the local server.So, its necessary to check whether the local_listener / remote listener information are reflected properly in the listener services output as well.

Diagnosis:
Enable the oracle sqlnet client tracing at support level, and reproduce the issue.In the generated client trace, you would see the below information:
After NSPTCN Connect to the listener , listener sends

场景4:
listerner.ora的hostname配置错误,导致监听无法找寻到正确的监听地址,也会产生ora/tns-12545的错误。具体监听日志报错如下:

Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=abc)(PORT=1522)))
TNS-12545: Connect failed because target host or object does not exist
TNS-12560: TNS:protocol adapter error
TNS-00515: Connect failed because target host or object does not exist

如果打开了sqlnet的trace跟踪,可以发现类似如下的报错:

nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=abc)(PORT=1522)))
nttbnd2addr: entry
snlinGetAddrInfo: entry
snlinGetAddrInfo: Invalid IP address string abc
snlinFreeAddrInfo: entry
snlinFreeAddrInfo: exit
snlinGetAddrInfo: exit
nttbnd2addr: looking up IP addr for host: abc
snlinGetAddrInfo: entry
snlinGetAddrInfo: Name resolution failed for abc
snlinFreeAddrInfo: entry
snlinFreeAddrInfo: exit
snlinGetAddrInfo: exit
nttbnd2addr: *** hostname lookup failure! ***
nttbnd2addr: exit
nserror: entry
nserror: nsres: id=0, op=78, ns=12545, ns2=12560; nt[0]=515, nt[1]=0, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0

这类问题解决办法就是检测监听的配置是否和主机的ip,hostname配置吻合,不吻合的情况下会造成12545的错误。

场景5:
操作系统或者硬件问题也会导致产生ora/tns-12545的产生。
5.1windows 2000 with Service Pack 3 会导致这个ora/tns-12545 的产生
http://support.microsoft.com/default.aspx?scid=kb;en-us;329405
5.2Solaris系统在配置不正确情况下也会导致ora/tns-12545的产生

以上的场景描述是对1254错误的故障做一些分类分析,大概的诊断思路也是这样。

当然以上只是针对12545,oracle网络的错误有很多,对于诊断oracle网络错误,我个人的习惯诊断流程是如下
1.check listener.log确认是否有报错信息
2.check syslog是否有设备或者系统的报错信息
3.check $ORACLE_HOME/bin/oracle文件是否存在以及权限正确与否
4.check tnsnames,sqlnet,listener的配置是否与主机对应
5.在rac环境下,我会优先检查节点和client的通信情况,网络配置文件是否配置正确,最后才是taf和ld的影响
6.必要时候开启trace进一步跟踪详细信息