跟踪oracle 进程链接建立时候发生的情况
本方法主要使用触发器的方式对新建立的进程自动进行10046的trace并进行分析的方式,期间使用到truss,需要掌握truss的使用方法,具体如下:
创建用户
create user trace_user identified by trace_user account unlock; grant create session,dba to trace_user; |
创建登陆触发器
create or replace trigger sys.set_trace after logon on database when (user = 'trace_user') declare lcommand varchar(200); begin execute immediate 'alter session set statistics_level=all'; execute immediate 'alter session set max_dump_file_size=unlimited'; execute immediate 'alter session set events ''10046 trace name context forever, level 12'''; end set_trace; / |
尝试登陆
sqlplus "trace_user/trace_user" << eof exit; eof |
这将在$ ORACLE_BASE / admin / <db_name> / udump目录下生成(10046)跟踪文件
然后通过查看“e =”部分查看数据库调用(PARSE,BIND,EXEC,FETCH)是否有大量时间,这意味着数据库调用所消耗的持续时间
通过查看“ela =”部分检查等待事件是否有大量时间,这意味着等待事件消耗的持续时间(可能通过进行系统调用)
PARSING IN CURSOR #5 len=131 dep=1 uid=0 oct=3 lid=0 tim=15687899202263 hv=1389591971 ad='ca9a7948' select privilege#,nvl(col#,0),max(mod(nvl(option$,0),2))from objauth$ where obj#=:1 and grantee#=:2 group by privilege#,nvl(col#,0) END OF STMT EXEC #5:c=0,e=145,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=15687899202260 WAIT #5: nam='db file sequential read' ela= 30005 file#=1 block#=24208 blocks=1 obj#=-1 tim=15687899232346 FETCH #5:c=0,e=30075,p=1,cr=2,cu=0,mis=0,r=0,dep=1,og=4,tim=15687899232396 |
没有数据库调用占用大量时间
less oracle_ora_1118270.trc | grep -i ",e=" | cut -d , -f2 > karlarao.txt ; sed -n 's/e=/ /p' karlarao.txt | sort -nr | less 358746 183162 61293 44661 32580 30075 28695 26950 25837 24244 ... |
没有等待事件耗费大量时间
less oracle_ora_1118270.trc | grep -i "ela=" | cut -d " " -f8 | sort -nr | less 30005 28624 13253 11592 9650 |
没有 “os thread startup”事件!
less oracle_ora_1118270.trc | grep -i "ela=" | cut -d "=" -f2 | uniq 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'db file sequential read' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'db file sequential read' ela 'SQL*Net message to client' ela 'db file sequential read' ela 'SQL*Net message from client' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'db file sequential read' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela 'SQL*Net message to client' ela 'SQL*Net message from client' ela |
下来对系统调用进行检测,这里使用truss工具
这个工具可以让你做一些应用程序跟踪,这里是跟踪SQL * Plus,并显示应用程序对外部库和内核的调用
truss -c sqlplus "/ as sysdba" << EOF exit; EOF truss -d sqlplus "/ as sysdba" << EOF exit; EOF |
“-c”开关计算跟踪的系统调用,故障和信号
“-d”开关显示每行的时间戳(以秒为单位)。跟踪输出的第一行将显示测量各个时间戳的基准时间
这里下面可以发现是DNS服务导致sqlplus“/ as sysdba”操作减慢,并且有“os线程启动”等待,下面的文本是“truss”输出的部分,当在/etc/resolv.conf和/ etc / hosts上读取时,“(sleep……)”事件表示减速。
ENOTTY和ECONNREFUSED是看到的错误,这里此文件/usr/include/sys/errno.h上找到有关错误消息的更多详细信息。以下是它们的解释
#define ENOTTY 25 /* Inappropriate I/O control operation */ #define ECONNREFUSED 79 /* Connection refused */ |
根据“truss -d”的记录,可以看到时间信息从2.3379秒(读取/etc/resolv.conf) – > 7.3477秒 – > 17.3489秒 – > 37.3555秒(读数) / etc / hosts) – > 70.3863秒(已连接)
0.3172: gethostname(0x0FFFFFFFFFFF9920, 256) = 0 0.3175: _getpid() = 1102056 0.3177: open("/etc/resolv.conf", O_RDONLY) = 8 0.3180: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY 0.3182: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY kread(8, " d o m a i n this is the domain name, i removed it".., 4096) = 69 kread(8, " d o m a i n this is the domain name, i removed it".., 4096) = 0 0.3190: close(8) = 0 0.3194: socket(2, 2, 0) = 8 0.3197: getsockopt(8, 65535, 4104, 0x0FFFFFFFFFFF70E4, 0x0FFFFFFFFFFF70E0) = 0 0.3199: connext(8, 0x09001000A0018600, 16) = 0 0.3202: send(8, 0x0FFFFFFFFFFF7FF0, 41, 0) = 41 0.3205: _poll(0x0FFFFFFFFFFF7190, 1, 5000) = 1 0.3208: nrecvfrom(8, 0x0FFFFFFFFFFF9320, 1024, 0, 0x0FFFFFFFFFFF7950, 0x0FFFFFFFFFFF7178) Err#79 ECONNREFUSED 0.3210: close(8) = 0 0.3212: socket(2, 2, 0) = 8 0.3215: sendto(8, 0x0FFFFFFFFFFF7FF0, 41, 0, 0x09001000A0018610, 16) = 41 0.3217: _poll(0x0FFFFFFFFFFF7190, 1, 5000) = 1 0.3220: nrecvfrom(8, 0x0FFFFFFFFFFF9320, 1024, 0, 0x0FFFFFFFFFFF7950, 0x0FFFFFFFFFFF7178) = 108 0.3222: close(8) = 0 0.3224: socket(2, 2, 0) = 8 0.3227: getsockopt(8, 65535, 4104, 0x0FFFFFFFFFFF70E4, 0x0FFFFFFFFFFF70E0) = 0 0.3229: connext(8, 0x09001000A0018600, 16) = 0 0.3231: send(8, 0x0FFFFFFFFFFF7FF0, 28, 0) = 28 0.3233: _poll(0x0FFFFFFFFFFF7190, 1, 5000) = 1 0.3236: nrecvfrom(8, 0x0FFFFFFFFFFF9320, 1024, 0, 0x0FFFFFFFFFFF7950, 0x0FFFFFFFFFFF7178) Err#79 ECONNREFUSED 0.3238: close(8) = 0 0.3240: socket(2, 2, 0) = 8 0.3243: sendto(8, 0x0FFFFFFFFFFF7FF0, 28, 0, 0x09001000A0018610, 16) = 28 0.3245: _poll(0x0FFFFFFFFFFF7190, 1, 5000) = 1 0.3248: nrecvfrom(8, 0x0FFFFFFFFFFF9320, 1024, 0, 0x0FFFFFFFFFFF7950, 0x0FFFFFFFFFFF7178) = 28 0.3250: close(8) = 0 0.3252: socket(2, 2, 0) = 8 0.3373: sendto(8, 0x0FFFFFFFFFFF7FF0, 28, 0, 0x09001000A0018600, 16) = 28 2.3379: _poll(0x0FFFFFFFFFFF7190, 1, 5000) (sleeping...) 2.3379: _poll(0x0FFFFFFFFFFF7190, 1, 5000) = 0 5.3468: close(8) = 0 5.3470: socket(2, 2, 0) = 8 5.3472: sendto(8, 0x0FFFFFFFFFFF7FF0, 28, 0, 0x09001000A0018600, 16) = 28 7.3477: _poll(0x0FFFFFFFFFFF7190, 1, 10000) (sleeping...) 7.3477: _poll(0x0FFFFFFFFFFF7190, 1, 10000) = 0 15.3479: close(8) = 0 15.3482: socket(2, 2, 0) = 8 15.3484: sendto(8, 0x0FFFFFFFFFFF7FF0, 28, 0, 0x09001000A0018600, 16) = 28 17.3489: _poll(0x0FFFFFFFFFFF7190, 1, 20000) (sleeping...) 17.3489: _poll(0x0FFFFFFFFFFF7190, 1, 20000) = 0 35.3491: close(8) = 0 35.3495: getdomainname(0x0FFFFFFFFFFF9480, 256) = 0 35.3497: open("/etc/hosts", O_RDONLY) = 8 35.3500: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY 35.3502: kfcntl(8, F_SETFD, 0x0000000000000001) = 0 35.3505: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY kread(8, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 3453 kread(8, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 0 35.3514: close(8) = 0 kwrite(9, "01 �0603 s".., 432) = 432 37.3555: kread(10, "16060280".., 2064) (sleeping...) kread(10, "06 E06\b".., 2064) = 1605 70.3762: open("/oracle/app/product/10.2/db_1/rdbms/mesg/oraus.msb", O_RDONLY) = 8 70.3764: kfcntl(8, F_SETFD, 0x0000000000000001) = 0 70.3767: lseek(8, 0, 0) = 0 kread(8, "1513 "011303\t\t".., 256) = 256 70.3772: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 70.3776: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 70.3781: lseek(8, 50688, 0) = 50688 kread(8, "\f05 ] P05 ^".., 512) = 512 70.3786: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 70.3790: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 70.3795: lseek(8, 50688, 0) = 50688 kread(8, "\f05 ] P05 ^".., 512) = 512 70.3799: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 70.3804: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 70.3808: lseek(8, 50688, 0) = 50688 kread(8, "\f05 ] P05 ^".., 512) = 512 70.3813: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 70.3817: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 70.3822: lseek(8, 51712, 0) = 51712 kread(8, "\n0589 D058A".., 512) = 512 70.3827: close(8) = 0 kwrite(9, " <0611 k".., 60) = 60 kread(10, " �06\b".., 2064) = 179 kwrite(9, " U0603 h".., 85) = 85 kread(10, "1606\b".., 2064) = 22 kwrite(9, " U0603 h".., 85) = 85 kread(10, "1606\b".., 2064) = 22 70.3854: lseek(4, 512, 0) = 512 kread(4, "17 �".., 512) = 512 70.3858: lseek(4, 1024, 0) = 1024 kread(4, "16 * R h819E".., 512) = 512 70.3863: lseek(4, 4096, 0) = 4096 kread(4, "\f82 P86".., 512) = 512 kwrite(1, "\n", 1) = 1 Connected to: kwrite(1, " C o n n e c t e d t o".., 14) = 14 Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production kwrite(1, " O r a c l e D a t a b".., 77) = 77 With the Partitioning, OLAP, Data Mining and Real Application Testing options kwrite(1, " W i t h t h e P a r".., 78) = 78 kwrite(1, "\n", 1) = 1 70.3882: kfcntl(1, F_GETFL, 0x0000000000000008) = 2 70.3886: __libc_sbrk(0x0000000000030020) = 0x000000001023C880 70.3892: access("login.sql", 0) Err#2 ENOENT 70.3895: access("/oracle/app/product/10.2/db_1/sqlplus/admin/glogin.sql", 0) = 0 70.3898: statfs("/oracle/app/product/10.2/db_1/sqlplus/admin/glogin.sql", 0x0FFFFFFFFFFFDA10) = 0 |
事实证明确实是dns出现了问题,在调整后迅速恢复了响应时间
3.0713: gethostname(0x0FFFFFFFFFFF9900, 256) = 0 3.0716: _getpid() = 2539762 3.0719: open("/etc/resolv.conf", O_RDONLY) = 8 3.0721: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY 3.0724: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY kread(8, " d o m a i n this is the domain name, i removed it".., 4096) = 70 kread(8, " d o m a i n this is the domain name, i removed it".., 4096) = 0 3.0731: close(8) = 0 3.0735: socket(2, 2, 0) = 8 3.0740: getsockopt(8, 65535, 4104, 0x0FFFFFFFFFFF70C4, 0x0FFFFFFFFFFF70C0) = 0 3.0742: connext(8, 0x09001000A0018600, 16) = 0 3.0746: send(8, 0x0FFFFFFFFFFF7FD0, 41, 0) = 41 3.0750: _poll(0x0FFFFFFFFFFF7170, 1, 5000) = 1 3.0753: nrecvfrom(8, 0x0FFFFFFFFFFF9300, 1024, 0, 0x0FFFFFFFFFFF7930, 0x0FFFFFFFFFFF7158) = 108 3.0756: send(8, 0x0FFFFFFFFFFF7FD0, 28, 0) = 28 3.0758: _poll(0x0FFFFFFFFFFF7170, 1, 5000) = 1 3.0761: nrecvfrom(8, 0x0FFFFFFFFFFF9300, 1024, 0, 0x0FFFFFFFFFFF7930, 0x0FFFFFFFFFFF7158) = 28 3.0764: close(8) = 0 3.0767: getdomainname(0x0FFFFFFFFFFF9460, 256) = 0 3.0769: open("/etc/hosts", O_RDONLY) = 8 3.0771: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY 3.0774: kfcntl(8, F_SETFD, 0x0000000000000001) = 0 3.0777: kioctl(8, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY kread(8, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 3453 kread(8, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 0 3.0786: close(8) = 0 kwrite(9, "01 �0603 s".., 432) = 432 kread(10, "06 H06\b".., 2064) = 1608 3.0973: open("/oracle/app/product/10.2/db_1/rdbms/mesg/oraus.msb", O_RDONLY) = 8 3.0975: kfcntl(8, F_SETFD, 0x0000000000000001) = 0 3.0978: lseek(8, 0, 0) = 0 kread(8, "1513 "011303\t\t".., 256) = 256 3.0983: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 3.0988: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 3.0993: lseek(8, 50688, 0) = 50688 kread(8, "\f05 ] P05 ^".., 512) = 512 3.0997: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 3.1002: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 3.1006: lseek(8, 50688, 0) = 50688 kread(8, "\f05 ] P05 ^".., 512) = 512 3.1011: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 3.1015: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 3.1021: lseek(8, 50688, 0) = 50688 kread(8, "\f05 ] P05 ^".., 512) = 512 3.1025: lseek(8, 512, 0) = 512 kread(8, "1A Q 5 C [ V u �85 �9480".., 512) = 512 3.1029: lseek(8, 1024, 0) = 1024 kread(8, "18 $ 4 = G S".., 512) = 512 3.1034: lseek(8, 51712, 0) = 51712 kread(8, "\n0589 D058A".., 512) = 512 3.1038: close(8) = 0 kwrite(9, " <0611 k".., 60) = 60 kread(10, " �06\b".., 2064) = 179 kwrite(9, " U0603 h".., 85) = 85 kread(10, "1606\b".., 2064) = 22 kwrite(9, " U0603 h".., 85) = 85 kread(10, "1606\b".., 2064) = 22 3.1064: lseek(4, 512, 0) = 512 kread(4, "17 �".., 512) = 512 3.1069: lseek(4, 1024, 0) = 1024 kread(4, "16 * R h819E".., 512) = 512 3.1073: lseek(4, 4096, 0) = 4096 kread(4, "\f82 P86".., 512) = 512 kwrite(1, "\n", 1) = 1 Connected to: kwrite(1, " C o n n e c t e d t o".., 14) = 14 Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production kwrite(1, " O r a c l e D a t a b".., 77) = 77 With the Partitioning, OLAP, Data Mining and Real Application Testing options kwrite(1, " W i t h t h e P a r".., 78) = 78 kwrite(1, "\n", 1) = 1 3.1095: kfcntl(1, F_GETFL, 0x0000000000000008) = 2 3.1099: __libc_sbrk(0x0000000000030020) = 0x000000001023C880 3.1105: access("login.sql", 0) Err#2 ENOENT 3.1109: access("/oracle/app/product/10.2/db_1/sqlplus/admin/glogin.sql", 0) = 0 3.1112: statfs("/oracle/app/product/10.2/db_1/sqlplus/admin/glogin.sql", 0x0FFFFFFFFFFFD9F0) = 0 |