Skip to content

Exadata x6 遭遇大量的 TCP socket 等待事件

客户从IBM POWER迁移到EXADATA X6后遭遇大量的TCP SOCKET等待,该等待开销了大约15%的cpu资源, 系统运行未受影响,待过一段时间后,系统cpu降低,该等待减少.

经过分析客户大量使用了utl相关东西,网络延时不稳定. 该等待事件与HTTP和网络的延时有关.

和power上历史允许的性能数据对比发现历史也存在TCP SOCKET在top 5里.

如下图:

历史小机运行信息:

IMG_3634

x6运行信息:

IMG_3632

 

X6网络情况:

IMG_3633

官方给予的解释:数据库层面不用理会,关注http和网络相关情况.

“TCP Socket (KGAS)” Reference Note

This is a reference note for the wait event “TCP Socket (KGAS)” which includes the following subsections:

See Note:61998.1 for an introduction to Wait Events.

Definition:

  • Versions:10.2 – Documentation: None
  • KGAS is a component in the server which handles TCP/IP sockets which is typically used in dedicated connections in 10.2+ by some PLSQL built in packages such as UTL_HTTP and UTL_TCP.
    The KGAS interface is not involved in client / server communication but is a layer which may be used when a session on the server makes some outbound TCP/IP call using a PLSQL package such as UTL_HTTP. Packages such as UTL_FILE have also reported this wait when making an SMTP call.
  • Note that this wait event is new in 10gR2. Earlier versions of Oracle doing the same operations would still wait inside KGAS for socket operations but such waits were not instrumented and so did not show up as waits.

Individual Waits:

 Parameters:	
    P1 = operation
    P2 = Not used
    P3 = Not used

	operation   P1 gives an indication of what 
		    operation is being performed. 
		    Possible values include:
			 1 Event Post
			 2 Call
			 3 Disconnect
			 4 Receive data
			 5 Send data
			 6 Wait for an event (eg wait for data to arrive)
			 7 Sleep
			 8 Special wait (for single object)
			 9 Special wait (for multiple objects)
			10 Select operation
                

 Wait Time:	
    The wait blocks until the current operation completes
    (or times out / errors as appropriate).

Systemwide Waits:

If the TIME spent waiting for this event is significant then it is best to determine which sessions are showing the wait and drill into what those sessions are doing as the wait is usually related to whatever application code is executing .
eg: What part of the application may be using UTL_HTTP or similar and is experiencing waits. This statement can be used to see which sessions may be worth tracing:

  SELECT sid, total_waits, time_waited
    FROM v$session_event
   WHERE event='TCP Socket (KGAS)'
     and total_waits>0
   ORDER BY 3,2
  ;

Reducing Waits / Wait times:

The waits incurred depend on what sockets are being opened to which remote end points and for what reason. To help find the origin of the socket operations try:

  • Check the current SQL / module / action of V$SESSION for sessions that are waiting on the event at the time that they are waiting to try and identify any common area of application code waiting on the event.
  • Get an ERRORSTACK level 3 dump of some sessions waiting on the event. This should help show the exact PLSQL and C call stacks invoking the socket operation if the dump is taken when the session is waiting. Customers may need assistance from Oracle Support in order to get and interpret such a dump but it can help pinpoint the relevant application code.
  • Trace sessions incurring the waits including wait tracing to try and place the waits in the context of the code executing around the waits. eg: Use event 10046 level 8 or DBMS_MONITOR.SESSION_TRACE_ENABLE.
  • Use DBA_DEPENDENCIES to find any application packages which may ultimately be using UTL_HTTP or UTL_TCP underneath for some operation.

Note that there are no real tunables within Oracle for these waits as they involve the session making a call to some remote TCP/IP socket and typically waiting on data from that source. Once you know what is being called, and why, then you can determine if the response times from that remote source are sensible or not and if not why.Example:
Execute the following SQL from a session on a dedicated connection and then check the resulting trace file to see “TCP Socket (KGAS)” waits:

alter session set events '10046 trace name context forever, level 8';
select utl_http.request('http://www.oracle.com/') from dual;

Related:

Tracing User sessions Note:404833.1