最近在好几个项目上遭遇LCBC无外乎都是CPU异常导致,先把这方面官方诊断的文章共享出来,后面描述一些极端场景的案例。
If you have high contention, you need to look at the statements that perform the most buffer gets and then look at their access paths to determine whether these are performing as efficiently as you would like.
Typical solutions are:-
- Look for SQL that accesses the blocks in question and determine if the repeated reads are necessary. This may be within a single session or across multiple sessions.
- Check for suboptimal SQL (this is the most common cause of the events) – look at the execution plan for the SQL being run and try to reduce the gets per executions which will minimize the number of blocks being accessed and therefore reduce the chances of multiple sessions contending for the same block.
Further information can be found in:
Note:163424.1 How To Identify a Hot Block Within The Database Buffer Cache.
Note:62172.1 Understanding and Tuning Buffer Cache and DBWR (Doc ID 62172.1)
Worked example:
Problem: Database is slow and ‘latch: cache buffers chains’ is high in the waits in AWR.
Start with Top 5 Waits:
High cache buffers chains latch indicates that there is likely to be something reading a lot of buffers. Typically the SQL with the most gets is likely to be that which is contending:
The Query with SQL_ID a9nchgksux6x2 is reading 100x more buffers than the 2nd most ‘hungry’ statement and CPU and Elapsed are off the ‘scale’ of the report. This is a prime candidate for the cause of the CBC latch issues.
You can also link this information to the Top Segments by Logical Reads:
The top object read is SALES and the top SQL is a select from SALES which appears to correlate towards this being a potential problem select.
This SQL should be investigated to see if the Gets per Exec or the Executions figure per hour has changed in any way (comparison to previous reports would show this) and if so the reasons for that change investigated and resolved.
In this case the statement is reading > 10,000 buffers per execution and executing > 15,000 times
so both of these may need to be adjusted to get better performance.
So, if the first SQL is not the culprit then look at the others.