xifenfei 发表于 2014-3-17 16:29:20

log buffer space 等待事件

难得的一个机会,看到一份awr报告中log buffer space非常大,对其分析结论和大家分享

AWR分析
DB NameDB IdInstanceInst numReleaseRACHost
XIFENFEI
1794962186xifenfei
110.2.0.4.0
NO
xff



Snap IdSnap TimeSessionsCursors/Session
Begin Snap:
1507606-Aug-1308:00:46102.5
End Snap:
1507706-Aug-1309:00:301651.1
Elapsed:

59.73(mins)


DB Time:

1,667.67 (mins)


这里可以看出来,对于一个只有8C,的服务器来说相对是比较繁忙的系统Cache Sizes

BeginEnd
Buffer Cache:
8,192M8,192MStd Block Size:
8K
Shared Pool Size:
1,872M1,872MLog Buffer:
14,328K
这里显示log buffer只有14MLoad Profile

PerSecondPerTransaction
Redo size:
5,550,364.905,541,052.96
Logical reads:
190,349.81190,030.46
Block changes:
34,206.7334,149.34
Physical reads:
1,142.051,140.14
Physical writes:
826.48825.10
User calls:
409.66408.97
Parses:
107.70107.52
Hard parses:
11.8511.83
Sorts:
24.8124.77
Logons:
0.020.02
Executes:
103.00102.83
Transactions:
1.00

Redo的量对于这样的系统来说,算是比较多,而且他的redo都是大事物(平均每秒一个事务,每个事务5M的redo),和他这里操作的业务类型有关系,大量的insert into select操作Top 5 Timed Events

EventWaitsTime(s)AvgWait(ms)% TotalCall TimeWaitClass
log bufferspace
60,46539,98366140.0Configuration

db file sequential read
1,217,73118,7801518.8User I/O

buffer busy waits
25,41311,35544711.3Concurrency

db file parallel write
24,0937,2413017.2System I/O

CPU time
6,679 6.7

Top 5中log buffer space占据了40%,而且平均等待661ms,确实属于异常,在3600秒钟,他总的等待占39983,近11个逻辑cpu被它占据,实属不正常情况。

Background Wait Events

EventWaits%Time -outsTotal Wait Time(s)Avg wait (ms)Waits /txn
db file parallelwrite24,1100.007,2493016.72
log file parallelwrite5,8330.002,8804941.62
control fileparallel write2,4240.00163670.68
log buffer space27132.101495510.08
control filesequential read7,9960.0094122.23
db filesequential read2,3180.0064270.65
这里看到后台进程的等待中,主要是write相关的等待,而且log file parallel write写的等待平均在494ms,绝对不属于正常情况,这样慢的redo buffer写到redo log中,而数据库产生的redo量也是比较大,从而出现了redo buffer写出的速度没有写入的速度快,从而出现大量log buffer space等待

Tablespace IO Stats

TablespaceReadsAv Rd(ms)Av Blks/RdWritesBuffer WaitsAv Buf Wt(ms)
UNDOTBS113,24733.391.001,379,56923,798454.02
NNC_DATA03907,13712.963.4069,73319,79670.59
NNC_INDEX03408,93415.461.0049,89928,12618.34
NNC_DATA0299,46218.431.0112,81710,05721.13
NNC_INDEX0124,02610.611.001,24700.00
NNC_DATA0110,40623.4918.029939627.78
NNC_INDEX026,68925.341.015300.00
TEMP1,31792.5813.015,04300.00
SYSTEM5,24617.091.6132933.33
SYSAUX65230.921.0542500.00
USERS11028.361.0010600.00
这里的平均读取block 时间很多都超过20ms,而写的平均时间有些在627ms,这个太离谱了,也证明了io的读写性能也不给力。

Instance Activity Stats - Thread Activity

StatisticTotalper Hour
log switches(derived)5656.25
Redo切换的也有点过于频繁,证明客户的redolog size设置不合理

分析结论客户的erp系统因为中间表的insert into select操作产生大量redo,加上客户的datafile 和redo logfile都存放在同一个raid 5之上,导致redo产生的速度比redo 写入到logfile的速度快,从而出现大量log buffer space等待。针对该客户的问题,大概解决思路:
1.      增加redo buffer大小,例如64M(希望能够缓解部分log buffer space)2.      增加redo size到合理值,最好能够保证redo几分钟到30分钟切换一次3.      把redo log和datafile分离开,redo log不要存放在raid 5之上4.      优化业务逻辑,减少erp系统中间表产生的redo 量5.      使用性能更加高的存储设备

LogBuffer Space分析思路ProblemConfirmation:1.      The time spent actively in thelocal database is significant2.      Only certain sessions, queriesor jobs are experiencing slowness (not throughout the database)3.      Log_buffer Parameter<redo size per sec * 600'4.      'log buffer space' waits are asignificant component of the overall DB Time"Log Buffer Space"Wait for space in the log buffer because we are writing into the logbuffer faster than lgwr can write it out.Reducing Waits1.      Consider making the log bufferbigger if it is small2.      Moving the log files to fasterdisks3.      Increase size of redo logs

xifenfei 发表于 2014-3-17 16:31:46

整理的pdf见附件:
页: [1]
查看完整版本: log buffer space 等待事件