6/12/2017 AIX disk queue depth tuning for performance - unixadmin.free.fr

UNIXADMIN.FREE.FR
JUST ANOTHER IBM BLOG AND TECHNOTES BACKUP

AIX disk queue depth tuning for performance

APR/11
Source: IBMLINK

Purpose

The purpose of this document is to describe how IOs are queued with SDD, SDDPCM, the disk device driver and the adapter device driver, and to explain how these can be tuned to increase performance. This information is also useful for non-SDD or SDDPCM systems.

Where this stuff fits in the IO stack

Following is the IO stack from the application to the disk:

Application
File system (optional)
LVM (optional)
SDD or SDDPCM or other multi-path driver (if used)
hdisk device driver
adapter device driver
interconnect to the disk
Disk subsystem
Disk

Note that even though the disk is attached to the adapter, the hdisk driver code is utilized before the adapter driver code. So this stack represents the order in which software comes into play over time as the IO traverses the stack.

Why do we need to simultaneously submit more than one IO to a disk?

This improves performance, and specifically performance from an application's point of view. This is especially important with disk subsystems where a virtual disk (or LUN) is backed by multiple physical disks. In such a situation, if we could only submit a single IO at a time, we'd find we get good IO service times, but very poor throughput. Submitting multiple IOs to a physical disk allows the disk to minimize actuator movement (using an "elevator" algorithm) and get more IOPS than is possible by submitting one IO at a time. The elevator analogy is appropriate. How long would people be waiting to use an elevator if only one person at a time could get on it? In such a situation, we'd expect that people would wait quite a while to use the elevator (queueing time), but once they got on it, they'd get to their destination quickly (service time).

Where are IOs queued?

As IOs traverse the IO stack, AIX needs to keep track of them at each layer. So IOs are essentially queued at each layer in the IO stack. Generally, some number of in-flight IOs may be issued at each layer and if the number of IO requests exceeds that number, they reside in a wait queue until the required resource becomes available. So there is essentially an "in process" queue and a "wait" queue at each layer (SDD and SDDPCM are a little more complicated).
At the file system layer, file system buffers limit the maximum number of in-flight IOs for each file system. At the LVM layer, hdisk buffers limit the number of in-flight IOs. At the SDD layer, IOs are queued if the dpo device's attribute, qdepth_enable, is set to yes (which it is by default). Some releases of SDD do not queue IOs, so it depends on the release of SDD. SDDPCM, on the other hand, does not queue IOs before sending them to the disk device driver. Each hdisk has a maximum number of in-flight IOs that's specified by its queue_depth attribute. And FC adapters also have a maximum number of in-flight IOs specified by num_cmd_elems. The disk subsystems themselves queue IOs and individual disks can accept multiple IO requests. Here are an ESS hdisk's attributes:

The default queue_depth is 20, but can be changed to as high as 256 for ESS, DS6000 and DS8000.
http://unixadmin.free.fr/?p=123 1/5

Here are an FC adapter's attributes:

The default queue depth (num_cmd_elems) for FC adapters is 200 but can be increased up to 2048.

Here are the dpo device's attributes for one release of SDD:

When qdepth_enable=yes, SDD will only submit queue_depth IOs to any underlying hdisk (where queue_depth here is the value for the underlying hdisk's queue_depth attribute). When qdepth_enable=no, SDD just passes on the IOs directly to the hdisk driver. So the difference is, if qdepth_enable=yes (the default), IOs exceeding the queue_depth will queue at SDD, and if qdepth_enable=no, then IOs exceeding the queue_depth will queue in the hdisk's wait queue. In other words, SDD with qdepth_enable=no and SDDPCM do not queue IOs and instead just pass them to the hdisk drivers. Note that at SDD 1.6, it's preferable to use the datapath command to change qdepth_enable, rather than using chdev, as then it's a dynamic change, e.g., datapath set qdepth disable will set it to no. Some releases of SDD don't include SDD queueing, and some do, and some releases don't show the qdepth_enable attribute. Either check the manual for your version of SDD or try the datapath command to see if it supports turning this feature off.

If you've used both SDD and SDDPCM, you'll remember that with SDD, each LUN has a corresponding vpath and an hdisk for each path to the vpath or LUN. And with SDDPCM, you just have one hdisk per LUN. Thus, with SDD one can submit queue_depth x # paths to a LUN, while with SDDPCM, one can only submit queue_depth IOs to the LUN. If you switch from SDD using 4 paths to SDDPCM, then you'd want to set the SDDPCM hdisks to 4x that of SDD hdisks for an equivalent effective queue depth. And migrating to SDDPCM is recommended as it's more strategic than SDD.
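
The arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the cap of 256 is the hdisk queue_depth limit quoted earlier for ESS, DS6000 and DS8000, which may not apply to other disk subsystems.

```python
def effective_queue_depth(queue_depth: int, paths: int, driver: str) -> int:
    """Maximum in-flight IOs to one LUN, following the rule in the text.

    With SDD there is one hdisk per path, so the limit is queue_depth x paths.
    With SDDPCM there is a single hdisk per LUN, so the limit is queue_depth.
    """
    if driver == "SDD":
        return queue_depth * paths
    if driver == "SDDPCM":
        return queue_depth
    raise ValueError(f"unknown multipath driver: {driver}")


def sddpcm_queue_depth_for_equivalence(sdd_queue_depth: int, paths: int,
                                       max_queue_depth: int = 256) -> int:
    """queue_depth to set on the SDDPCM hdisk to match the old SDD setup.

    max_queue_depth=256 is an assumption taken from the ESS/DS6000/DS8000
    limit mentioned in the text.
    """
    return min(sdd_queue_depth * paths, max_queue_depth)


if __name__ == "__main__":
    # Default queue_depth of 20 over 4 SDD paths.
    print(effective_queue_depth(20, 4, "SDD"))
    print(sddpcm_queue_depth_for_equivalence(20, 4))
```

With the default queue_depth of 20 and 4 paths, both calls give 80, which is the "4x" setting described above.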

Both the hdisk and adapter drivers have "in process" and "wait" queues. Once the queue limit is reached, the IOs wait until an IO completes, freeing up a slot in the service queue. The in process queue is also sometimes referred to as the "service" queue.

It's worth mentioning that many applications will not generate many in-flight IOs, especially single threaded applications that don't use asynchronous IO. Applications that use asynchronous IO are likely to generate more in-flight IOs.

What tools are available to monitor the queues?

For AIX, one can use iostat (at AIX 5.3 or later) and sar (5.1 or later) to monitor some of the queues. The iostat -D command generates output such as:

Here, the avgwqsz is the average wait queue size, and avgsqsz is the average service queue size. The average time spent in the wait queue is avgtime. The sqfull value has changed from initially being a count of the times we've submitted an IO to a full queue, to now where it's the rate of IOs submitted to a full queue. The example report shows the prior case (a count of IOs submitted to a full queue), while newer releases typically show decimal fractions indicating a rate. It's nice that iostat -D separates reads and writes, as we would expect the IO service times to be different when we have a disk subsystem with cache. The most useful report for tuning is just running "iostat -D" which shows statistics since system boot, assuming the system is configured to continuously maintain disk IO history (run # lsattr -El sys0, or smitty chgsys to see if the iostat attribute is set to true).

The sar -d command changed at AIX 5.3, and generates output such as:

The avwait and avserv are the average times spent in the wait queue and service queue respectively. And avserv here would correspond to avgserv in the iostat output. The avque value changed at AIX 5.3: it now represents the average number of IOs in the wait queue, while prior to 5.3 it represented the average number of IOs in the service queue.

SDD provides the "datapath query devstats" and "datapath query adaptstats" commands to show hdisk and adapter queue statistics. SDDPCM similarly has "pcmpath query devstats" and "pcmpath query adaptstats". You can refer to the SDD manual for syntax, options and explanations of all the fields. Here's some devstats output for a single LUN:

Here, we're mainly interested in the Maximum field which indicates the maximum number of IOs submitted to the device since system boot. Note that Maximum for devstats will not exceed queue_depth x # paths for SDD when qdepth_enable=yes. But Maximum for adaptstats can exceed num_cmd_elems as it represents the maximum number of IOs submitted to the adapter driver and includes IOs for both the service and wait queues. If, in this case, we have 2 paths and are using the default queue_depth of 20, then the 40 indicates we've filled the queue at least once and increasing queue_depth can help performance. For SDDPCM, if the Maximum value equals the hdisk's queue_depth, then the hdisk driver queue was filled during the interval, and increasing queue_depth is usually appropriate.

One can similarly monitor adapter queues and IOPS: for adapter IOPS, run # iostat -at <# of intervals> and for adapter queue information, run # iostat -aD, optionally with an interval and number of intervals.

How to tune

First, one should not indiscriminately just increase these values. It's possible to overload the disk subsystem or cause problems with device configuration at boot. So the approach of adding up the hdisks' queue_depths and using that to determine the num_cmd_elems isn't wise. Instead, it's better to use the maximum IOs to each device for tuning. When you increase the queue_depths and the number of in-flight IOs that are sent to the disk subsystem, the IO service times are likely to increase, but throughput will increase as well. If IO service times start approaching the disk timeout value, then you're submitting more IOs than the disk subsystem can handle. If you start seeing IO timeouts and errors in the error log indicating problems completing IOs, then this is the time to look for hardware problems or to make the pipe smaller.

A good general rule for tuning queue_depths is that one can increase queue_depths until IO service times start exceeding 15 ms for small random reads or writes, or until one isn't filling the queues. Once IO service times start increasing, we've pushed the bottleneck from the AIX disk and adapter queues to the disk subsystem. Two approaches to tuning queue depth are 1) use your application and tune the queues from that or 2) use a test tool to see what the disk subsystem can handle and tune the queues from that. The ndisk tool (part of the nstress package available on the internet at http://www-941.ibm.com/collaboration/wiki/display/WikiPtype/nstress) can be used to stress the disk subsystem to see what it can handle. The author's preference is to tune based on your application IO requirements, especially when the disk is shared with other servers.

Caches will affect your IO service times and testing results. Read cache hit rates typically increase the second time you run a test and affect repeatability of the results. Write cache helps performance until, and if, the write caches fill up, at which time performance goes down, so longer running tests with high write rates can show a drop in performance over time. For read cache either prime the cache (preferably) or flush the cache. And for write caches, consider monitoring the cache to see if it fills up and run your tests long enough to see if the cache continues to fill up faster than the data can be offloaded to disk. Another issue when tuning and using shared disk subsystems is that IO from the other servers will also affect repeatability.

Examining the devstats, if you see that for SDD, the Maximum field = queue_depth x # paths and qdepth_enable=yes, then this indicates that increasing the queue_depth for the hdisks may help performance; at least the IOs will queue on the disk subsystem rather than in AIX. It's reasonable to increase queue depths about 50% at a time.
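
As a rough illustration of this rule of thumb, the sketch below combines the "queue was filled" check, the 15 ms service time guideline and the 50% step into one decision. The function name and parameters are hypothetical, the inputs are values you would read yourself from devstats and iostat -D, and the limit of 256 is the ESS/DS6000/DS8000 maximum quoted earlier.

```python
def next_queue_depth(current: int, devstats_maximum: int, avg_service_ms: float,
                     paths: int = 1, limit: int = 256) -> int:
    """Suggest the next hdisk queue_depth to try.

    current          -- current queue_depth of the hdisk(s)
    devstats_maximum -- Maximum field from devstats for the LUN
    avg_service_ms   -- observed average IO service time in milliseconds
    paths            -- number of paths (use 1 for SDDPCM, one hdisk per LUN)
    limit            -- assumed upper bound for queue_depth
    """
    queue_was_filled = devstats_maximum >= current * paths
    service_time_ok = avg_service_ms <= 15.0
    if not queue_was_filled or not service_time_ok:
        # Either the queue is not the bottleneck or the disk subsystem is
        # already slowing down, so leave the value alone.
        return current
    # Raise by about 50%, rounding up, without exceeding the limit.
    return min(limit, current + (current + 1) // 2)


if __name__ == "__main__":
    # 2 paths, default queue_depth 20, Maximum of 40, 5 ms service time.
    print(next_queue_depth(20, 40, 5.0, paths=2))
```

After each change, the statistics would be collected again during a peak IO period before deciding on a further step.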

Regarding the qdepth_enable parameter, the default is yes, which essentially has SDD handling the IOs beyond queue_depth for the underlying hdisks. Setting it to no results in the hdisk device driver handling them in its wait queue. In other words, with qdepth_enable=yes, SDD handles the wait queue, otherwise the hdisk device driver handles the wait queue. There are error handling benefits to allowing SDD to handle these IOs, e.g., if using LVM mirroring across two ESSs. With heavy IO loads and a lot of queueing in SDD (when qdepth_enable=yes), it's more efficient to set qdepth_enable=no so that the hdisk device drivers handle relatively shorter wait queues rather than SDD handling one very long wait queue. This is because SDD's queue handling is single threaded, whereas there's a thread for handling each hdisk's queue. So if error handling is of primary importance (e.g. when LVM mirroring across disk subsystems) then leave qdepth_enable=yes. Otherwise, setting qdepth_enable=no more efficiently handles the wait queues when they are long. Note that one should set the qdepth_enable parameter via the datapath command as it's a dynamic change that way (using chdev is not dynamic for this parameter).

If error handling is of concern, then it's also advisable, assuming the disk is SAN switch attached, to set the fscsi device attribute fc_err_recov to fast_fail rather than the default of delayed_fail. And if making that change, I also recommend changing the fscsi device dyntrk attribute to yes rather than the default of no. These attributes assume a SAN switch that supports this feature.

For the adapters, look at the adaptstats column. And set num_cmd_elems = Maximum or 200, whichever is greater. Unlike devstats with qdepth_enable=yes, Maximum for adaptstats can exceed num_cmd_elems.
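
The adapter rule above amounts to a one-line calculation. The following Python sketch uses a hypothetical function name. Capping the result at 2048 is an assumption based on the num_cmd_elems maximum quoted earlier in this document, and the limit on a given adapter may differ.

```python
def suggested_num_cmd_elems(adaptstats_maximum: int, default: int = 200,
                            upper_limit: int = 2048) -> int:
    """num_cmd_elems = Maximum or the default of 200, whichever is greater.

    adaptstats_maximum -- Maximum field from adaptstats for the adapter
    upper_limit        -- assumed ceiling for num_cmd_elems (2048 in the text)
    """
    return min(upper_limit, max(adaptstats_maximum, default))


if __name__ == "__main__":
    for observed in (150, 512, 5000):
        print(observed, suggested_num_cmd_elems(observed))
```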

Then after running your application during peak IO periods look at the statistics and tune again.

It's also reasonable to use the iostat -D command or sar -d to provide an indication if the queue_depths need to be increased.

The downside of setting queue depths too high is that the disk subsystem won't be able to handle the IO requests in a timely fashion, and may even reject the IO or just ignore it. This can result in an IO timeout, and IO error recovery code will be called. This isn't a desirable situation, as the CPU ends up doing more work to handle IOs than necessary. If the IO eventually fails, then this can lead to an application crash or worse.


Queue depths with VIO

When using VIO, one configures VSCSI adapters (for each virtual adapter in a VIOS, known as a vhost device, there will be a matching VSCSI adapter in a VIOC). These adapters have a fixed queue depth that varies depending on how many VSCSI LUNs are configured for the adapter. There are 512 command elements of which 2 are used by the adapter, 3 are reserved for each VSCSI LUN for error recovery and the rest are used for IO requests. Thus, with the default queue_depth of 3 for VSCSI LUNs, that allows for up to 85 LUNs to use an adapter: (512 - 2) / (3 + 3) = 85 rounding down. So if we need higher queue depths for the devices, then the number of LUNs per adapter is reduced. E.g., if we want to use a queue_depth of 25, that allows 510 / 28 = 18 LUNs. We can configure multiple VSCSI adapters to handle many LUNs with high queue depths, each requiring additional memory. One may have more than one VSCSI adapter on a VIOC connected to the same VIOS if more bandwidth is needed.

Also, one should set the queue_depth attribute on the VIOC's hdisk to match that of the mapped hdisk's queue_depth on the VIOS.

For a formula, the maximum number of LUNs per virtual SCSI adapter (vhost on the VIOS or vscsi on the VIOC) is = INT(510 / (Q + 3)) where Q is the queue_depth of all the LUNs (assuming they are all the same).
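
The formula can be checked against the two worked figures above (85 LUNs at queue_depth 3, 18 LUNs at queue_depth 25) with a short Python sketch. The function and constant names are hypothetical. The second function simply inverts the formula and is an addition for illustration, not something the source states.

```python
TOTAL_COMMAND_ELEMENTS = 512   # per VSCSI adapter, from the text
USED_BY_ADAPTER = 2            # command elements used by the adapter itself
RESERVED_PER_LUN = 3           # reserved per VSCSI LUN for error recovery


def max_luns_per_vscsi_adapter(queue_depth: int) -> int:
    """INT(510 / (Q + 3)), with Q the common queue_depth of all LUNs."""
    usable = TOTAL_COMMAND_ELEMENTS - USED_BY_ADAPTER
    return usable // (queue_depth + RESERVED_PER_LUN)


def max_queue_depth_for_luns(lun_count: int) -> int:
    """Largest common queue_depth that still fits lun_count LUNs (inverse form)."""
    usable = TOTAL_COMMAND_ELEMENTS - USED_BY_ADAPTER
    return usable // lun_count - RESERVED_PER_LUN


if __name__ == "__main__":
    for q in (3, 25):
        print(q, max_luns_per_vscsi_adapter(q))
```

Integer division is used so that the result rounds down, matching the INT() in the formula.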

Note that to change the queue_depth on an hdisk at the VIOS requires that we unmap the disk from the VIOC and remap it back.

If using NPIV, then if you increase num_cmd_elems on the virtual FC (vFC) adapter, then you should also increase the setting on the real FC adapter.

A special note on the FC adapter max_xfer_size attribute

This attribute of the FC adapter (fcs) device, which controls the maximum IO size the adapter device driver will handle, also controls a memory area used by the adapter for data transfers. When the default value is used (max_xfer_size=0x100000) the memory area is 16 MB in size. When setting this attribute to any other allowable value (say 0x200000) then the memory area is 128 MB in size. At AIX 6.1 TL2 or later a change was made for virtual FC adapters so the DMA memory area is always 128 MB even with the default max_xfer_size. This memory area is a DMA memory area, but it is different than the DMA memory area controlled by the lg_term_dma attribute (which is used for IO control). The default value for lg_term_dma of 0x800000 is usually adequate.

So for heavy IO and especially for large IOs (such as for backups) it's recommended to set max_xfer_size=0x200000 for AIX levels earlier than AIX 6.1 TL2.

The fcstat command can also be used to examine whether or not increasing num_cmd_elems or max_xfer_size could increase performance.

This shows an example of an adapter that has sufficient values for num_cmd_elems and max_xfer_size. Non-zero values would indicate a situation in which IOs queued at the adapter due to lack of resources, and increasing num_cmd_elems and max_xfer_size would be appropriate.

Note that changing max_xfer_size uses memory in the PCI Host Bridge chips attached to the PCI slots. The sales manual, regarding the dual port 4 Gbps PCI-X FC adapter, states that "If placed in a PCI-X slot rated as SDR compatible and/or has the slot speed of 133 MHz, the AIX value of the max_xfer_size must be kept at the default setting of 0x100000 (1 megabyte) when both ports are in use. The architecture of the DMA buffer for these slots does not accommodate larger max_xfer_size settings."

If there are too many FC adapters and too many LUNs attached to the adapter, this will lead to issues configuring the LUNs. Errors will look like:

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

So if you get these errors, you'll need to change the max_xfer_size back to the default value. Also note that if you are booting from SAN, if you encounter this error, you won't be able to boot, so be sure to have a back out plan if you plan to change this and are booting from SAN.
