
Caching Tutorial for Web Authors and Webmasters

http://www.mnot.net/cache_docs/#WORK


This is an informational document. Although technical in nature, it attempts to make the concepts involved understandable and applicable in real-world situations. Because of this, some aspects of the material are simplified or omitted, for the sake of clarity. If you are interested in the minutia of the subject, please explore the References and Further Information at the end.

1. What's a Web Cache? Why do people use them?
2. Kinds of Web Caches
   1. Browser Caches
   2. Proxy Caches
3. Aren't Web Caches bad for me? Why should I help them?
4. How Web Caches Work
5. How (and how not) to Control Caches
   1. HTML Meta Tags vs. HTTP Headers
   2. Pragma HTTP Headers (and why they don't work)
   3. Controlling Freshness with the Expires HTTP Header
   4. Cache-Control HTTP Headers
   5. Validators and Validation
6. Tips for Building a Cache-Aware Site
7. Writing Cache-Aware Scripts
8. Frequently Asked Questions
9. Implementation Notes: Web Servers
10. Implementation Notes: Server-Side Scripting
11. References and Further Information
12. About This Document

What's a Web Cache? Why do people use them?

A Web cache sits between one or more Web servers (also known as origin servers) and a client or many clients, and watches requests come by, saving copies of the responses (like HTML pages, images and files, collectively known as representations) for itself. Then, if there is another request for the same URL, it can use the response that it has, instead of asking the origin server for it again.

There are two main reasons that Web caches are used:

- To reduce latency: Because the request is satisfied from the cache (which is closer to the client) instead of the origin server, it takes less time for it to get the representation and display it. This makes the Web seem more responsive.
- To reduce network traffic: Because representations are reused, it reduces the amount of bandwidth used by a client. This saves money if the client is paying for traffic, and keeps their bandwidth requirements lower and more manageable.


12/4/2012 1:24 AM


Kinds of Web Caches

BROWSER CACHES
If you examine the preferences dialog of any modern Web browser (like Internet Explorer, Safari or Mozilla), you'll probably notice a cache setting. This lets you set aside a section of your computer's hard disk to store representations that you've seen, just for you. The browser cache works according to fairly simple rules. It will check to make sure that the representations are fresh, usually once a session (that is, once in the current invocation of the browser).

This cache is especially useful when users hit the back button or click a link to see a page they've just looked at. Also, if you use the same navigation images throughout your site, they'll be served from browsers' caches almost instantaneously.

PROXY CACHES
Web proxy caches work on the same principle, but a much larger scale. Proxies serve hundreds or thousands of users in the same way; large corporations and ISPs often set them up on their firewalls, or as standalone devices (also known as intermediaries).

Because proxy caches aren't part of the client or the origin server, but instead are out on the network, requests have to be routed to them somehow. One way to do this is to use your browser's proxy setting to manually tell it what proxy to use; another is using interception. Interception proxies have Web requests redirected to them by the underlying network itself, so that clients don't need to be configured for them, or even know about them.

Proxy caches are a type of shared cache; rather than just having one person using them, they usually have a large number of users, and because of this they are very good at reducing latency and network traffic. That's because popular representations are reused a number of times.

GATEWAY CACHES
Also known as reverse proxy caches or surrogate caches, gateway caches are also intermediaries, but instead of being deployed by network administrators to save bandwidth, they're typically deployed by Webmasters themselves, to make their sites more scalable, reliable and better performing.

Requests can be routed to gateway caches by a number of methods, but typically some form of load balancer is used to make one or more of them look like the origin server to clients.

Content delivery networks (CDNs) distribute gateway caches throughout the Internet (or a part of it) and sell caching to interested Web sites. Speedera and Akamai are examples of CDNs.

This tutorial focuses mostly on browser and proxy caches, although some of the information is suitable for those interested in gateway caches as well.

Aren't Web Caches bad for me? Why should I help them?

Web caching is one of the most misunderstood technologies on the Internet. Webmasters in particular fear losing control of their site, because a proxy cache can hide their users from them, making it difficult to see who's using the site.

Unfortunately for them, even if Web caches didn't exist, there are too many variables on the Internet to assure that they'll be able to get an accurate picture of how users see their site. If this is a big concern for you, this tutorial will teach you how to get the statistics you need without making your site cache-unfriendly.

Another concern is that caches can serve content that is out of date, or stale. However, this tutorial can show you how to configure your server to control how your content is cached.

Note: CDNs are an interesting development, because unlike many proxy caches, their gateway caches are aligned with the interests of the Web site being cached, so that these problems aren't seen. However, even when you use a CDN, you still have to consider that there will be proxy and browser caches downstream.

On the other hand, if you plan your site well, caches can help your Web site load faster, and save load on your server and Internet link. The difference can be dramatic; a site that is difficult to cache may take several seconds to load, while one that takes advantage of caching can seem instantaneous in comparison. Users will appreciate a fast-loading site, and will visit more often.

Think of it this way; many large Internet companies are spending millions of dollars setting up farms of servers around the world to replicate their content, in order to make it as fast to access as possible for their users. Caches do the same for you, and they're even closer to the end user. Best of all, you don't have to pay for them.

The fact is that proxy and browser caches will be used whether you like it or not. If you don't configure your site to be cached correctly, it will be cached using whatever defaults the cache's administrator decides upon.

How Web Caches Work

All caches have a set of rules that they use to determine when to serve a representation from the cache, if it's available. Some of these rules are set in the protocols (HTTP 1.0 and 1.1), and some are set by the administrator of the cache (either the user of the browser cache, or the proxy administrator).

Generally speaking, these are the most common rules that are followed (don't worry if you don't understand the details, it will be explained below):

1. If the response's headers tell the cache not to keep it, it won't.
2. If the request is authenticated or secure (i.e., HTTPS), it won't be cached by shared caches.
3. A cached representation is considered fresh (that is, able to be sent to a client without checking with the origin server) if:
   - It has an expiry time or other age-controlling header set, and is still within the fresh period, or
   - The cache has seen the representation recently, and it was modified relatively long ago.
   Fresh representations are served directly from the cache, without checking with the origin server.
4. If a representation is stale, the origin server will be asked to validate it, or tell the cache whether the copy that it has is still good.
5. Under certain circumstances (for example, when it's disconnected from a network) a cache can serve stale responses without checking with the origin server.

If no validator (an ETag or Last-Modified header) is present on a response, and it doesn't have any explicit freshness information, it will usually (but not always) be considered uncacheable.

Together, freshness and validation are the most important ways that a cache works with content. A fresh representation will be available instantly from the cache, while a validated representation will avoid sending the entire representation over again if it hasn't changed.
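The freshness rule (rule 3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented here, the age of a response is simplified to "now minus its Date header", and the 10% heuristic fraction is a commonly used default rather than anything the protocol requires.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_lifetime(
    date: datetime,
    max_age: Optional[int] = None,
    expires: Optional[datetime] = None,
    last_modified: Optional[datetime] = None,
    heuristic_fraction: float = 0.1,  # assumption: a typical default, not mandated
) -> timedelta:
    """Return how long a stored response may be treated as fresh."""
    if max_age is not None:
        # An explicit age-controlling header wins.
        return timedelta(seconds=max_age)
    if expires is not None:
        # An explicit expiry time, measured from when the response was generated.
        return max(expires - date, timedelta(0))
    if last_modified is not None:
        # Heuristic: the longer ago it was modified, the longer it is trusted.
        return max((date - last_modified) * heuristic_fraction, timedelta(0))
    # No freshness information at all: treat the response as already stale.
    return timedelta(0)


def is_fresh(now: datetime, date: datetime, **kwargs) -> bool:
    """True if the response can be served without checking with the origin server."""
    return (now - date) < freshness_lifetime(date, **kwargs)
```

A response generated at 13:19:41 with max_age=3600 would be reported fresh at 13:49:41 and stale at 15:19:41; a stale response is the point at which a real cache would move on to validation (rule 4).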

How (and how not) to Control Caches

There are several tools that Web designers and Webmasters can use to fine-tune how caches will treat their sites. It may require getting your hands a little dirty with your server's configuration, but the results are worth it. For details on how to use these tools with your server, see the Implementation sections below.

HTML META TAGS AND HTTP HEADERS


HTML authors can put tags in a document's <HEAD> section that describe its attributes. These meta tags are often used in the belief that they can mark a document as uncacheable, or expire it at a certain time.

Meta tags are easy to use, but aren't very effective. That's because they're only honored by a few browser caches, not proxy caches (which almost never read the HTML in the document). While it may be tempting to put a Pragma: no-cache meta tag into a Web page, it won't necessarily cause it to be kept fresh.

On the other hand, true HTTP headers give you a lot of control over how both browser caches and proxies handle your representations. They can't be seen in the HTML, and are usually automatically generated by the Web server. However, you can control them to some degree, depending on the server you use. In the following sections, you'll see what HTTP headers are interesting, and how to apply them to your site.

Note: If your site is hosted at an ISP or hosting farm and they don't give you the ability to set arbitrary HTTP headers (like Expires and Cache-Control), complain loudly; these are tools necessary for doing your job.

HTTP headers are sent by the server before the HTML, and only seen by the browser and any intermediate caches. Typical HTTP 1.1 response headers might look like this:

HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 13:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
ETag: "3e86-410-3596fbbc"
Content-Length: 1040
Content-Type: text/html

The HTML would follow these headers, separated by a blank line. See the Implementation sections for information about how to set HTTP headers.
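To make the structure of such a response concrete, here is a minimal Python sketch that splits the sample above into its status line and headers. It leans on the standard library's email parser, since HTTP headers share the same "Name: value" layout; the function name is an assumption made for this example, and a real client would use a proper HTTP library instead.

```python
from email.parser import Parser

SAMPLE_RESPONSE_HEAD = """\
HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 13:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
ETag: "3e86-410-3596fbbc"
Content-Length: 1040
Content-Type: text/html
"""


def parse_response_head(raw: str):
    """Split a response head into (status line, case-insensitive header mapping)."""
    status_line, _, header_block = raw.partition("\n")
    headers = Parser().parsestr(header_block, headersonly=True)
    return status_line.strip(), headers


status, headers = parse_response_head(SAMPLE_RESPONSE_HEAD)
print(status)
print(headers["cache-control"])  # lookups ignore the case of the header name
```

The headers a cache cares most about here are Cache-Control and Expires (freshness) and Last-Modified and ETag (validation), all of which are covered in the sections that follow.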

PRAGMA HTTP HEADERS (AND WHY THEY DON'T WORK)

Many people believe that assigning a Pragma: no-cache HTTP header to a representation will make it uncacheable. This is not necessarily true; the HTTP specification does not set any guidelines for Pragma response headers; instead, Pragma request headers (the headers that a browser sends to a server) are discussed. Although a few caches may honor this header, the majority won't, and it won't have any effect. Use the headers below instead.

CONTROLLING FRESHNESS WITH THE EXPIRES HTTP HEADER


The Expires HTTP header is a basic means of controlling caches; it tells all caches how long the associated representation is fresh for. After that time, caches will always check back with the origin server to see if a document has changed. Expires headers are supported by practically every cache.

Most Web servers allow you to set Expires response headers in a number of ways. Commonly, they will allow setting an absolute time to expire, a time based on the last time that the client retrieved the representation (last access time), or a time based on the last time the document changed on your server (last modification time).

Expires headers are especially good for making static images (like navigation bars and buttons) cacheable. Because they don't change much, you can set an extremely long expiry time on them, making your site appear much more responsive to your users. They're also useful for controlling caching of a page that is regularly changed. For instance, if you update a news page once a day at 6am, you can set the representation to expire at that time, so caches will know when to get a fresh copy, without users having to hit reload.

The only value valid in an Expires header is an HTTP date; anything else will most likely be interpreted as in the past, so that the representation is uncacheable. Also, remember that the time in an HTTP date is Greenwich Mean Time (GMT), not local time.

For example:


Expires: Fri, 30 Oct 1998 14:19:41 GMT
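If you generate this header yourself, the date has to be in exactly this format and in GMT. A minimal Python sketch, assuming you start from a timezone-aware datetime (the helper names are made up for this example):

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def http_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date, always in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def expires_header(now: datetime, lifetime: timedelta) -> str:
    """Build an Expires header that is `lifetime` after `now`."""
    return "Expires: " + http_date(now + lifetime)


generated = datetime(1998, 10, 30, 13, 19, 41, tzinfo=timezone.utc)
print(expires_header(generated, timedelta(hours=1)))
```

Converting to UTC before formatting is what guards against the local-time mistake mentioned above.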

Although the Expires header is useful, it has some limitations. First, because there's a date involved, the clocks on the Web server and the cache must be synchronised; if they have a different idea of the time, the intended results won't be achieved, and caches might wrongly consider stale content as fresh.

Another problem with Expires is that it's easy to forget that you've set some content to expire at a particular time. If you don't update an Expires time before it passes, each and every request will go back to your Web server, increasing load and latency.

Note: It's important to make sure that your Web server's clock is accurate if you use the Expires header. One way to do this is using the Network Time Protocol (NTP); talk to your local system administrator to find out more.

CACHE-CONTROL HTTP HEADERS

HTTP 1.1 introduced a new class of headers, Cache-Control response headers, to give Web publishers more control over their content, and to address the limitations of Expires.

Useful Cache-Control response headers include:

- max-age=[seconds]: specifies the maximum amount of time that a representation will be considered fresh. Like Expires, it controls freshness, but it is relative to the time of the request, rather than absolute. [seconds] is the number of seconds from the time of the request you wish the representation to be fresh for.
- s-maxage=[seconds]: similar to max-age, except that it only applies to shared (e.g., proxy) caches.
- public: marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
- private: allows caches that are specific to one user (e.g., in a browser) to store the response; shared caches (e.g., in a proxy) may not.
- no-cache: forces caches to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to assure that authentication is respected (in combination with public), or to maintain rigid freshness, without sacrificing all of the benefits of caching.
- no-store: instructs caches not to keep a copy of the representation under any conditions.
- must-revalidate: tells caches that they must obey any freshness information you give them about a representation. HTTP allows caches to serve stale representations under special conditions; by specifying this header, you're telling the cache that you want it to strictly follow your rules.
- proxy-revalidate: similar to must-revalidate, except that it only applies to proxy caches.

For example:
Cache-Control: max-age=3600, must-revalidate
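To show how a cache might read such a header, here is a rough Python sketch that splits the value into its directives. It is deliberately simplified: the function name is invented for this example, and the naive split on commas would mishandle a quoted argument that itself contains a comma.

```python
def parse_cache_control(value: str) -> dict:
    """Map each directive name to its argument, or to True if it has none."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        name, separator, argument = part.partition("=")
        # Directive names are case-insensitive, so normalise them.
        directives[name.strip().lower()] = (
            argument.strip().strip('"') if separator else True
        )
    return directives


print(parse_cache_control("max-age=3600, must-revalidate"))
```

A cache that finds max-age in the result would use it as the freshness lifetime (in seconds), and the presence of must-revalidate would stop it from serving the response once that lifetime has passed.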

When both Cache-Control and Expires are present, Cache-Control takes precedence. If you plan to use the Cache-Control headers, you should have a look at the excellent documentation in HTTP 1.1; see References and Further Information.

VALIDATORS AND VALIDATION


In How Web Caches Work, we said that validation is used by servers and caches to communicate when a representation has changed. By using it, caches avoid having to download the entire representation when they already have a copy locally, but they're not sure if it's still fresh.

Validators are very important; if one isn't present, and there isn't any freshness information (Expires or Cache-Control) available, caches will not store a representation at all.

The most common validator is the time that the document last changed, as communicated in the Last-Modified header. When a cache has a representation stored that includes a Last-Modified header, it can use it to ask the server if the representation has changed since the last time it was seen, with an If-Modified-Since request.

HTTP 1.1 introduced a new kind of validator called the ETag. ETags are unique identifiers that are generated by the server and changed every time the representation does. Because the server controls how the ETag is generated, caches can be sure that if the ETag matches when they make an If-None-Match request, the representation really is the same.

Almost all caches use Last-Modified times as validators; ETag validation is also becoming prevalent.

Most modern Web servers will generate both ETag and Last-Modified headers to use as validators for static content (i.e., files) automatically; you won't have to do anything. However, they don't know enough about dynamic content (like CGI, ASP or database sites) to generate them; see Writing Cache-Aware Scripts.
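Seen from the cache's side, validation amounts to echoing the stored validators back to the server. The following Python sketch illustrates the idea; the function names and the plain-dict representation of headers are assumptions made for this example, not part of any particular cache.

```python
def conditional_headers(stored_headers: dict) -> dict:
    """Build validation request headers from a stored response's validators."""
    request_headers = {}
    etag = stored_headers.get("ETag")
    last_modified = stored_headers.get("Last-Modified")
    if etag:
        request_headers["If-None-Match"] = etag
    if last_modified:
        request_headers["If-Modified-Since"] = last_modified
    return request_headers


def body_to_serve(status: int, stored_body: bytes, received_body: bytes) -> bytes:
    """A 304 means the stored copy is still good; anything else replaces it."""
    return stored_body if status == 304 else received_body


stored = {
    "Last-Modified": "Mon, 29 Jun 1998 02:28:12 GMT",
    "ETag": '"3e86-410-3596fbbc"',
}
print(conditional_headers(stored))
```

If the stored response carries no validator at all, the returned dictionary is empty and the cache has no choice but to fetch the whole representation again.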

Tips for Building a Cache-Aware Site

Besides using freshness information and validation, there are a number of other things you can do to make your site more cache-friendly.

- Use URLs consistently: this is the golden rule of caching. If you serve the same content on different pages, to different users, or from different sites, it should use the same URL. This is the easiest and most effective way to make your site cache-friendly. For example, if you use /index.html in your HTML as a reference once, always use it that way.
- Use a common library of images and other elements and refer back to them from different places.
- Make caches store images and pages that don't change often by using a Cache-Control: max-age header with a large value.
- Make caches recognise regularly updated pages by specifying an appropriate max-age or expiration time.
- If a resource (especially a downloadable file) changes, change its name. That way, you can make it expire far in the future, and still guarantee that the correct version is served; the page that links to it is the only one that will need a short expiry time.
- Don't change files unnecessarily. If you do, everything will have a falsely young Last-Modified date. For instance, when updating your site, don't copy over the entire site; just move the files that you've changed.
- Use cookies only where necessary: cookies are difficult to cache, and aren't needed in most situations. If you must use a cookie, limit its use to dynamic pages.
- Minimize use of SSL: because encrypted pages are not stored by shared caches, use them only when you have to, and use images on SSL pages sparingly.
- Check your pages with REDbot: it can help you apply many of the concepts in this tutorial.
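One way to follow the "if a resource changes, change its name" tip is to derive part of the file name from the content itself. The Python sketch below shows that approach; it is one possible scheme rather than something this tutorial prescribes, and the function name and the eight-character digest length are arbitrary choices.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the extension, e.g. logo.<hash>.gif."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


print(versioned_name("images/logo.gif", b"first version of the image"))
print(versioned_name("images/logo.gif", b"second version of the image"))
```

Because the name only changes when the bytes change, the file can be served with a very long freshness lifetime, while the pages that link to it keep a short one.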


Writing Cache-Aware Scripts

By default, most scripts won't return a validator (a Last-Modified or ETag response header) or freshness information (Expires or Cache-Control). While some scripts really are dynamic (meaning that they return a different response for every request), many (like search engines and database-driven sites) can benefit from being cache-friendly.

Generally speaking, if a script produces output that is reproducible with the same request at a later time (whether it be minutes or days later), it should be cacheable. If the content of the script changes only depending on what's in the URL, it is cacheable; if the output depends on a cookie, authentication information or other external criteria, it probably isn't.

- The best way to make a script cache-friendly (as well as perform better) is to dump its content to a plain file whenever it changes. The Web server can then treat it like any other Web page, generating and using validators, which makes your life easier. Remember to only write files that have changed, so the Last-Modified times are preserved.
- Another way to make a script cacheable in a limited fashion is to set an age-related header for as far in the future as practical. Although this can be done with Expires, it's probably easiest to do so with Cache-Control: max-age, which will make the response fresh for an amount of time after the request.
- If you can't do that, you'll need to make the script generate a validator, and then respond to If-Modified-Since and/or If-None-Match requests. This can be done by parsing the HTTP headers, and then responding with 304 Not Modified when appropriate. Unfortunately, this is not a trivial task.

Some other tips:

- Don't use POST unless it's appropriate. Responses to the POST method aren't kept by most caches; if you send information in the path or query (via GET), caches can store that information for the future.
- Don't embed user-specific information in the URL unless the content generated is completely unique to that user.
- Don't count on all requests from a user coming from the same host, because caches often work together.
- Generate Content-Length response headers. It's easy to do, and it will allow the response of your script to be used in a persistent connection. This allows clients to request multiple representations on one TCP/IP connection, instead of setting up a connection for every request. It makes your site seem much faster.

See the Implementation Notes for more specific information.
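As an illustration of the third approach (generating a validator and answering conditional requests), here is a simplified Python sketch built around ETags. The function names, the hash-based ETag and the ten-minute max-age are all assumptions for the example; it also ignores weak validators (W/ prefixes) and If-Modified-Since, which a complete implementation would need to handle.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, max_age: int = 600):
    """Return (status, headers, body), using 304 when the client's copy matches."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    # If-None-Match may list several ETags separated by commas.
    candidates = [
        tag.strip() for tag in request_headers.get("If-None-Match", "").split(",")
    ]
    if etag in candidates or "*" in candidates:
        return 304, headers, b""  # Not Modified: no body is sent
    headers["Content-Length"] = str(len(body))
    return 200, headers, body


page = b"<html><body>Search results</body></html>"
status, headers, _ = respond({}, page)
print(status, headers["ETag"])
status_again, _, _ = respond({"If-None-Match": headers["ETag"]}, page)
print(status_again)
```

The first request gets the full response with an ETag and a Content-Length; the second, which presents that ETag, is answered with 304 Not Modified and an empty body.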

Frequently Asked Questions

WHAT ARE THE MOST IMPORTANT THINGS TO MAKE CACHEABLE?

A good strategy is to identify the most popular, largest representations (especially images) and work with them first.

HOW CAN I MAKE MY PAGES AS FAST AS POSSIBLE WITH CACHES?


The most cacheable representation is one with a long freshness time set. Validation does help reduce the time that it takes to see a representation, but the cache still has to contact the origin server to see if it's fresh. If the cache already knows it's fresh, it will be served directly.

I UNDERSTAND THAT CACHING IS GOOD, BUT I NEED TO KEEP STATISTICS ON HOW MANY PEOPLE VISIT MY PAGE!
If you must know every time a page is accessed, select ONE small item on a page (or the page itself), and make it uncacheable, by giving it suitable headers. For example, you could refer to a 1x1 transparent uncacheable image from each page. The Referer header will contain information about what page called it.


Be aware that even this will not give truly accurate statistics about your users, and is unfriendly to the Internet and your users; it generates unnecessary traffic, and forces people to wait for that uncached item to be downloaded. For more information about this, see On Interpreting Access Statistics in the references.

HOW CAN I SEE A REPRESENTATION'S HTTP HEADERS?

Many Web browsers let you see the Expires and Last-Modified headers in a page info or similar interface. If available, this will give you a menu of the page and any representations (like images) associated with it, along with their details.

To see the full headers of a representation, you can manually connect to the Web server using a Telnet client.

To do so, you may need to type the port (by default, 80) into a separate field, or you may need to connect to www.example.com:80 or www.example.com 80 (note the space). Consult your Telnet client's documentation.

Once you've opened a connection to the site, type a request for the representation. For instance, if you want to see the headers for http://www.example.com/foo.html, connect to www.example.com, port 80, and type:

GET /foo.html HTTP/1.1 [return]
Host: www.example.com [return][return]

Press the Return key every time you see [return]; make sure to press it twice at the end. This will print the headers, and then the full representation. To see the headers only, substitute HEAD for GET.
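The same exchange can be scripted. The Python sketch below builds the request text that you would otherwise type into Telnet and, optionally, sends it over a socket. The function names are invented for this example, and the Connection: close header is an addition not present in the Telnet example above; it is there so that the server closes the connection once it has replied.

```python
import socket


def build_request(method: str, path: str, host: str) -> bytes:
    """Build the raw request; each [return] becomes a CRLF line ending."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # assumption: ask the server to hang up afterwards
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_raw(host: str, path: str, method: str = "HEAD",
              port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as connection:
        connection.sendall(build_request(method, path, host))
        chunks = []
        while True:
            data = connection.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


print(build_request("HEAD", "/foo.html", "www.example.com").decode("ascii"))
```

Calling fetch_raw("www.example.com", "/foo.html") would perform the request over the network; with the default HEAD method only the headers come back.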

MY PAGES ARE PASSWORD-PROTECTED; HOW DO PROXY CACHES DEAL WITH THEM?


By default, pages protected with HTTP authentication are considered private; they will not be kept by shared caches. However, you can make authenticated pages public with a Cache-Control: public header; HTTP 1.1-compliant caches will then allow them to be cached.

If you'd like such pages to be cacheable, but still authenticated for every user, combine the Cache-Control: public and no-cache headers. This tells the cache that it must submit the new client's authentication information to the origin server before releasing the representation from the cache. This would look like:
Cache-Control: public, no-cache

Whether or not this is done, it's best to minimize use of authentication; for example, if your images are not sensitive, put them in a separate directory and configure your server not to force authentication for it. That way, those images will be naturally cacheable.

SHOULD I WORRY ABOUT SECURITY IF PEOPLE ACCESS MY SITE THROUGH A CACHE?


SSL pages are not cached (or decrypted) by proxy caches, so you don't have to worry about that. However, because caches store non-SSL requests and URLs fetched through them, you should be conscious about unsecured sites; an unscrupulous administrator could conceivably gather information about their users, especially in the URL.

In fact, any administrator on the network between your server and your clients could gather this type of information. One particular problem is when CGI scripts put usernames and passwords in the URL itself; this makes it trivial for others to find and use their login.

If you're aware of the issues surrounding Web security in general, you shouldn't have any surprises from proxy caches.


I'M LOOKING FOR AN INTEGRATED WEB PUBLISHING SOLUTION. WHICH ONES ARE CACHE-AWARE?
It varies. Generally speaking, the more complex a solution is, the more difficult it is to cache. The worst are ones which dynamically generate all content and don't provide validators; they may not be cacheable at all. Speak with your vendor's technical staff for more information, and see the Implementation notes below.

MY IMAGES EXPIRE A MONTH FROM NOW, BUT I NEED TO CHANGE THEM IN THE CACHES NOW!
The Expires header can't be circumvented; unless the cache (either browser or proxy) runs out of room and has to delete the representations, the cached copy will be used until then.

The most effective solution is to change any links to them; that way, completely new representations will be loaded fresh from the origin server. Remember that any page that refers to these representations will be cached as well. Because of this, it's best to make static images and similar representations very cacheable, while keeping the HTML pages that refer to them on a tight leash.

If you want to reload a representation from a specific cache, you can either force a reload (in Firefox, holding down shift while pressing reload will do this by issuing a Pragma: no-cache request header) while using the cache. Or, you can have the cache administrator delete the representation through their interface.

I RUN A WEB HOSTING SERVICE. HOW CAN I LET MY USERS PUBLISH CACHE-FRIENDLY PAGES?
If you're using Apache, consider allowing them to use .htaccess files and providing appropriate documentation.

Otherwise, you can establish predetermined areas for various caching attributes in each virtual server. For instance, you could specify a directory /cache1m that will be cached for one month after access, and a /nocache area that will be served with headers instructing caches not to store representations from it.

Whatever you are able to do, it is best to work with your largest customers first on caching. Most of the savings (in bandwidth and in load on your servers) will be realized from high-volume sites.

I'VE MARKED MY PAGES AS CACHEABLE, BUT MY BROWSER KEEPS REQUESTING THEM ON EVERY REQUEST. HOW DO I FORCE THE CACHE TO KEEP REPRESENTATIONS OF THEM?
Caches aren't required to keep a representation and reuse it; they're only required to not keep or use them under some conditions. All caches make decisions about which representations to keep based upon their size, type (e.g., image vs. html), or by how much space they have left to keep local copies. Yours may not be considered worth keeping around, compared to more popular or larger representations.

Some caches do allow their administrators to prioritize what kinds of representations are kept, and some allow representations to be pinned in cache, so that they're always available.

Implementation Notes: Web Servers

Generally speaking, it's best to use the latest version of whatever Web server you've chosen to deploy. Not only will they likely contain more cache-friendly features, new versions also usually have important security and performance improvements.

APACHE HTTP SERVER


Apache uses optional modules to include headers, including both Expires and Cache-Control. Both modules are available in the 1.2 or greater distribution.

The modules need to be built into Apache; although they are included in the distribution, they are not turned on by default. To find out if the modules are enabled in your server, find the httpd binary and run httpd -l; this should print a list of the available modules (note that this only lists compiled-in modules; on later versions of Apache, use httpd -M to include dynamically loaded modules as well). The modules we're looking for are mod_expires and mod_headers.

If they aren't available, and you have administrative access, you can recompile Apache to include them. This can be done either by uncommenting the appropriate lines in the Configuration file, or using the --enable-module=expires and --enable-module=headers arguments to configure (1.3 or greater). Consult the INSTALL file found with the Apache distribution.

Once you have an Apache with the appropriate modules, you can use mod_expires to specify when representations should expire, either in .htaccess files or in the server's access.conf file. You can specify expiry from either access or modification time, and apply it to a file type or as a default. See the module documentation for more information, and speak with your local Apache guru if you have trouble.

To apply Cache-Control headers, you'll need to use the mod_headers module, which allows you to specify arbitrary HTTP headers for a resource. See the mod_headers documentation.

Here's an example .htaccess file that demonstrates the use of some headers.

.htaccess files allow web publishers to use commands normally only found in configuration files. They affect the content of the directory they're in and their subdirectories. Talk to your server administrator to find out if they're enabled.
### activate mod_expires
ExpiresActive On
### Expire .gif's 1 month from when they're accessed
ExpiresByType image/gif A2592000
### Expire everything else 1 day from when it's last modified
### (this uses the Alternative syntax)
ExpiresDefault "modification plus 1 day"
### Apply a Cache-Control header to index.html
<Files index.html>
Header append Cache-Control "public, must-revalidate"
</Files>

Note that mod_expires automatically calculates and inserts a Cache-Control: max-age header as appropriate.

Apache 2's configuration is very similar to that of 1.3; see the 2.2 mod_expires and mod_headers documentation for more information.

MICROSOFT IIS
Microsoft's Internet Information Server makes it very easy to set headers in a somewhat flexible way. Note that this is only possible in version 4 of the server, which will run only on NT Server.

To specify headers for an area of a site, select it in the Administration Tools interface, and bring up its properties. After selecting the HTTP Headers tab, you should see two interesting areas; Enable Content Expiration and Custom HTTP headers. The first should be self-explanatory, and the second can be used to apply Cache-Control headers.

See the ASP section below for information about setting headers in Active Server Pages. It is also possible to set headers from ISAPI modules; refer to MSDN for details.

NETSCAPE/IPLANET ENTERPRISE SERVER


As of version 3.6, Enterprise Server does not provide any obvious way to set Expires headers. However, it has supported HTTP 1.1 features since version 3.0. This means that HTTP 1.1 caches (proxy and browser) will be able to take advantage of Cache-Control settings you make.

To use Cache-Control headers, choose Content Management | Cache Control Directives in the administration server. Then, using the Resource Picker, choose the directory where you want to set the headers. After setting the headers, click OK. For more information, see the NES manual.

Implementation Notes: Server-Side Scripting

Note: One thing to keep in mind is that it may be easier to set HTTP headers with your Web server rather than in the scripting language. Try both.

Because the emphasis in server-side scripting is on dynamic content, it doesn't make for very cacheable pages, even when the content could be cached. If your content changes often, but not on every page hit, consider setting a Cache-Control: max-age header; most users access pages again in a relatively short period of time. For instance, when users hit the back button, if there isn't any validator or freshness information available, they'll have to wait until the page is re-downloaded from the server to see it.

CGI
CGI scripts are one of the most popular ways to generate content. You can easily append HTTP response headers by adding them before you send the body; most CGI implementations already require you to do this for the Content-Type header. For instance, in Perl:

#!/usr/bin/perl
print "Content-type: text/html\n";
print "Expires: Thu, 29 Oct 1998 17:04:19 GMT\n";
print "\n";
### the content body follows...

Since it's all text, you can easily generate Expires and other date-related headers with in-built functions. It's even easier if you use Cache-Control: max-age:


print "Cache-Control: max-age=600\n";

This will make the script cacheable for 10 minutes after the request, so that if the user hits the back button, they won't be resubmitting the request.

The CGI specification also makes request headers that the client sends available in the environment of the script; each header has HTTP_ prepended to its name. So, if a client makes an If-Modified-Since request, it will show up as HTTP_IF_MODIFIED_SINCE.

See also the cgi_buffer library, which automatically handles ETag generation and validation, Content-Length generation and gzip content-coding for Perl and Python CGI scripts with a one-line include. The Python version can also be used to wrap arbitrary CGI scripts.
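For comparison with the Perl fragments above, here is a hedged Python sketch of a CGI-style script that reads HTTP_IF_MODIFIED_SINCE from its environment and decides between a full response and 304 Not Modified. The function name and the max-age value are illustrative assumptions, and this is not how the cgi_buffer library works internally; the Status header is the usual CGI way of setting the response status.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def cgi_header_block(environ: dict, last_modified: datetime, max_age: int = 600) -> str:
    """Return the CGI header block, ending with the blank line before the body."""
    # HTTP dates have one-second resolution, so drop any microseconds.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    common = [
        "Last-Modified: " + format_datetime(last_modified, usegmt=True),
        f"Cache-Control: max-age={max_age}",
    ]
    since = environ.get("HTTP_IF_MODIFIED_SINCE")
    if since:
        try:
            if last_modified <= parsedate_to_datetime(since):
                # The client's copy is current: send headers only.
                return "\n".join(["Status: 304 Not Modified"] + common) + "\n\n"
        except (TypeError, ValueError):
            pass  # unparseable date: fall through and send the full response
    return "\n".join(["Content-Type: text/html"] + common) + "\n\n"


changed = datetime(1998, 6, 29, 2, 28, 12, tzinfo=timezone.utc)
print(cgi_header_block({}, changed), end="")
```

In a real CGI script you would pass os.environ as the first argument, print the returned block, and then print the body only when the block does not start with the 304 status line.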

SERVER SIDE INCLUDES


SSI (often used with the extension .shtml) is one of the first ways that Web publishers were able to get dynamic content into pages. By using special tags in the pages, a limited form of in-HTML scripting was available.

Most implementations of SSI do not set validators, and as such are not cacheable. However, Apache's implementation does allow users to specify which SSI files can be cached, by setting the group execute permissions on the appropriate files, combined with the XBitHack full directive. For more information, see the mod_include documentation.


PHP
PHP is a server-side scripting language that, when built into the server, can be used to embed scripts inside a page's HTML, much like SSI, but with a far larger number of options. PHP can be used as a CGI script on any Web server (Unix or Windows), or as an Apache module.

By default, representations processed by PHP are not assigned validators, and are therefore uncacheable. However, developers can set HTTP headers by using the Header() function.

For example, this will create a Cache-Control header, as well as an Expires header three days in the future:

<?php
Header("Cache-Control: must-revalidate");
$offset = 60 * 60 * 24 * 3;
$ExpStr = "Expires: " . gmdate("D, d M Y H:i:s", time() + $offset) . " GMT";
Header($ExpStr);
?>

Remember that the Header() function MUST come before any other output.

As you can see, you'll have to create the HTTP date for an Expires header by hand; PHP doesn't provide a function to do it for you (although recent versions have made it easier; see PHP's date documentation). Of course, it's easy to set a Cache-Control: max-age header, which is just as good for most situations.

For more information, see the manual entry for header.

See also the cgi_buffer library, which automatically handles ETag generation and validation, Content-Length generation and gzip content-coding for PHP scripts with a one-line include.

COLD FUSION
Cold Fusion, by Macromedia, is a commercial server-side scripting engine, with support for several Web servers on Windows, Linux and several flavors of Unix.

Cold Fusion makes setting arbitrary HTTP headers relatively easy, with the CFHEADER tag. Unfortunately, their example for setting an Expires header, as below, is a bit misleading.
<CFHEADER NAME="Expires" VALUE="#Now()#">

It doesn't work like you might think, because the time (in this case, when the request is made) doesn't get converted to an HTTP-valid date; instead, it just gets printed as a representation of Cold Fusion's Date/Time object. Most clients will either ignore such a value, or convert it to a default, like January 1, 1970.

However, Cold Fusion does provide a date formatting function that will do the job; GetHttpTimeString. In combination with DateAdd, it's easy to set Expires dates; here, we set a header to declare that representations of the page expire in one month:


<cfheader name="Expires" value="#GetHttpTimeString(DateAdd('m', 1, Now()))#">

You can also use the CFHEADER tag to set Cache-Control: max-age and other headers.

Remember that Web server headers are passed through in some deployments of Cold Fusion (such as CGI); check yours to determine whether you can use this to your advantage, by setting headers on the server instead of in Cold Fusion.


ASP AND ASP.NET


Note: When setting HTTP headers from ASPs, make sure you either place the Response method calls before any HTML generation, or use Response.Buffer to buffer the output. Also, note that some versions of IIS set a Cache-Control: private header on ASPs by default, and must be declared public to be cacheable by shared caches.

Active Server Pages, built into IIS and also available for other Web servers, also allows you to set HTTP headers. For instance, to set an expiry time, you can use the properties of the Response object;

<% Response.Expires=1440 %>

specifying the number of minutes from the request to expire the representation. Cache-Control headers can be added like this:

<% Response.CacheControl="public" %>

In ASP.NET, Response.Expires is deprecated; the proper way to set cache-related headers is with Response.Cache;

Response.Cache.SetExpires ( DateTime.Now.AddMinutes ( 60 ) ) ;
Response.Cache.SetCacheability ( HttpCacheability.Public ) ;

References and Further Information

HTTP 1.1 SPECIFICATION

The HTTP 1.1 spec has many extensions for making pages cacheable, and is the authoritative guide to implementing the protocol. See sections 13, 14.9, 14.21, and 14.25.

WEB-CACHING.COM
An excellent introduction to caching concepts, with links to other online resources.

ON INTERPRETING ACCESS STATISTICS


Jeff Goldberg's informative rant on why you shouldn't rely on access statistics and hit counters.

REDBOT
Examines HTTP resources to determine how they will interact with Web caches, and generally how well they use the protocol.

CGI_BUFFER LIBRARY
One-line include in Perl CGI, Python CGI and PHP scripts automatically handles ETag generation and validation, Content-Length generation and gzip Content-Encoding correctly. The Python version can also be used as a wrapper around arbitrary CGI scripts.

About This Document

This document is Copyright 1998-2012 Mark Nottingham <mnot@mnot.net>. This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.

All trademarks within are property of their respective holders.

Although the author believes the contents to be accurate at the time of publication, no liability is assumed for them, their application or any consequences thereof. If any misrepresentations, errors or other need for clarification is found, please contact the author immediately.


The latest revision of this document can always be obtained from http://www.mnot.net/cache_docs/

Translations are available in: Belarusian, Chinese, Czech, German, and French.

February 9, 2012
