
Overlay Networks

From its inception, the Internet has adopted a clean model, in which the routers inside the network are responsible for forwarding packets from source to destination, and application programs run on the hosts connected to the edges of the network. The client/server paradigm illustrated by the applications discussed in the first two sections of this chapter certainly adheres to this model.

In the last few years, however, the distinction between packet forwarding and application processing has become less clear. New applications are being distributed across the Internet, and in many cases, these applications make their own forwarding decisions. These new hybrid applications can sometimes be implemented by extending traditional routers and switches to support a modest amount of application-specific processing. For example, so-called level-7 switches sit in front of server clusters and forward HTTP requests to a specific server based on the requested URL. However, overlay networks are quickly emerging as the mechanism of choice for introducing new functionality into the Internet.

You can think of an overlay as a logical network implemented on top of a physical network. By this definition, the Internet itself is an overlay network, which is, in fact, a true statement. Figure 9.17 depicts an overlay implemented on top of an underlying network. Each node in the overlay also exists in the underlying network; it processes and forwards packets in an application-specific way. The links that connect the overlay nodes are implemented as tunnels through the underlying network. Multiple overlay networks can exist on top of the same underlying network, each implementing its own application-specific behavior, and overlays can be nested, one on top of another. For example, all of the example overlay networks discussed in this section treat today's Internet as the underlying network.

We have already seen examples of tunneling, for example, to implement virtual private networks (VPNs). As a brief refresher, the nodes on either end of a tunnel treat the multihop path between them as a single logical link, where the nodes that are tunneled through forward packets based on the outer header, never aware that the end nodes have attached an inner header. For example, Figure 9.18 shows three overlay nodes (A, B, and C) connected by a pair of tunnels. In this example, overlay node B might make a forwarding decision for packets from A to C based on the inner header (IHdr), and then attach an outer header (OHdr) that identifies C as the destination in the underlying network. Nodes A, B, and C are able to interpret both the inner and outer headers, whereas the intermediate routers understand only the outer header. Similarly, A, B, and C have addresses in both the overlay network and the underlying network, but they are not necessarily the same; for example, their underlying address might be a 32-bit IP address, while their overlay address might be an experimental 128-bit address. In fact, the overlay need not use conventional addresses at all, but may route based on URLs, domain names, an XML query, or even the content of the packet.

Routing Overlays

The simplest kind of overlay is one that exists purely to support an alternative routing strategy; no additional application-level processing is performed at the overlay nodes. You can view a virtual private network (see Section 4.1.8) as an example of a routing overlay, but one that doesn't so much define an alternative strategy or algorithm as it defines alternative routing table entries to be processed by the standard IP forwarding algorithm. In this particular case, the overlay is said to use IP tunnels, and the ability to utilize these VPNs is supported in most commercial routers.

Suppose, however, you wanted to use a routing algorithm that commercial router vendors were not willing to include in their products. How would you go about doing it? You could simply run your algorithm on a collection of end hosts and tunnel through the Internet routers. These hosts would behave like routers in the overlay network: As hosts they are likely connected to the Internet by only one physical link, but as nodes in the overlay they would be connected to multiple neighbors via tunnels.

Since overlays, almost by definition, are a way to introduce new technologies independent of the standardization process, there are no standard overlays we can point to as examples. Instead, we illustrate the general idea of routing overlays by describing several experimental systems recently proposed by network researchers.

Experimental Versions of IP

Overlays are ideal for deploying experimental versions of IP that you hope will eventually take over the world. For example, IP multicast is an extension to IP that interprets class D addresses (those with the prefix 1110) as multicast addresses. IP multicast is used in conjunction with one of the multicast routing protocols, such as DVMRP, described in Section 4.4.

The MBone (multicast backbone) is an overlay network that implements IP multicast. One of the most popular applications run on top of the MBone is vic, a tool that supports multiparty videoconferencing. vic is used to broadcast both seminars and meetings across the Internet. For example, IETF meetings, which are a week long and attract thousands of participants, are generally broadcast over the MBone.

Like VPNs, the MBone uses both IP tunnels and IP addresses, but unlike VPNs, the MBone implements a different forwarding algorithm: it forwards packets to all downstream neighbors in the shortest-path multicast tree. As an overlay, multicast-aware routers tunnel through legacy routers, with the hope that one day there will be no more legacy routers.

The 6Bone is a similar overlay that is used to incrementally deploy IPv6. Like the MBone, the 6Bone uses tunnels to forward packets through IPv4 routers. Unlike the MBone, however, 6Bone nodes do not simply provide a new interpretation of IPv4's 32-bit addresses. Instead, they forward packets based on IPv6's 128-bit address space. Moreover, since IPv6 supports multicast, so does the 6Bone.
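The inner/outer header forwarding described earlier for overlay nodes A, B, and C can be illustrated with a small sketch. This is only one way to model it, under assumptions not found in the text: the class names, the route table layout, and all addresses (taken from documentation-reserved ranges) are hypothetical. Overlay addresses are 128 bits and underlay addresses are 32 bits, echoing the example above.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address


@dataclass(frozen=True)
class TunneledPacket:
    # Outer header (OHdr): the only part intermediate routers look at.
    outer_src: IPv4Address
    outer_dst: IPv4Address
    # Inner header (IHdr): interpreted only by overlay nodes.
    inner_src: IPv6Address
    inner_dst: IPv6Address
    payload: bytes


class OverlayNode:
    """Hypothetical overlay node with one address in each network."""

    def __init__(self, overlay_addr, underlay_addr, routes):
        self.overlay_addr = overlay_addr
        self.underlay_addr = underlay_addr
        # Maps an overlay destination to the underlay address of the
        # next overlay hop (the far end of a tunnel).
        self.routes = routes

    def send(self, overlay_dst, payload):
        """Originate a packet addressed to an overlay destination."""
        return self._encapsulate(self.overlay_addr, overlay_dst, payload)

    def forward(self, packet):
        """Decide on the inner header, then attach a fresh outer header."""
        return self._encapsulate(packet.inner_src, packet.inner_dst,
                                 packet.payload)

    def _encapsulate(self, inner_src, inner_dst, payload):
        next_hop = self.routes[inner_dst]
        return TunneledPacket(self.underlay_addr, next_hop,
                              inner_src, inner_dst, payload)


# Made-up addresses for the three overlay nodes.
A_OV, B_OV, C_OV = (IPv6Address(f"2001:db8::{x}") for x in "abc")
A_UN, B_UN, C_UN = (IPv4Address(f"192.0.2.{n}") for n in (1, 2, 3))

node_a = OverlayNode(A_OV, A_UN, {C_OV: B_UN})  # A reaches C through B
node_b = OverlayNode(B_OV, B_UN, {C_OV: C_UN})  # B tunnels directly to C

at_b = node_a.send(C_OV, b"hello")  # travels over tunnel A -> B
at_c = node_b.forward(at_b)         # travels over tunnel B -> C
```

In this sketch the inner header is unchanged end to end, while the outer header is rewritten at each overlay hop, so routers between the tunnel endpoints never need to understand the overlay's addressing.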
End System Multicast

Although the MBone remains a popular overlay, IP multicast has failed to take over the world, and in response, multicast-based applications like videoconferencing have recently turned to an alternative strategy, called end system multicast. The idea of end system multicast is to accept that IP multicast will never become ubiquitous, and to instead let the end hosts that are participating in a particular multicast-based application implement their own multicast trees. (As an aside, there is a school of thought that says IP multicast never took off because it simply doesn't belong at the network layer, since it must support high-layer functionality such as error, flow, and congestion control, as well as membership management.)

Before describing how end system multicast works, it is important to first understand that, unlike VPNs and the MBone, end system multicast assumes that only Internet hosts (as opposed to Internet routers) participate in the overlay. Moreover, these hosts typically exchange messages with each other through UDP tunnels rather than IP tunnels, making it easy to implement as regular application programs. This makes it possible to view the underlying network as a fully connected graph, since every host in the Internet is able to send a message to every other host. Abstractly, then, end system multicast solves the following problem: Starting with a fully connected graph representing the Internet, the goal is to find the embedded multicast tree that spans all the group members.

Since we take the underlying Internet to be fully connected, a naive solution would be to have each source directly connected to each member of the group. In other words, end system multicast could be implemented by having each node send unicast messages to every group member. To see the problem in doing this, especially compared to implementing IP multicast in routers, consider the example topology in Figure 9.19. Figure 9.19(a) depicts an example physical topology, where R1 and R2 are routers connected by a low-bandwidth transcontinental link; A, B, C, and D are end hosts; and link delays are given as edge weights. Assuming A wants to send a multicast message to the other three hosts, Figure 9.19(b) shows how naive unicast transmission would work. This is clearly undesirable because the same message must traverse the link A-R1 three times, and two copies of the message traverse R1-R2. Figure 9.19(c) depicts the IP multicast tree constructed by DVMRP. Clearly, this approach eliminates the redundant messages. Without support from the routers, however, the best you can hope for with end system multicast is a tree similar to the one shown in Figure 9.19(d). End system multicast defines an architecture for constructing this tree.

The general approach is to support multiple levels of overlay networks, each of which extracts a subgraph from the overlay below it, until we have selected the subgraph that the application expects. For end system multicast in particular, this happens in two stages: First we construct a simple mesh overlay on top of the fully connected Internet, and then we select a multicast tree within this mesh. The idea is illustrated in Figure 9.20, again assuming the four end hosts A, B, C, and D. The first step is the critical one: Once we have selected a suitable mesh overlay, we simply run a standard multicast routing algorithm (e.g., DVMRP) on top of it to build the multicast tree. We also have the luxury of ignoring the scalability issue that Internet-wide multicast faces since the intermediate mesh can be selected to include only those nodes that want to participate in a particular multicast group.

The key to constructing the intermediate mesh overlay is to select a topology that roughly corresponds to the physical topology of the underlying Internet, but we have to do this without anyone telling us what the underlying Internet actually looks like since we are running only on end hosts and not routers. The general strategy is for the end hosts to measure the round-trip latency to other nodes and to decide to add links to the mesh only when they like what they see. This works as follows.

First, assuming a mesh already exists, each node exchanges the list of all other nodes it believes are part of the mesh with its directly connected neighbors. When a node receives such a membership list from a neighbor, it incorporates that information into its membership list and forwards the resulting list to its neighbors. This information eventually propagates through the mesh, much as in a distance-vector routing protocol.

When a host wants to join the multicast overlay, it must know the IP address of at least one other node already in the overlay. It then sends a join mesh message to this node. This connects the new node to the mesh by an edge to the known node. In general, the new node might send a join message to multiple current nodes, thereby joining the mesh by multiple links. Once a node is connected to the mesh by a set of links, it periodically sends keepalive messages to its neighbors, letting them know that it still wants to be part of the group.

When a node leaves the group, it sends a leave mesh message to its directly connected neighbors, and this information is propagated to the other nodes in the mesh via the membership list described above. Alternatively, a node can fail, or just silently decide to quit the group, in which case its neighbors detect that it is no longer sending keepalive messages. Some node departures have little effect on the mesh, but should a node detect that the mesh has become partitioned due to a departing node, it creates a new edge to a node in the other partition by sending it a join mesh message.
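The membership exchange and keepalive-based failure detection described above might be sketched roughly as follows. This is a simplified illustration, not the protocol's actual message format: the MeshNode class, the 30-second timeout, and the four-node line mesh are all assumptions made for the example, and the sketch only spreads additions; it does not attempt to propagate departures through the membership list.

```python
class MeshNode:
    """Hypothetical mesh member tracking membership and neighbor liveness."""

    def __init__(self, name, keepalive_timeout=30.0):
        self.name = name
        self.keepalive_timeout = keepalive_timeout  # assumed value, in seconds
        self.members = {name}   # every node this node believes is in the mesh
        self.last_heard = {}    # neighbor name -> time of its last keepalive

    def add_neighbor(self, neighbor, now):
        """Record a mesh edge, e.g. after a join mesh message."""
        self.last_heard[neighbor] = now
        self.members.add(neighbor)

    def receive_membership(self, neighbor_members):
        """Merge a neighbor's list; True means ours changed and should be re-sent."""
        before = len(self.members)
        self.members |= neighbor_members
        return len(self.members) != before

    def receive_keepalive(self, neighbor, now):
        if neighbor in self.last_heard:
            self.last_heard[neighbor] = now

    def expire_neighbors(self, now):
        """Drop neighbors that have been silent for longer than the timeout."""
        dead = [n for n, t in self.last_heard.items()
                if now - t > self.keepalive_timeout]
        for n in dead:
            del self.last_heard[n]
            self.members.discard(n)
        return dead


def gossip_until_stable(nodes, edges):
    """Exchange lists across every edge until no node learns anything new."""
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            from_v = set(nodes[v].members)
            from_u = set(nodes[u].members)
            changed |= nodes[u].receive_membership(from_v)
            changed |= nodes[v].receive_membership(from_u)


# A made-up line mesh A - B - C - D, with all edges created at time 0.
nodes = {name: MeshNode(name) for name in "ABCD"}
edges = [("A", "B"), ("B", "C"), ("C", "D")]
for u, v in edges:
    nodes[u].add_neighbor(v, now=0.0)
    nodes[v].add_neighbor(u, now=0.0)

gossip_until_stable(nodes, edges)

# B keeps sending keepalives to C; D goes silent.
nodes["C"].receive_keepalive("B", now=40.0)
expired = nodes["C"].expire_neighbors(now=45.0)
```

After the exchange, A knows about D even though they share no edge, which is the distance-vector-like propagation the text describes. Node C then notices D's silence purely from the missing keepalives.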
Note that multiple neighbors can simultaneously decide that a partition has occurred in the mesh, leading to multiple cross-partition edges being added to the mesh.

As described so far, we will end up with a mesh that is a subgraph of the original fully connected Internet, but it may have suboptimal performance because (1) initial neighbor selection adds random links to the topology, (2) partition repair might add edges that are essential at the moment but not useful in the long run, (3) group membership may change due to dynamic joins and departures, and (4) underlying network conditions may change. What needs to happen is that the system must evaluate the value of each edge, resulting in new edges being added to the mesh and existing edges being removed over time.

To add new edges, each node i periodically probes some random member j that it is not currently connected to in the mesh, measures the round-trip latency of edge (i, j), and then evaluates the utility of adding this edge. If the utility is above a certain threshold, link (i, j) is added to the mesh. Evaluating the utility of adding edge (i, j) might look something like this:

    EvaluateUtility(j)
        utility = 0
        for each member m not equal to i
            CL = current latency to node m along route through mesh
            NL = new latency to node m along mesh if edge (i, j) is added
            if (NL < CL) then
                utility += (CL - NL)/CL
        return utility

Deciding to remove an edge is similar, except each node i computes the cost of each link to current neighbor j as follows:

    EvaluateCost(j)
        Cost_ij = number of members for which i uses j as next hop
        Cost_ji = number of members for which j uses i as next hop
        return max(Cost_ij, Cost_ji)

It then picks the neighbor with the lowest cost and drops it if the cost falls below a certain threshold.

Finally, since the mesh is maintained using what is essentially a distance-vector protocol, it is trivial to run DVMRP to find an appropriate multicast tree in the mesh. Note that although it is not possible to prove that the protocol just described results in the optimum mesh network, thereby allowing DVMRP to select the best possible multicast tree, both simulation and extensive practical experience suggest that it does a good job.

Resilient Overlay Networks

Another routing overlay gaining in popularity is one that finds alternative routes for traditional unicast applications. Such overlays exploit the observation that the triangle inequality does not hold in the Internet. Figure 9.21 illustrates what we mean by this. It is not uncommon to find three sites in the Internet (call them A, B, and C) such that the latency between A and B is greater than the sum of the latencies from A to C and from C to B. That is, sometimes you would be better off indirectly sending your packets via some intermediate node than sending them directly to the destination.

How can this be? Well, BGP never promised that it would find the shortest route between any two sites; it only tries to find some route. To make matters worse, there are countless opportunities for human-directed policies to override BGP's normal operation. This often happens, for example, at peering points between major backbone ISPs. In short, that the triangle inequality does not hold in the Internet should not come as a surprise.

How do we exploit this observation? The first step is to realize that there is a fundamental tradeoff between the scalability and optimality of a routing algorithm.
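To make the triangle-inequality observation concrete, the following sketch searches for a single intermediate node that beats the direct path. It is a toy illustration only: the function name and the latency figures are invented for the example and are not measurements or part of any real system.

```python
def best_one_hop_detour(latency, src, dst):
    """Return (relay, total_latency) for the best single-relay path.

    If no relay beats the direct path, return (None, direct_latency).
    """
    direct = latency[src][dst]
    best_relay, best_total = None, direct
    for relay in latency:
        if relay in (src, dst):
            continue
        total = latency[src][relay] + latency[relay][dst]
        if total < best_total:
            best_relay, best_total = relay, total
    return best_relay, best_total


# Made-up symmetric latencies (in ms) among three sites A, B, and C.
# The A-B entry violates the triangle inequality: 100 > 30 + 40.
latency = {
    "A": {"B": 100, "C": 30},
    "B": {"A": 100, "C": 40},
    "C": {"A": 30, "B": 40},
}

a_to_b = best_one_hop_detour(latency, "A", "B")  # ("C", 70)
a_to_c = best_one_hop_detour(latency, "A", "C")  # (None, 30)
```

Note that the search examines every candidate relay for a given pair, and it relies on having a latency measurement for every pair of sites, which hints at why optimality and scalability pull in opposite directions.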
