
METHODS IN DATA VISUALIZATION: A guide for the Energy professional

John Maxwell and Natalie Ballew, August 2013

INTRODUCTION
The purpose of this paper is to examine the field of information visualization and to define the ways in which, and the extent to which, an Energy and Earth Resource (EER) professional would benefit from employing effective methods of presenting data visually. As issues of natural resources and economics become more complex in their interactions and outcomes, simple data tables will no longer suffice to communicate concerns or conclusions. The ability to take raw data and translate it into a meaningful argument is the true test of a professional whose job it is to provide decision support. EER professionals work directly with decision makers; knowing how to frame issues and evidence with a strong link to data is an extremely valuable skill to bring to fact-based decision making. Exploring how to use data to communicate a concern, rather than just to show data (Lima), can prove useful to an EER professional.

The visual display of information is not a new idea. Hieroglyphics and cave drawings were among the first examples, packing descriptions, stories, and knowledge into simple, easily understood drawings. Astronomers Carl Sagan and Frank Drake created a graphic, attached to the Pioneer spacecraft in 1972 (Figure 1), that was intended to communicate across all forms of intelligent life. While it is unknown whether other forms of life would understand the graphic, the design elements within are simple, using line, proportion, and proximity to describe the layout of the solar system, the size of the spacecraft relative to a human being, and the hydrogen atom.

Figure 1: NASA image of Pioneer 10 plaque, 1972

An effective visual informational display can squeeze multiple levels of information into a single graphical representation that is easier to understand than spreadsheets of data or a long string of words. As Galileo stated in 1610 (via Edward Tufte's Beautiful Evidence), "the disputes which for so many generations have vexed are destroyed by visible certainty, and we are liberated from wordy arguments." Advanced computing power and the exponential expansion of available data make the idea of visible certainty a much more powerful concept than wordy arguments. An observation unaccompanied by visual evidence is not readily welcomed these days, so the ability to harness the data and turn it into a more tangible piece of information has great strength in communicating concerns.

EER has a vast landscape of available data that lends itself well to visualization. From crude oil spot prices to stratigraphic columns to groundwater model outputs, analysis of data and systems requires a keen eye and a grasp of the basics of visualization theory. This paper covers several concepts from authors in visualization techniques, along with foundational principles and guidelines for best data visualization practices. The next few paragraphs introduce these concepts. Part II of the paper covers two specific EER topics as examples of how to apply some of these principles. Part III frames the importance of data visualization skills to the EER professional and will hopefully encourage deep thought about how other data should be represented visually.

PART I: KEY CONCEPTS, PRINCIPLES, AND GUIDELINES TO DATA VISUALIZATION
To begin thinking about data and its connection to energy and earth resources fields, the following two quotes illustrate how data can be thought of as a natural element in today's world.

"Information gently but relentlessly drizzles down on us in an invisible, impalpable electric rain."
Hans Christian von Baeyer, Information: The New Language of Science

"Today we live invested with an electric information environment that is quite as imperceptible to us as water is to a fish."
Marshall McLuhan, Counterblast

The quotes (also cited in Visual Complexity: Mapping Patterns of Information by Manuel Lima) spark thinking on the ubiquitous nature of data and the necessity and importance of data visualization.

Data is integrated into aspects of everyday life. Humans are constantly gathering a wealth of new information from social interactions, nature, the surrounding environment, and technology. The ability to sort through a large amount of data, provide a conduit for the information to pass through, and frame and anchor an argument to a logical conclusion at the end of the conduit is an emerging necessity for the EER professional.

Colin Ware, the Director of the Data Visualization Research Lab at the Center for Coastal and Ocean Mapping at the University of New Hampshire, specializes in advanced data visualization and has a special interest in applications of visualization for ocean mapping. Ware describes visualization in his book Information Visualization as an external artifact which supports decision making. Visualizations can provide an ability to comprehend huge amounts of data, allow for the perception of emergent properties that were not anticipated, facilitate hypothesis formation, and reveal qualities not only about the data itself but also about the way it was collected (Ware). However, a poorly designed visualization forfeits the benefits that a well-executed one provides. Details on how to avoid the pitfalls that undermine an otherwise excellent visualization are discussed later.

Part of what makes a good visualization is how the image is physically processed by the brain taking in information. Information that can deliver substantiated data for policy decisions is most useful when integrated into a process that leverages the capabilities of the [human] visual system to move a huge amount of information into the brain very quickly.

Figure 2: The visualization process. (Ware)

As shown in Figure 2, information moving from its abstract data form into the brain's visual processing unit undergoes a process that shapes the data into a visualization, allowing information to move from point A (abstract data) to point B (brain) in an explanatory framework (Iliinsky). The best visualizations incorporate methods of good design (an art-intensive angle) and solid scientific, statistical, and mathematical methods (a science-intensive angle). Taking strong points from each angle creates a feedback loop that takes raw data and transforms or manipulates it to create a tangible map of patterns, connections, and structures out of intangible evidence (Ware/Lima). Effectively combining the art and science aspects of visualization creates a clear and strong path for data exploration, manipulation, and broader context and application. The EER professional should differentiate themselves in the area between the data and the visual response elements of the reader's brain. Making data easy to understand, with direct causality and clear comparisons, will deliver high-quality returns for the concept and evidence that should be offered quickly and accurately to decision makers.

The fine line between science and art within data visualization is becoming thinner as computer processing power progresses and delivers tools able to analyze multivariate problems in a more approachable form. A few of the numerous programs available are discussed in Part II. Understanding the theories, processes, and opportunities available to translate raw data into decision-making tools can prove ineffective and even disastrous without holding to a quality standard of basic design principles and guidelines associated with making visual displays of information.

Best Practices: Visualization Creation Workflow

In sync with the recent prevalence of large amounts of complex data, or "big data," there has been an explosion of literature on data visualization and best practices. A grasp of these best practices can merge with available tools to create the best display of data. Most of this literature is derived from the work of Edward Tufte, who has written a series of books on data presentation. In the book The Visual Display of Quantitative Information, Tufte describes what makes graphical excellence, and these principles are incorporated into the workflow presented here.

Graphics should be presented in a simple, yet multidimensional way so that the viewer can focus on the data and what can emerge from the data, rather than on the methods employed to create the graphic. The first step to achieving graphical excellence is to adhere to the basic design principles and elements. The workflow itself does not incorporate the basic design elements, which include the following. Basic design principles are achieved through use of the elements (also see Appendix I).

Design Elements

Line: Graphical features such as axes, gridlines, tick marks, etc. should be minimized to let the data and important information shine through in any graphic.

Color: Our minds do not put an order on the colors of the rainbow, so it is more effective to use shades of a single color when depicting magnitudes or importance. Harsh or vibrant colors can distract the eye from important information in the data; colors found in nature are often more pleasing.

Shape: Do not use overly caricaturized images to represent data, as they will be distracting. Simple shapes that serve multiple purposes (labels and data points, for example) are effective.

Texture: Texture can be employed to add to the information in shapes, but should be minimized and used in a subtle manner.

Space: Space is very important to visualizations. A large amount of information in a small area can allow the viewer to more easily compare data and can promote connections between the data; however, information that is too tightly fit can be difficult to read.

Form: Form is a three-dimensional aspect that should be considered when making three-dimensional interactive visualizations. When making two-dimensional visualizations, it is also important to consider how a three-dimensional object will be translated onto a two-dimensional plane.
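The color guideline above (shades of one color to show magnitude) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the workflow described in this paper: the function name `sequential_palette`, the default hue, and the lightness range are hypothetical choices.

```python
import colorsys


def sequential_palette(n, hue=0.58, saturation=0.6,
                       light_max=0.90, light_min=0.25):
    """Return n hex colors of a single hue, ordered light to dark.

    All defaults are illustrative assumptions; hue 0.58 is a muted blue.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    colors = []
    for i in range(n):
        # t runs from 0 (smallest magnitude) to 1 (largest magnitude)
        t = i / (n - 1) if n > 1 else 0.0
        lightness = light_max + t * (light_min - light_max)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors


if __name__ == "__main__":
    # Five shades: the lightest for the smallest class, the darkest for the largest
    print(sequential_palette(5))
```

Because only lightness changes, the viewer can read "darker means more" without consulting a legend, which a rainbow palette does not allow.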

The workflow presented in Figure 3 incorporates design functions, principles, elements, and guidelines outlined in the various literature available on data visualization and graphical excellence.

Figure 3: Visualization Creation Workflow. *Design elements, described earlier, should also be used in creation.

One of the most important elements of the data behind a graphic is the presence of more than two variables. These types of displays have more depth and room for expansion than displays with two variables alone. Multivariate displays include familiar graphics: maps, bar charts, scatterplots, line graphs, etc. More effective displays include narrative graphics of time and space and relational graphics. Narrative graphics show data moving over space and time and are a great way to incorporate a larger number of variables. Tufte uses the Charles Joseph Minard graphic of Napoleon's 1812 Russia campaign as an example of a narrative graphic (Figure 4 below).

Figure 4: Minard's chart of Napoleon's 1812 Russia Campaign

This chart tells the story of Napoleon's troops' journey to Russia and back. The chart incorporates complexity that includes six variables, among them time, geographical location, army size, and temperature, in a subtle way. It tells a story rather than just giving the data. This chart not only fulfills the six principles, but also fulfills the basic principles of graphical excellence, giving the viewer the greatest amount of information using the least amount of ink (Tufte). Narrative graphics present quantitative and sometimes qualitative information and lead the viewer to deduce a conclusion and explore further potentials about the topic presented.

A basic understanding of design and graphics principles allows full attention on data manipulation and mastering available processing tools. For any EER professional, it is ideal to be able to take in a large amount of data and present it in a way that is easy to digest, so that further discussion on what story the data is telling can be pursued. This skill is important to an EER professional because much of the information in the EER field covers multiple topics, such as water, finance, energy, and commodities. Mastering interdisciplinary data requires knowledge of the best way to display the data; the final purpose and audience should also be considered in choosing the type of visualization.

In the next section, we explore the methods used with several different types of data representative of the EER field: economic and energy data visualized using area charts, parallel sets, and Excel graphs; a network analysis of bibliographic records to determine emergence and prevalence of a particular topic; and a treemap visualization of groundwater model runs to view the effects of different pumping parameters. We discuss the purpose for selecting each method, and what works and what could be improved with each visualization.

PART II: EXAMPLE DATA SETS AND VISUALIZATIONS
ENERGY AND ECONOMIC DATA

The practice of displaying economic data as well as energy data is, in itself, a very large subcategory of the data visualization field. The goal is to display these two types of data together and to determine the best way to show data for a large number of countries over a time period of about 30 years. Time becomes the primary differentiator and trend type for making comparisons among the data.

The questions that the data seek to answer have to do with the ways in which energy and the economy affect each other and create feedback loops. There is vigorous debate as to whether greater consumption of energy results in greater economic growth or whether the relationship works the other way, with greater economic growth resulting in greater energy use. This relationship and the interaction among different types of primary energy sources is the principal issue that John Maxwell's thesis explores. Building on the work of Carey King and his investigation of net energy measures in the United States, the data set covers 44 countries over a time period from 1978 to 2010. The data visualization explores different trends and comparisons between countries and the differing causality relationships. The Tufte analytical criteria are the foundation for the way the data is displayed and explained.

There are several different programs that can be used to show this energy and economic information. Here the data is displayed in Microsoft Excel, Tableau, a parallel sets program, and some exploratory steps into R. All of these programs can be useful for creating clear and proper displays of information.

The first data visualization is a simple chart representing the percentage of gross domestic product (GDP) spent on energy and the percentage change in GDP between 1978 and 2010. This chart was created in Microsoft Excel and is a simple line chart. Putting data into Excel initially helps to look at the trend for these two measurements over time. The default for this chart had blue and purple lines (Figure 5), which did not differentiate the changes over time well enough. Since color is not naturally ordered, in the sense that there is no natural rule for reds and blues to create a hierarchy (Appendix II), it is up to the creator to determine where there is a need to vary the color. Color helps the reader differentiate the points that the author is trying to make.
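The two series plotted in this first chart can be derived from raw annual figures with simple arithmetic. The sketch below is a minimal illustration, assuming annual energy expenditure and GDP are available in the same currency units; the function names and the sample numbers are hypothetical and are not taken from the data set used in this paper.

```python
def energy_share_pct(energy_spend, gdp):
    """Percentage of GDP spent on energy, year by year."""
    return [100.0 * e / g for e, g in zip(energy_spend, gdp)]


def gdp_change_pct(gdp):
    """Year-over-year percentage change in GDP (one value fewer than years)."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(gdp, gdp[1:])]


if __name__ == "__main__":
    # Made-up figures for three consecutive years (same currency units)
    gdp = [100.0, 110.0, 121.0]
    energy = [8.0, 11.0, 12.1]
    print(energy_share_pct(energy, gdp))  # share of GDP spent on energy
    print(gdp_change_pct(gdp))            # growth from each year to the next
```

Note that the change series starts one year later than the share series, so the two lines need to be aligned on the year axis before plotting.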


Figure 5: Initial Excel line graph

Figure 6: Revised Excel line graph

In Figure 6, the revised version of Figure 5, a title was added and the x-axis was shifted, made less busy, and enlarged. The default legend was deleted, and labels for the lines were added in the same color as the lines. Avoiding defaults in Microsoft Excel has become nearly an ironclad rule. The labeling in this case is also placed on top to anchor the fact that, for the most part, the percentage of GDP spent on energy has been greater than the percentage change in GDP.

The next few figures were created in Tableau, an out-of-the-box visualization software package created to explore and analyze data visually. Tableau uses intuitive drag-and-drop visualization creation. It works very well for spatial information and can help to create maps and other location-based displays.

This first figure (Figure 7) is part of the energy information and was one of the creator's first attempts at a visualization in Tableau. It is a very busy graph, and many lessons have been learned since this first attempt.

Figure 7: Tableau Visualization 1

The next few graphics were created after research in the data visualization field and a couple of Tableau tutorials.

Figure 8 is an area chart representing expenditures on different types of energy sources, as well as the amount of energy consumed, in energy units, in the world between 1978 and 2010. The area chart uses different colors to represent the types of energy shown in the key to the right. This chart emphasizes the amount the world spends on crude oil and how the amount of energy consumed has not changed dramatically, while aggregate expenditures have increased by nearly 400 percent.

Figure 8: Area Chart: Worldwide energy expenditures and energy consumption

Figures 10 and 11 are attempts to use a different color palette and to match colors across space and time. The bubble chart uses area and color to demonstrate the amount of GDP that each country spends on energy. The scale at the top is the same in the two figures. The size and placement of the bubbles draw on the Gestalt principle of proximity. The countries are spaced so as not to overwhelm the reader (Lima). The following figures are examples of Gestalt principles in practice and of how to classify different aspects of proximity and arrangement.
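Because the bubble chart encodes each value as area, the radius of a bubble should grow with the square root of the value; scaling the radius directly would exaggerate large values. The following sketch illustrates that scaling under stated assumptions: `bubble_radii` and `max_radius` are hypothetical names, values are assumed to be non-negative with at least one positive entry, and this is not a description of how Tableau sizes its marks.

```python
import math


def bubble_radii(values, max_radius=40.0):
    """Radii such that bubble AREA is proportional to each value.

    The largest value gets max_radius; every other radius is scaled by the
    square root of its value relative to the largest.
    """
    top = max(values)
    if top <= 0:
        raise ValueError("at least one value must be positive")
    return [max_radius * math.sqrt(v / top) for v in values]


if __name__ == "__main__":
    # A value 16 times larger gets a radius only 4 times larger
    print(bubble_radii([1, 4, 16]))
```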

Figure 9: Connectedness is a powerful grouping principle that is stronger than a) proximity, b) color, c) size, or d) shape. Connectedness using smooth continuous lines is easier to understand than abrupt lines.

Figure 10: World Map Energy % GDP

Figure 11: Bubble chart energy % GDP

If all the large bubbles in Figure 11 were right on top of each other, there would not be enough context for the reader to gain much appreciation for what the data is expressing. The next level for this visualization would be for the bubbles to be arranged in the order of the map. The map shows the countries across the world and how there is a relationship between location and how a country is spending its GDP. This becomes relevant since oil is the most traded commodity, the amount of world GDP spent on crude oil was the highest, and there seems to be a correlation between the amount spent on oil and the countries that produce oil.

Figure 12 is a visualization of a parallel set, which is a way to visualize multiple variables and their relationship to each other. The pink box is what is called brushing and can show where the average values lie within the data set. Parallel sets are a way to display large amounts of discrete pattern data. In this plot there are several dimensions, each represented by a vertical line; each vertical line is a new dimension with its own scale. This type of visualization can give insight into the effect of evaluating a set of data based on changes within the data set. Color offers a couple of benefits in this visualization, as there is a stratification that carries along the lines throughout the dimensions of the values.

Figure 12: Parallel set

Figure 13: Scatterplot made in R

Figure 13 is a basic scatterplot created in R and a formatted scatterplot matrix created with the xdmv tool, which was also used to create the parallel set visualization. Scatterplots are an excellent way to present discrete pattern data containing two dimensions, but can also be used to represent three dimensions when there is variation in size or color within the scatterplot, as depicted in Figure 14.

Figure 14: Three-dimensional discrete data. The third dimension is given by a) point size, b) gray value, and c) phase of oscillatory point motion. (Ware)

BIBLIOGRAPHIC NETWORK and GROUNDWATER DATA

In exploring a topic or picking a question to research, it is useful to be aware of what kinds of research are already going on in the area of interest. Exploration of bibliographic networks, which show the connectedness of documents, is a way to set the stage for a particular research question and define the relevancy of a topic. The example bibliographic network here was created for the topic of groundwater uncertainty. There are multiple informatics tools that analyze the existing network of publications on a topic; here Sci2 was used.

Sci2 can import numerous references from the Web of Science database and create a network visualization to explore the relevancy and connectedness of a topic. My broad search yielded more than 2,000 results in Web of Science; Figure 15 shows the resulting visualization. (Appendix IV is a step-by-step guide to creating this visualization in Sci2.)

Figure 15: Sci2 network analysis

Links (edges) between circles (nodes) indicate occasions when one article referenced another. The size and color of each node, which represents an individual article or citation, indicate the number of times that the article has been cited by the other sources it is connected to. Ideally, labels showing either titles or journal topics would be more useful, but limited knowledge of how to manipulate the final image prevented this; it is being pursued further. Within the program, however, hovering the mouse over a node or edge reveals which article is represented and which articles it is connected to. These descriptors allow for comparison of the major clusters of citations.

From a design perspective, the visualization initially created by the algorithm in Sci2 has nodes that are all the same size and color, with thick lines outlining each node. Alterations to the design can be made to incorporate design principles and make the network more visually appealing and easier for the viewer to sift through. The layout of Figure 15 is one of several options in Sci2; this layout (GEM) creates clusters that help to guide the reader visually.

Networks are a powerful tool, especially when looking for emergent properties. Appendix III shows fifteen other types and styles of network visualizations. Networks can go beyond the bibliographic realm and can be used to explore interactions among multiple variables, exposing key relationships. From an EER perspective, a telling network visualization that exposes a relationship between variables that had not been considered or well understood before can be exceptionally useful in decision-making situations.

After an initial exploration of the relevancy of a topic, it is time to explore the data available. In terms of groundwater uncertainty, the data set here is a group of groundwater simulation runs (10,256 to be exact). This is a daunting amount of numbers to begin to wrap one's head around, so it is useful to be able to take the set piece by piece to get a feel for what attributes the data carries and what those pieces are doing. As with the energy and economic data earlier, several runs of the data were put into Excel as a line graph to determine any trends (Figure 16). With large data sets it is useful to get a feel for the general trends in the data with a basic visualization, like a bar chart, line graph, or scatterplot, to determine the next best step.

Water Table Levels from Groundwater Availability Model: Barton Springs Aquifer, Zones 1-11
[Line chart. Y-axis: Feet (400 to 1300). X-axis: Year (1 to 10). Legend: Zone 1, Zone 10, Series 3 to Series 7, Series 20 to Series 23.]

Figure 16: Water table levels from one data run.

In this graph, a general trend in the change of the water table over a ten-year period can be observed. Each line represents a zone denoted within the groundwater model. In line with the Excel rule mentioned previously, all Excel graph defaults were overridden. The color and weight of the lines were changed to create a more visually appealing and easy-to-read chart. Font size and location of numbers on the axes were changed. Horizontal gridlines were minimized so as not to distract from the data. Figure 16 is missing descriptors for each series, although these are unimportant at this stage, as there are still further steps to be taken with the data and this graph was simply used for an initial grasp of what the data set contains.

The line graph in Figure 16 includes only one out of thousands of model runs. The next question in this process is whether there is any variation among the numerous data runs. To explore this, a treemap was created for a sample of 100 runs. Treemaps are based on hierarchical data, as represented by Figure 17.

Figure 17: a) A treemap representation of hierarchical data. Areas represent the amount of data stored in the tree data structure. b) The same tree structure, represented using a conventional node-link diagram (Ware).
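The idea that treemap areas represent the amount of data under each branch can be made concrete with a small layout routine. The sketch below uses the simple "slice-and-dice" scheme, which alternates the split direction at each level of the hierarchy; it is an illustrative assumption and not the algorithm used by the treemap software in this paper. The nested-dictionary input format and the function names are likewise hypothetical, and all leaf sizes are assumed to be positive.

```python
def total(node):
    """Sum of the leaf sizes under node (a number, or a dict of children)."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, w, h, depth=0, name="root", out=None):
    """Return a list of (name, x, y, width, height) rectangles for the leaves.

    Each rectangle's area is proportional to the leaf's size. Even depths
    split the parent rectangle along its width, odd depths along its height.
    """
    if out is None:
        out = []
    if not isinstance(node, dict):
        out.append((name, x, y, w, h))
        return out
    size = total(node)
    offset = 0.0  # fraction of the parent already used by earlier children
    for child_name, child in node.items():
        frac = total(child) / size
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * w, y, frac * w, h,
                           depth + 1, child_name, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, frac * h,
                           depth + 1, child_name, out)
        offset += frac
    return out


if __name__ == "__main__":
    tree = {"a": 2, "b": {"c": 1, "d": 1}}
    for rect in slice_and_dice(tree, 0.0, 0.0, 4.0, 2.0):
        print(rect)
```

Slice-and-dice is easy to follow but tends to produce thin rectangles for deep hierarchies; other layout schemes trade that simplicity for more square-like cells.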

The hierarchical structure was not yet well defined in the groundwater data, but an initial image (Figure 18) showed that there was variation among the runs.

Figure 18: Initial Tree Map

Significant data manipulation still needs to be done on the data set, but this treemap is a start toward showing how this type of visualization can be used. Here, only one month of data for each model run is used; a truly effective treemap would need to use each month of each run, or an average of some sort. The important aspects of this image are the use of color, mild and not overwhelming, and the tone. With color variation, it is best to use different tones of one color, since the human mind does not assign a hierarchy to color, with the exception of reds and greens, which carry general connotations. The proximity of the boxes within the treemap allows for easy comparison. This image still lacks labels, but the software has a hover feature like the one in Sci2.

With this particular data set, much more statistical analysis needs to be completed, as mentioned, but these few graphics give an initial idea of where to go next in data analysis. In the future, a network analysis and visualization would be ideal for this data set. Being able to view links among the groundwater parameters present in the data runs, particularly the link between spring flow, water table level, and pumping, can be useful in real-life decision scenarios for determining how best to manage groundwater. If a valid and effective visualization is created with this data set, the process can then be replicated for other regions with pressing water issues.

PART III: FURTHER THOUGHTS

This paper has covered a small selection of the vast amount of information that exists in regard to data visualization. There are a multitude of ways to design and craft data visualizations. Over the duration of the independent study course, we have seen several platitudes that are called the data visualization guidelines, and we have quoted and displayed several of them in the paper in conjunction with the Visualization Creation Workflow. Figure 19 below, the Source Trinity (Iliinsky and Steele), displays the connection between data, design, and viewer and can aid in how design should be approached from a broader perspective.

Figure 19: The Source Trinity

This figure shows how all three participants of a visualization (reader, data, and designer) should be viewed. The interactions between the reader and designer and between the reader and data are the two to focus on within an EER context. One of the functions of a visualization is to make a point to the reader. If the visualization is trying to demonstrate positive judgement, strategies should be used in a way that will inform the reader and will "[aim] for a neutral presentation of the facts in such a way that will educate the reader" (Iliinsky/Steele). In this reader-data relationship, the visualization gives the average person the critical information it is trying to display, without adding another dimension to the data that seeks to convince the reader of a specific view. In this sense, the designer is not inserting themselves into the visualization to make an editorial judgment with the data. This is a more formal role for visualizations, suited to an academic or information-providing setting.

The second and perhaps more important relationship in the Source Trinity is the reader-designer relationship. This is where the designer introduces a normative point to the visualization and clearly advocates a position through the design elements they have chosen, in order to persuade the reader to share the designer's point of view. In this situation, the designer takes data that has been manipulated and transformed into a visualization, adopts a viewpoint, and applies that slant to the visualization. This is especially important to understand if the visualization is used in a policy or consulting setting.

Understanding the relationship between data visualization and decision making is the chief concern of this paper. The EER professional has a duty to recognize the role of decision support through visual quantitative and qualitative analysis, to show what must be done within a system, company, nation, or the world, and to use data visualization to craft that analysis in the best way possible.

Appendix I
Ware (Data Types Matrix)

Appendix II
Iliinsky and Steele Data Visualization Encoding Guidelines:

Appendix III
15 Types and Styles of Network Visualization

Arc Diagram
Area Grouping
Centralized Burst
Centralized Ring
Circled Globe
Circularities
Elliptical Implosion
Flow Chart
Organic Rhizome
Radial Convergence
Radial Implosion
Ramification
Scaling Circles
Segmented and Radial Convergence
Sphere

Sources (left to right, top to bottom)
1. https://www.google.com/search?q=arc+diagram&source=lnms&tbm=isch&sa=X&ei=Tjr5Ud2XIZKiqwHp14D4BA&ved=0CAkQ_AUoAQ&biw=1920&bih=995#facrc=_&imgdii=_&imgrc=LWZhDq801x0iM%3A%3BmpcLhHiGSkGi1M%3Bhttp%253A%252F%252Fwww.erna.org%252Frchie%252Fimages%252Foverlap.png%3Bhttp%253A%252F%252Fwww.erna.org%252Frchie%252F%3B930%3B557
2. http://www.computationalgroup.com/tigertiger/cb/index.html
3. http://www.isi.edu/division7/publication_files/heuristics.pdf
4. http://d3.do/en/wpcontent/uploads/2011/10/circle.jpg
5. http://www.telegeography.com/telecommaps/
6. http://musicovery.com/
7. http://www.visualcomplexity.com/vc/project_details.cfm?id=339&index=339&domain=
8. http://www.visualcomplexity.com/vc/project_details.cfm?id=72&index=72&domain=
9. https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&docid=ZhaID_NefM5giM&tbnid=8ak99VobCTiefM:&ved=0CAMQjhw&url=http%3A%2F%2Fdcook020.grads.digitalodu.com%2Fblog%2F%3Fp%3D37&ei=UkH5UbqpLYvoqAGs8oDYBA&bvm=bv.49967636,d.aWM&psig=AFQjCNHth_tHiFHZgqUxJaW7rZ4YROKXNA&ust=1375376068803955
10. http://www.visualcomplexity.com/vc/project_details.cfm?id=278&index=278&domain=
11. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1103573
12. http://www.schmuhl.org/graphopt/
13. http://pages.cs.wisc.edu/~pavlo/papers/graphdrawing06.pdf
14. http://www.visualcomplexity.com/vc/project_details.cfm?id=142&index=142&domain=
15. http://moebio.com/spheres/english.html

Appendix IV
Science of Science (Sci2) Tool Tutorial

Download the latest version of the Sci2 tool: https://sci2.cns.iu.edu/user/welcome.php
Sci2 Manual: http://wiki.cns.iu.edu/display/SCI2TUTORIAL/Science+of+Science+%28Sci2%29+Tool+Manual?from=1oMh
Access the Web of Science database through the UT Library system: http://www.lib.utexas.edu/indexes/titles.php?let=W

Collect Records from Web of Science

In Web of Science, conduct a search on your topic of choice. Once you have your results, follow these steps:

1. At the bottom of the search results page is a light gray box entitled Output Records. You can export 500 records at a time. You can also select the records you want to download.
   Select Records and enter 1 to 500 in step 1.
   Select Full Record and Cited References in step 2.
   Select Save to other Reference Software in the dropdown in step 3. Save as Tab-delimited (Windows/Mac) also works.
   Click Save. Save the exported records to a file folder, and name the file appropriately (by search topic). If you will be exporting in multiple batches, it is helpful to name your files in sequence (_a, _b or _1, _2) because you will compile all records in the next step.
   Repeat the process until all records have been exported.
2. Open the first exported file. At the beginning of the file you will see FN Thomson Reuters Web of Knowledge VR 1.0 and at the end you will see EREF. These notations signify to Sci2 the start and end of the records. If they exist more than once in the records used in Sci2, only the first portion will be analyzed. In the first file, delete the EREF at the end. Save the text file. This will be your compilation file.
3. Open the second exported file. Delete FN Thomson Reuters Web of Knowledge VR 1.0 at the beginning and delete EREF at the end. Copy the remaining text and paste it into the compilation file. Repeat this for all remaining files except for the last one.
4. Open the last exported file. Only delete FN Thomson Reuters Web of Knowledge VR 1.0 from the beginning. Keep EREF to signify the end of the records. Copy and paste into the compilation file. Save.
5. Rename the compilation file to have the extension .isi. ISI format is the output format from the Web of Science database that contains author, citation, and full abstract information.
6. You are now ready to begin analysis with Sci2!
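The manual compilation in steps 2 through 4 can also be scripted. The sketch below is a minimal illustration, assuming each export batch has already been read into a string and that the start and end markers appear exactly as the caller supplies them; `merge_exports` and its parameters are hypothetical names, and the marker strings in the example are placeholders rather than the literal contents of a Web of Science export.

```python
def merge_exports(chunks, header, footer):
    """Concatenate export batches so the header appears once, at the start,
    and the footer appears once, at the end.

    chunks: list of file contents (strings), in export order.
    header, footer: the start and end markers described in the steps above.
    """
    merged = []
    last = len(chunks) - 1
    for i, text in enumerate(chunks):
        body = text.strip()
        # Every file except the first loses its leading header
        if i > 0 and body.startswith(header):
            body = body[len(header):]
        # Every file except the last loses its trailing footer
        if i < last and body.endswith(footer):
            body = body[:-len(footer)]
        merged.append(body.strip())
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    # Placeholder markers and records, for illustration only
    batches = ["HDR\nrecord 1\nEND", "HDR\nrecord 2\nEND", "HDR\nrecord 3\nEND"]
    print(merge_exports(batches, header="HDR", footer="END"))
```

The result can then be written to a file with the .isi extension, as in step 5.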

Sci2: Creating a Co-citation Network

The Sci2 menu is arranged left to right to follow the workflow. Files can be loaded and then prepared, preprocessed, analyzed, and visualized. The Console window documents operations performed on the data. The Schedule window indicates the progress of your operations and shows what operations have been performed. The Data Manager tab shows the evolution of your data after you have processed it.

1. File > Load.
   Select the .isi file.
   A Load dialogue box will appear. Select ISI flat format.
2. Data Preparation > Extract Directed Network
   Source Column > Cited References
   Target Column > Cite Me As
   Extract
   This creates a directed network by placing a directed edge from the values in a given column to the values of a different column.
3. Data Preparation > Extract Bibliographic Coupling Network
4. Analysis > Networks > Network Analysis Toolkit (NAT)
   This performs a basic analysis on the network, calculating clusters, self-loops, parallel edges, number of nodes, number of edges, and density of the network (shown in the Console window). This allows you to get a feel for the network and find any errors that may be present in the data.
5. Select Bibliographic Coupling Similarity Network in the Data Manager window.
6. Preprocessing > Networks > Extract Edges Above or Below Value
   In the dialogue box, enter 4 in the Extract from this number box. This algorithm gets rid of any edges that are outside of the range you are interested in.
7. With the new edges selected, Preprocessing > Delete Isolates

With "With isolates removed" selected, Visualization > Networks > GUESS
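For readers who want to see what the coupling, thresholding, and isolate-removal steps do to the data, the logic can be sketched outside of Sci2. The Python below is an illustration under stated assumptions, not Sci2's implementation: the function names and the toy reference lists are hypothetical, the weight of an edge is taken to be the number of cited references two papers share, and the threshold is assumed to be inclusive.

```python
from itertools import combinations


def coupling_edges(references):
    """Bibliographic coupling: weight = number of cited references shared.

    references: dict mapping each paper to the set of references it cites.
    Returns {(paper_a, paper_b): shared_count} for pairs sharing at least one.
    """
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            edges[(a, b)] = shared
    return edges


def filter_edges(edges, minimum):
    """Keep edges whose weight is at least `minimum` (assumed inclusive)."""
    return {pair: w for pair, w in edges.items() if w >= minimum}


def connected_nodes(edges):
    """Nodes that still have at least one edge, i.e. isolates removed."""
    return sorted({node for pair in edges for node in pair})


if __name__ == "__main__":
    # Toy data: four papers and the references each one cites
    refs = {
        "P1": {"r1", "r2", "r3"},
        "P2": {"r2", "r3", "r4"},
        "P3": {"r3"},
        "P4": {"r9"},
    }
    strong = filter_edges(coupling_edges(refs), minimum=2)
    print(strong)                   # only the P1-P2 pair shares two references
    print(connected_nodes(strong))  # P3 and P4 drop out as isolates
```

Raising the threshold thins the network to its most strongly related papers, which is what makes the clusters in the final layout easier to read.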

WORKS CITED

Iliinsky, Noah and Steele, Julie. Designing Data Visualizations: Representing Informational Relationships. Sebastopol: O'Reilly Media, 2011. Ebook Library. Web. 12 Jul. 2013.

Lima, Manuel. Visual Complexity: Mapping Patterns of Information. New York: Princeton Architectural Press, 2011.

Tufte, Edward R. Beautiful Evidence. Cheshire: Graphics Press, 2006.

Tufte, Edward R. The Visual Display of Quantitative Information (2nd ed.). Cheshire: Graphics Press, 1983.

Ware, Colin. Information Visualization: Perception for Design. Burlington: Elsevier Science, 2012. Ebook Library. Web. 3 Jul. 2013.

Yau, Nathan. Visualize This: The FlowingData Guide to Design, Visualization, and Statistics. Indianapolis: Wiley Pub., 2011.