19/01/2015

Combining Color and Spatial Information for Content-based Image Retrieval

Jing Huang
Ramin Zabih
Computer Science Department
Cornell University
Ithaca, NY 14853

ABSTRACT
Much of the information stored in digital libraries will contain either images or video, which is difficult to search or browse. Automatic methods for searching image collections make wide use of color histograms, because they are robust to large changes in viewpoint, and can be computed trivially. However, color histograms fail to incorporate spatial information, and therefore tend to give poor results. We have developed several methods for combining color information with spatial layout, while retaining the advantages of histograms. One technique computes the distribution of a given color as a function of the distance between two pixels. The resulting method, which we call a color correlogram, has proven to be quite effective even with very coarsely quantized color information. Another method computes joint histograms of local properties, thus dividing pixels into classes based on both color and spatial properties. Experiments with a database of over 200,000 images demonstrate that these measures perform significantly better than color histograms, especially when the number of images is large.

INTRODUCTION
One of the primary challenges in digital libraries is the problem of providing intelligent search mechanisms for multimedia collections. While there are good tools for searching text collections, images are much more difficult. If the images are annotated by hand, a textual search can be used; however, this approach is too labor-intensive to scale up with large digital libraries. Automated methods for searching large image collections are therefore necessary. This in turn requires simple and effective image features for comparing images based on their overall appearance. Color histograms are widely used, for example by [QBIC], [Chabot] and [Photobook]. The histogram is easy to compute and is insensitive to small changes in viewing positions. A histogram is a coarse characterization of an image, however, and images with very different appearances can have similar histograms. For example, the images shown in figure 1 have similar color histograms. When image databases are large, this problem is especially acute.
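
The weakness described above can be illustrated with a minimal Python sketch. The two tiny images below are hypothetical (they are not the images of figure 1), and the pixels are assumed to be already quantized to integer color indices. Because a histogram only counts pixels per color, a checkerboard and an image split into two solid halves come out identical.

```python
from collections import Counter

def color_histogram(image):
    """Count how many pixels fall into each (already quantized) color bucket."""
    return Counter(color for row in image for color in row)

# Two hypothetical 4x4 images over two colors (0 and 1). They contain the
# same number of pixels of each color but have very different layouts.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
halves = [[0 if c < 2 else 1 for c in range(4)] for r in range(4)]

# The histograms ignore layout entirely, so they compare equal.
same_histogram = color_histogram(checkerboard) == color_histogram(halves)
print(same_histogram)
```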

Figure 1: Two images with similar color histograms
http://www.cs.cornell.edu/rdz/Papers/ecdl2/spatial.htm

Since histograms do not include any spatial information, several recent approaches have attempted to incorporate spatial information with color [Hsu, Stricker, Smith]. These methods, however, lose many of the advantages of color histograms. In this paper we describe methods for combining color information with spatial layout while retaining the advantages of histograms. One method computes the spatial correlation of pairs of colors as a function of the distance between pixels. We call this feature a color correlogram (the term "correlogram" is adapted from spatial data analysis [Upton]). Another approach is based on computing joint histograms of several local properties. Joint histograms can be compared as vectors, just as color histograms can. However, in a color histogram any two pixels of the same color are effectively identical. With joint histograms, pixels must share several properties beyond color. We call this approach histogram refinement. The methods we describe are easy to compute, and they produce concise summaries of the image.
We will next describe color correlograms and histogram refinement (for details see [Huang] and [Pass]). We have evaluated these methods using a large database of images, on tasks with a simple, intuitive notion of ground truth. The experimental results that we present show that our methods are significantly more effective than color histograms.

COLOR CORRELOGRAMS
A color correlogram (henceforth correlogram) expresses how the spatial correlation of pairs of colors changes with distance. Informally, a correlogram for an image is a table indexed by color pairs, where the d-th entry for row (i,j) specifies the probability of finding a pixel of color j at a distance d from a pixel of color i in this image. Here d is chosen from a set of distance values D (see [Huang] for the formal definition). An autocorrelogram captures spatial correlation between identical colors only. This information is a subset of the correlogram and consists of rows of the form (i,i) only. An example autocorrelogram is shown in figure 2.
Since local correlations between colors are more significant than global correlations in an image, a small value of d is sufficient to capture the spatial correlation. We have an efficient algorithm to compute the correlogram when d is small. This computation is linear in the image size (see [Huang]).
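
The informal definition above can be sketched in Python as follows. This is a naive illustration, not the efficient algorithm of [Huang]: it visits every neighbor of every pixel. Two assumptions are made here that the text does not state: distance is measured with the L-infinity norm (so the pixels at distance d form a square ring), and pairs that fall outside the image are simply ignored. The function name and the test images are hypothetical.

```python
def autocorrelogram(image, num_colors, distances):
    """Return table[c][k]: the fraction of pixels at distance distances[k]
    from a pixel of color c that also have color c (0.0 if no pairs exist)."""
    height, width = len(image), len(image[0])
    table = [[0.0] * len(distances) for _ in range(num_colors)]
    for k, d in enumerate(distances):
        same = [0] * num_colors   # neighbor pairs whose colors match
        total = [0] * num_colors  # all in-bounds neighbor pairs
        # Assumed metric: offsets at L-infinity distance exactly d,
        # i.e. the border of a (2d+1) x (2d+1) square.
        ring = [(dy, dx) for dy in range(-d, d + 1) for dx in range(-d, d + 1)
                if max(abs(dy), abs(dx)) == d]
        for y in range(height):
            for x in range(width):
                c = image[y][x]
                for dy, dx in ring:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total[c] += 1
                        same[c] += image[ny][nx] == c
        for c in range(num_colors):
            if total[c]:
                table[c][k] = same[c] / total[c]
    return table

# Two hypothetical 4x4 images with identical color histograms.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
halves = [[0 if c < 2 else 1 for c in range(4)] for r in range(4)]

auto_checker = autocorrelogram(checkerboard, 2, [1])
auto_halves = autocorrelogram(halves, 2, [1])
print(auto_checker[0][0], auto_halves[0][0])
```

In this toy case the image made of two solid halves yields a clearly higher autocorrelogram value at d = 1 than the checkerboard, even though both images contain the same number of pixels of each color.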

Figure 2: Two images (Image 1 and Image 2) with their autocorrelograms. Note that the change in spatial layout would be ignored by color histograms, but causes a significant difference in the autocorrelograms.
The highlights of the correlogram method are: (i) it includes the spatial correlation of colors, and (ii) it can be used to describe the global distribution of local spatial correlation of colors if D is chosen to be local (see our experimental data). An additional advantage lies in the ability of our methods to succeed with very coarse color information. As we show in [Huang], our data suggests that 8-color correlograms perform better than 64-color histograms.
Unlike purely local properties, such as pixel position or gradient direction, or purely global properties, such as color distribution, correlograms take into account the local color spatial correlation as well as the global distribution of this spatial correlation. Any scheme based on purely local properties is likely to be sensitive to large appearance changes, whereas (auto)correlograms are more stable under these changes. Any scheme based on purely global properties is susceptible to false positive matches, whereas (auto)correlograms prove to be quite effective for content-based image retrieval from a large image database.

HISTOGRAM REFINEMENT
In histogram refinement the pixels of a given bucket are subdivided into classes based on local features. There are many possible features, including texture, orientation, distance from the nearest edge, relative brightness, etc. If we consider color as a random variable, then a color histogram approximates the variable's distribution. Histogram refinement approximates the joint distribution of a variety of local properties.
Histogram refinement prevents pixels in the same bucket from matching each other if they do not fall into the same class. Pixels in the same class can be compared using any standard method for comparing histogram buckets (such as the L1 distance). This allows fine distinctions that cannot be made with color histograms.
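
As a rough illustration of the comparison step, the Python sketch below computes the L1 distance between two histograms stored as dictionaries. The bucket counts and the "edge" / "interior" class labels are hypothetical values chosen for this example, not data from our experiments. The two images have identical color histograms, so the plain L1 distance is zero, but the refined buckets no longer match.

```python
def l1_distance(hist_a, hist_b):
    """Sum of absolute bucket differences; missing buckets count as zero."""
    keys = set(hist_a) | set(hist_b)
    return sum(abs(hist_a.get(k, 0) - hist_b.get(k, 0)) for k in keys)

# Hypothetical bucket counts for two images of 100 pixels each.
# Plain color histograms: keyed by color only.
color_a = {"red": 60, "blue": 40}
color_b = {"red": 60, "blue": 40}

# Refined histograms: each color bucket is split by a local class
# (an assumed "edge" / "interior" label for this example).
refined_a = {("red", "edge"): 10, ("red", "interior"): 50, ("blue", "interior"): 40}
refined_b = {("red", "edge"): 50, ("red", "interior"): 10, ("blue", "interior"): 40}

plain = l1_distance(color_a, color_b)
refined = l1_distance(refined_a, refined_b)
print(plain, refined)
```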
For example, consider a joint histogram that combines color information with the intensity gradient. A given pixel in an image has a color (in the discretized range 0 ... ncolors - 1) and an intensity gradient (in the discretized range 0 ... ngradient - 1). The joint histogram for color and intensity gradient will contain (ncolors x ngradient) entries. Each entry corresponds to a particular color and a particular intensity gradient. The value stored in this entry is the number of pixels in the image with that color and intensity gradient.
More precisely, given a set of k features, we can construct a joint histogram. A joint histogram is a k-dimensional array (one axis per feature), such that each entry in the joint histogram contains the number of pixels in an image that are described by a k-tuple of feature values. The size of the joint histogram is therefore the number of possible combinations of the values of each feature. Just as a color histogram approximates the density of pixel color, a joint histogram approximates the joint density of several pixel features.
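
A minimal Python sketch of this construction follows. It assumes each feature has already been computed and discretized into its own per-pixel grid. The function name, the 2x3 example image, and the feature values are hypothetical, and a sparse counter keyed by k-tuples stands in for the dense k-dimensional table.

```python
from collections import Counter
from math import prod

def joint_histogram(feature_maps):
    """Count pixels per k-tuple of feature values.

    feature_maps is a list of k equally sized 2-D grids, one per feature,
    each holding that feature's discretized value at every pixel.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return Counter(
        tuple(fmap[y][x] for fmap in feature_maps)
        for y in range(height) for x in range(width)
    )

# Hypothetical 2x3 image with 2 color levels and 3 gradient levels.
color = [[0, 0, 1],
         [1, 1, 0]]
gradient = [[2, 2, 0],
            [0, 1, 2]]

hist = joint_histogram([color, gradient])

# Size of the full joint histogram: the number of possible
# (color, gradient) combinations.
num_entries = prod([2, 3])
print(dict(hist), num_entries)
```

Storing only the occupied entries is a design choice of this sketch; a dense array with one axis per feature would hold the same counts.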
Joint histograms thus increase the dimensionality of the histogram space without changing the capacity of each feature's individual histogram space. This preserves the robustness of each feature, while increasing the capacity of the histogram space.

EXPERIMENTAL RESULTS
For our experiments, we have concentrated on "query by example", where the user specifies an image, and the system attempts to retrieve the most similar images from the database. We have used a large image collection of almost 250,000 images. Our collection contains the databases used by QBIC (1,440 images) and Chabot (11,667), as well as 200,000 frames from CNN taken one minute apart. We have identified by hand 52 pairs of images where there is a unique "right answer" in the database, and used these images as benchmarks. More specifically, these are image pairs where the same scene is shown from two rather different views.
On this database, our methods perform significantly better than color histograms. Some specific examples are given in figures 3 and 4, using both color correlograms and joint histograms.

Color histogram rank: 411; Autocorrelogram rank: 1
Color histogram rank: 310; Autocorrelogram rank: 5
Color histogram rank: 367; Autocorrelogram rank: 1

Figure 3: Example query images and correct answers, and the rank of the correct answer using color histograms or autocorrelograms. Lower numbers indicate better performance.

Color histogram rank: 308; Joint histogram rank: 2
Color histogram rank: 1896; Joint histogram rank: 3
Color histogram rank: 649; Joint histogram rank: 2

Figure 4: Example query images and correct answers, and the rank of the correct answer using color histograms or joint histograms. Lower numbers indicate better performance.
We have also performed a statistical analysis of this data; to save space, we will only present these results for joint histograms (the results for correlograms are quite similar). Most measures used by authors to evaluate retrieval performance, such as precision [Salton], are dependent on the number of images in the database. We believe that a retrieval performance measure should be independent of the number of images. Typically a user is willing to browse a certain number of the retrieval results by hand, similar to text-based search on the web. This number is unlikely to change as the database fluctuates in size, as it is really a measure of human patience. We call this number the scope of the user. A good performance measure should judge the retrieval method within a particular scope.
For the 52 queries, we ask what percent of the 52 answers were found within a particular scope. The percentage of correct answers is called the recall in the information retrieval literature [Salton]. These results are shown in figure 5 for scopes of 1 and 100. Note that joint histograms have a higher recall level at a scope of 1 than color histograms have for a scope of 100. Thus a user who was only willing to look at the top image returned using joint histograms would do better than a user willing to look at the top 100 images returned using color histograms.
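
The recall-within-scope measure described above can be sketched in a few lines of Python. The function name and the list of ranks are hypothetical and are used only to show the arithmetic; they are not the ranks of the 52 benchmark queries.

```python
def recall_at_scope(ranks, scope):
    """Fraction of queries whose correct answer is ranked within the scope."""
    return sum(1 for r in ranks if r <= scope) / len(ranks)

# Hypothetical ranks of the single correct answer for five queries
# (rank 1 means the correct image was returned first).
ranks = [1, 4, 250, 1, 37]

recall_1 = recall_at_scope(ranks, 1)      # correct answer returned first
recall_100 = recall_at_scope(ranks, 100)  # correct answer within the top 100
print(recall_1, recall_100)
```

Because each query has exactly one correct answer, the measure depends only on the rank of that answer and on the scope, not on how many images the database contains.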

Algorithm          Recall at scope 1   Recall at scope 100
Color histograms   2%                  40%
Joint histograms   60%                 94%

Figure 5: Scope versus recall results. Higher numbers indicate better performance.

REFERENCES
[Chabot95] Virginia Ogle and Michael Stonebraker. Chabot: Retrieval from a relational database of images. IEEE Computer, 28(9):40-48, September 1995.
[Hsu95] Wynne Hsu, T. S. Chua, and H. K. Pung. An integrated color-spatial approach to content-based image retrieval. In ACM Multimedia Conference, pages 305-313, 1995.
[Huang97] Jing Huang, S. Ravi Kumar, Mandar Mitra, Wei-Jing Zhu, and Ramin Zabih. Image indexing using color correlograms. In IEEE Conference on Computer Vision and Pattern Recognition, pages 762-768, 1997.
[Huang98] Jing Huang, S. Ravi Kumar, and Ramin Zabih. An automatic hierarchical image classification scheme. In ACM Multimedia Conference, pages 219-228, 1998.
[Pass98] Greg Pass and Ramin Zabih. Comparing images using joint histograms. Journal of Multimedia Systems, 1998 (to appear).
[Photobook96] Alex Pentland, Rosalind Picard, and Stan Sclaroff. Photobook: Content-based manipulation of image databases. International Journal of Computer Vision, 18(3):233-254, June 1996.
[QBIC95] Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, and Peter Yanker. Query by image and video content: The QBIC system. IEEE Computer, 28(9):23-32, September 1995.
[Salton89] Gerard Salton. Automatic Text Processing. Addison-Wesley, 1989.
[Smith96] J. R. Smith and S. F. Chang. VisualSEEK: A fully automated content-based image query system. In ACM Multimedia Conference, pages 87-98, November 1996.
[Stricker96] Markus Stricker and Alexander Dimai. Color indexing with weak spatial constraints. SPIE proceedings, 2670:29-40, February 1996.
[Upton85] Graham J. G. Upton and Bernard Fingleton. Spatial Data Analysis by Example, volume I. John Wiley & Sons, 1985.
