6/9/2016

Kernel method - Wikipedia, the free encyclopedia

Kernel method
From Wikipedia, the free encyclopedia

In machine learning, kernel methods are a class of algorithms for pattern analysis, whose best-known
member is the support vector machine (SVM). The general task of pattern analysis is to find and study
general types of relations (for example clusters, rankings, principal components, correlations,
classifications) in datasets. For many algorithms that solve these tasks, the data in raw
representation have to be explicitly transformed into feature vector representations via a
user-specified feature map: in contrast, kernel methods require only a user-specified kernel, i.e.,
a similarity function over pairs of data points in raw representation.

Kernel methods owe their name to the use of kernel functions, which enable them to operate in a
high-dimensional, implicit feature space without ever computing the coordinates of the data in that
space, but rather by simply computing the inner products between the images of all pairs of data in
the feature space. This operation is often computationally cheaper than the explicit computation of
the coordinates. This approach is called the "kernel trick". Kernel functions have been introduced
for sequence data, graphs, text, images, as well as vectors.

Algorithms capable of operating with kernels include the kernel perceptron, support vector machines
(SVM), Gaussian processes, principal components analysis (PCA), canonical correlation analysis,
ridge regression, spectral clustering, linear adaptive filters and many others. Any linear model can
be turned into a non-linear model by applying the kernel trick to the model: replacing its features
(predictors) by a kernel function.
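
As a rough illustration of turning a linear model into a non-linear one, the sketch below kernelizes ridge regression, one of the algorithms listed above. It is a minimal example under stated assumptions, not a reference implementation: it assumes NumPy, a Gaussian (RBF) kernel with an arbitrarily chosen width `gamma`, and the standard dual solution with one coefficient per training example. The names `rbf_kernel`, `fit_kernel_ridge` and `predict_kernel_ridge` are hypothetical and do not come from any library.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix with entries exp(-gamma * ||a - b||^2)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def fit_kernel_ridge(X, y, lam=1e-3, gamma=1.0):
    """Dual ridge solution: alpha = (K + lam * I)^{-1} y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict_kernel_ridge(X_train, alpha, X_new, gamma=1.0):
    """Prediction f(x) = sum_i alpha_i * k(x_i, x)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy data: y = x^2, which a linear function of x cannot fit.
X = np.linspace(-2.0, 2.0, 20).reshape(-1, 1)
y = X[:, 0] ** 2
lam = 1e-3

# Plain (linear) ridge regression on the raw feature, without intercept.
w = np.linalg.solve(X.T @ X + lam * np.eye(1), X.T @ y)
linear_mse = np.mean((X @ w - y) ** 2)

# Kernelized ridge regression on the same data.
alpha = fit_kernel_ridge(X, y, lam=lam, gamma=1.0)
kernel_mse = np.mean((predict_kernel_ridge(X, alpha, X, gamma=1.0) - y) ** 2)

print(f"linear ridge training MSE: {linear_mse:.4f}")
print(f"kernel ridge training MSE: {kernel_mse:.6f}")
```

The only change from the linear model is that the data enter through the kernel matrix rather than through explicit feature vectors; the regularization strength and kernel width are illustrative choices.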

Most kernel algorithms are based on convex optimization or eigenproblems and are statistically
well-founded. Typically, their statistical properties are analyzed using statistical learning theory
(for example, using Rademacher complexity).

Contents
1 Motivation and informal explanation
2 Mathematics: the kernel trick
3 Applications
4 Popular kernels
5 See also
6 Notes
7 References
8 External links

Motivation and informal explanation
Kernel methods can be thought of as instance-based learners: rather than learning some fixed set of
parameters corresponding to the features of their inputs, they instead "remember" the $i$-th
training example $(\mathbf{x}_i, y_i)$ and learn for it a corresponding weight $w_i$. Prediction for
unlabeled inputs, i.e., those not in the training set, is treated by the application of a similarity
function $k$, called a kernel, between the unlabeled input $\mathbf{x}'$ and each of the training
inputs $\mathbf{x}_i$. For instance, a kernelized binary classifier typically computes a weighted
sum of similarities

$$\hat{y} = \operatorname{sgn} \sum_{i=1}^{n} w_i y_i k(\mathbf{x}_i, \mathbf{x}'),$$

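where
$\hat{y} \in \{-1, +1\}$ is the kernelized binary classifier's predicted label for the unlabeled input $\mathbf{x}'$ whose hidden true label $y$ is of interest;
$k \colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is the kernel function that measures similarity between any pair of inputs $\mathbf{x}, \mathbf{x}' \in \mathcal{X}$;
the sum ranges over the $n$ labeled examples $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$ in the classifier's training set, with $y_i \in \{-1, +1\}$;
the $w_i \in \mathbb{R}$ are the weights for the training examples, as determined by the learning algorithm;
the sign function $\operatorname{sgn}$ determines whether the predicted classification $\hat{y}$ comes out positive or negative.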
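
The prediction rule just described, a sign applied to a weighted sum of kernel similarities, can be sketched in Python. This is a minimal illustration rather than a definitive implementation: it assumes NumPy, a Gaussian (RBF) kernel with an arbitrarily chosen width, and weights learned by a kernel perceptron, one of the algorithms named in this article. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x, z, gamma=2.0):
    """Similarity k(x, z) = exp(-gamma * ||x - z||^2); gamma is an arbitrary choice."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def decision_value(X_train, y_train, w, x_new, kernel):
    """Weighted sum of similarities: sum_i w_i * y_i * k(x_i, x')."""
    return sum(w_i * y_i * kernel(x_i, x_new)
               for x_i, y_i, w_i in zip(X_train, y_train, w))

def predict(X_train, y_train, w, x_new, kernel):
    """Predicted label in {-1, +1}; a zero score is mapped to +1 by convention here."""
    return 1 if decision_value(X_train, y_train, w, x_new, kernel) >= 0 else -1

def train_kernel_perceptron(X_train, y_train, kernel, max_epochs=100):
    """Kernel perceptron: w_i counts the mistakes made on training example i."""
    w = np.zeros(len(X_train))
    for _ in range(max_epochs):
        mistakes = 0
        for j, (x_j, y_j) in enumerate(zip(X_train, y_train)):
            # A non-positive margin means example j is currently misclassified.
            if y_j * decision_value(X_train, y_train, w, x_j, kernel) <= 0:
                w[j] += 1.0
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return w

# XOR-style data: not linearly separable in the raw 2-D input space.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_train = np.array([-1, 1, 1, -1])

w = train_kernel_perceptron(X_train, y_train, rbf_kernel)
train_predictions = [predict(X_train, y_train, w, x, rbf_kernel) for x in X_train]
print(train_predictions)
print(predict(X_train, y_train, w, np.array([0.1, 0.9]), rbf_kernel))
```

On this toy set the perceptron makes one mistake per example in its first pass and none in the second, so every weight ends at 1 and all four training points are labeled correctly. Note that the model stores the training examples themselves, not a weight vector over features.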

Kernel classifiers were described as early as the 1960s, with the invention of the kernel
perceptron.[1] They rose to great prominence with the popularity of the support vector machine (SVM)
in the 1990s, when the SVM was found to be competitive with neural networks on tasks such as
handwriting recognition.

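Mathematics: the kernel trick
The kernel trick avoids the explicit mapping that is needed to get linear learning algorithms to
learn a nonlinear function or decision boundary. For all $\mathbf{x}$ and $\mathbf{x}'$ in the input
space $\mathcal{X}$, certain functions $k(\mathbf{x}, \mathbf{x}')$ can be expressed as an inner
product in another space $\mathcal{V}$. The function
$k \colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is often referred to as a kernel or a
kernel function. The word "kernel" is used in mathematics to denote a weighting function for a
weighted sum or integral.

Certain problems in machine learning have more structure than an arbitrary weighting function $k$.
The computation is made much simpler if the kernel can be written in the form of a "feature map"
$\varphi \colon \mathcal{X} \to \mathcal{V}$ which satisfies

$$k(\mathbf{x}, \mathbf{x}') = \langle \varphi(\mathbf{x}), \varphi(\mathbf{x}') \rangle_{\mathcal{V}}.$$
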
The key restriction is that $\langle \cdot, \cdot \rangle_{\mathcal{V}}$ must be a proper inner
product. On the other hand, an explicit representation for $\varphi$ is not necessary, as long as
$\mathcal{V}$ is an inner product space. The alternative follows from Mercer's theorem: an
implicitly defined function $\varphi$ exists whenever the space $\mathcal{X}$ can be equipped with a
suitable measure ensuring the function $k$ satisfies Mercer's condition.
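
To make the relation between a kernel and a feature map concrete, the following sketch checks it numerically for one well-known case. The choice of example is an assumption for illustration: the homogeneous degree-2 polynomial kernel on two-dimensional inputs, whose explicit feature map into three dimensions is standard. The function names are hypothetical and NumPy is assumed.

```python
import numpy as np

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel on R^2: k(x, z) = (x . z)^2."""
    return float(np.dot(x, z)) ** 2

def poly2_feature_map(x):
    """Explicit feature map phi: R^2 -> R^3 whose inner product equals poly2_kernel."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Kernel trick: evaluate the inner product without ever building phi.
implicit = poly2_kernel(x, z)

# Explicit route: map both points into feature space, then take the inner product.
explicit = float(np.dot(poly2_feature_map(x), poly2_feature_map(z)))

print(implicit, explicit)
```

Both routes give the same value up to floating-point rounding. For inputs of dimension d, this explicit map has d(d+1)/2 coordinates, whereas the kernel needs only one d-dimensional dot product, which is the computational saving the kernel trick refers to.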
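
Mercer's theorem is akin to a generalization of the result from linear algebra that associates an
inner product to any positive-definite matrix. In fact, Mercer's condition can be reduced to this
simpler case. If we choose as our measure the counting measure $\mu(T) = |T|$ for all
$T \subset \mathcal{X}$, which counts the number of points inside the set $T$, then the integral in
Mercer's theorem reduces to a summation

$$\sum_{i=1}^{n} \sum_{j=1}^{n} k(\mathbf{x}_i, \mathbf{x}_j)\, c_i c_j \geq 0.$$

If this inequality holds for all finite sequences of points $(\mathbf{x}_1, \dots, \mathbf{x}_n)$ in
$\mathcal{X}$ and all choices of $n$ real-valued coefficients $(c_1, \dots, c_n)$ (cf. positive
definite kernel), then the function $k$ satisfies Mercer's condition.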
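
One direction of this correspondence is a short computation, sketched here as a worked equation. It assumes that the kernel is given by a feature map $\varphi$ into a real inner product space $\mathcal{V}$, that is, $k(\mathbf{x}, \mathbf{x}') = \langle \varphi(\mathbf{x}), \varphi(\mathbf{x}') \rangle_{\mathcal{V}}$; under that assumption the double sum is a squared norm and therefore cannot be negative.

```latex
\sum_{i=1}^{n}\sum_{j=1}^{n} k(\mathbf{x}_i, \mathbf{x}_j)\, c_i c_j
  = \sum_{i=1}^{n}\sum_{j=1}^{n} c_i c_j\,
    \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle_{\mathcal{V}}
  = \Big\langle \sum_{i=1}^{n} c_i \varphi(\mathbf{x}_i),\;
                \sum_{j=1}^{n} c_j \varphi(\mathbf{x}_j) \Big\rangle_{\mathcal{V}}
  = \Big\| \sum_{i=1}^{n} c_i \varphi(\mathbf{x}_i) \Big\|_{\mathcal{V}}^{2}
  \;\geq\; 0 .
```

The second equality uses only the bilinearity of the inner product. The converse direction, that a function satisfying the condition admits some feature map, is the part that requires Mercer's theorem.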

Some algorithms that depend on arbitrary relationships in the native space $\mathcal{X}$ would, in
fact, have a linear interpretation in a different setting: the range space of $\varphi$. The linear
interpretation gives us insight about the algorithm. Furthermore, there is often no need to compute
$\varphi$ directly during computation, as is the case with support vector machines. Some cite this
running time shortcut as the primary benefit. Researchers also use it to justify the meanings and
properties of existing algorithms.
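
Theoretically, a Gram matrix $\mathbf{K} \in \mathbb{R}^{n \times n}$ with respect to
$\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ (sometimes also called a "kernel matrix"[2]), where
$K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$, must be positive semi-definite (PSD).[3] Empirically, for
machine learning heuristics, choices of a function $k$ that do not satisfy Mercer's condition may
still perform reasonably if $k$ at least approximates the intuitive idea of similarity.[4]
Regardless of whether $k$ is a Mercer kernel, $k$ may still be referred to as a "kernel".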
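
The positive semi-definiteness requirement can be checked numerically on a sample of points. The sketch below is illustrative only: it assumes NumPy, uses hypothetical helper names, and picks two example similarity functions that are not taken from the text, a Gaussian kernel (which satisfies Mercer's condition) and the negative squared distance (which does not).

```python
import numpy as np

def gram_matrix(X, kernel):
    """K[i, j] = kernel(x_i, x_j) for all pairs of rows of X."""
    n = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

def is_psd(K, tol=1e-10):
    """True if the symmetric matrix K has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

def rbf(x, z):
    """Gaussian kernel, a Mercer kernel."""
    return np.exp(-np.sum((x - z) ** 2))

def neg_sq_dist(x, z):
    """A similarity score that is not a Mercer kernel: larger when points are closer."""
    return -np.sum((x - z) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))  # six arbitrary points in the plane

print(is_psd(gram_matrix(X, rbf)))
print(is_psd(gram_matrix(X, neg_sq_dist)))
```

The Gram matrix of the negative squared distance has a zero diagonal, so its eigenvalues sum to zero; since the matrix is non-zero for distinct points, at least one eigenvalue must be negative and the check fails. A check on one sample can only refute Mercer's condition, not prove it, because the condition quantifies over all finite point sets.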

If the kernel function $k$ is also a covariance function as used in Gaussian processes, then the
Gram matrix $\mathbf{K}$ can also be called a covariance matrix.[5]
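
Finally, suppose that $\mathbf{K}$ is a real square matrix. Then $\mathbf{K}^{\mathsf{T}} \mathbf{K}$
is a positive semi-definite matrix, since
$\mathbf{c}^{\mathsf{T}} \mathbf{K}^{\mathsf{T}} \mathbf{K} \mathbf{c} = \|\mathbf{K}\mathbf{c}\|^2 \geq 0$
for every real vector $\mathbf{c}$.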

Applications
Application areas of kernel methods are diverse and include geostatistics,[6] kriging, inverse
distance weighting, 3D reconstruction, bioinformatics, chemoinformatics, information extraction and
handwriting recognition.

Popular kernels
Fisher kernel
Graph kernels
Kernel smoother
Polynomial kernel
RBF kernel
String kernels

See also
Kernel methods for vector output

Notes
1. Aizerman, M. A.; Braverman, Emmanuel M.; Rozonoer, L. I. (1964). "Theoretical foundations of the
potential function method in pattern recognition learning". Automation and Remote Control 25:
821-837. Cited in Guyon, Isabelle; Boser, B.; Vapnik, Vladimir (1993). Automatic capacity tuning of
very large VC-dimension classifiers. Advances in Neural Information Processing Systems.
CiteSeerX: 10.1.1.17.7215.
2. Hofmann, Thomas; Schölkopf, Bernhard; Smola, Alexander J. (2008). "Kernel Methods in Machine
Learning".
3. Mohri, Mehryar; Rostamizadeh, Afshin; Talwalkar, Ameet (2012). Foundations of Machine Learning.
The MIT Press. ISBN 9780262018258.
4. http://www.svms.org/mercer/
5. Rasmussen, C. E.; Williams, C. K. I. (2006). "Gaussian Processes for Machine Learning".
6. Honarkhah, M.; Caers, J. (2010). "Stochastic Simulation of Patterns Using Distance-Based Pattern
Modeling". Mathematical Geosciences 42: 487-517. doi:10.1007/s11004-010-9276-7.

References
Shawe-Taylor, J.; Cristianini, N. (2004). Kernel Methods for Pattern Analysis. Cambridge University Press.
Liu, W.; Principe, J.; Haykin, S. (2010). Kernel Adaptive Filtering: A Comprehensive Introduction. Wiley.

External links
Kernel-Machines Org (http://www.kernel-machines.org) - community website
www.support-vector-machines.org (http://www.support-vector-machines.org) (Literature, Review,
Software, Links related to Support Vector Machines - Academic Site)
onlineprediction.net Kernel Methods Article (http://onlineprediction.net/?n=Main.KernelMethods)
Retrieved from "https://en.wikipedia.org/w/index.php?title=Kernel_method&oldid=709375900"
Categories: Kernel methods for machine learning, Geostatistics, Classification algorithms
This page was last modified on 10 March 2016, at 15:42.
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply.
