Support Vector Machine
Hoda Alhameidy and Nafla AlDarei
College of Information Technology, UAEU
Email: [20012554; 200812303]@uaeu.ac.ae
1. Introduction
It is common for children to mix up letters such as b and d,
p and q, m and n, and other letters. The principle that
children learn letters by memorizing their shapes has turned
out to be inaccurate. A letter is mostly recognized by its
sequence of features, not as a whole shape. Therefore, the
best way to teach children how to write letters is through a
sequence of features that reflects how the letters are
printed. It is highly desirable to develop a method that lets
children write English letters while the system automatically
checks the correctness and readability of each letter. To
this end, several systems were developed in the past for
recognizing letters. One widely used technique that has
recognized characters successfully is the chain code [1]. The
main advantage of the chain code over the traditional
representation of a binary object is that it is a complete
representation of an object or curve (a letter in this case).
This means that any shape feature can be computed from the
chain code.
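To illustrate how shape features can be derived from a chain code alone, the following minimal sketch computes the length of a coded curve and the size of its bounding box. The direction numbering (0 = east, increasing counter-clockwise, y pointing upward) and the function names `path_length` and `bounding_box` are assumptions made for illustration; they are not taken from the paper.

```python
import math

# Assumed 8-direction numbering: 0 = east, then counter-clockwise in
# 45-degree steps, with y increasing upward (illustrative convention).
STEPS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def path_length(codes):
    """Length of the coded curve: even codes are unit steps, odd codes are diagonals."""
    return sum(1.0 if c % 2 == 0 else math.sqrt(2.0) for c in codes)


def bounding_box(codes):
    """Width and height of the curve, tracing it from a start point at (0, 0)."""
    x = y = 0
    xs, ys = [0], [0]
    for c in codes:
        dx, dy = STEPS[c]
        x += dx
        y += dy
        xs.append(x)
        ys.append(y)
    return max(xs) - min(xs), max(ys) - min(ys)


# Hypothetical chain code: two steps east, one diagonal, two steps north.
codes = [0, 0, 1, 2, 2]
print(path_length(codes))
print(bounding_box(codes))
```

Both features are obtained without access to the original image, which is the sense in which the chain code is a complete representation of the curve.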
2. Method
An overview of the method is shown in Figure 1.
Figure 1: An overview of the proposed method for
recognizing English letters.
The method starts by extracting the features of each letter
using chain codes. A chain code represents a boundary by a
connected sequence of straight line segments of specified
length and direction. The representation here is based on the
8-connectivity of the segments [2]. The direction of each
segment is coded using the numbering scheme shown in
Figure 2. Chain codes based on this scheme are known as
Freeman chain codes.
Figure 2: The direction representation.
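As a minimal sketch of the encoding step, the code below turns an ordered list of 8-connected boundary points into a Freeman chain code. It assumes the common numbering in which 0 points east and the codes increase counter-clockwise in 45-degree steps, with y increasing upward; the scheme drawn in Figure 2 may differ. The name `freeman_chain_code` is hypothetical.

```python
# Assumed numbering: 0 = east, increasing counter-clockwise in 45-degree
# steps, with (x, y) coordinates and y increasing upward. This is a common
# convention, not necessarily the one shown in Figure 2.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def freeman_chain_code(points):
    """Return the 8-direction chain code of an ordered path of 8-connected points."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not 8-connected neighbours")
        codes.append(DIRECTIONS[step])
    return codes


# A unit square traced counter-clockwise from the origin.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(freeman_chain_code(square))  # [0, 2, 4, 6]
```

Each code depends only on the step between consecutive points, so the resulting sequence does not change when the letter is translated.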
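3. Experimental work
Precision, recall, and accuracy are defined as
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad \text{Accuracy} = \frac{TP + TN}{n}$$
where TP, FP, TN, and FN are the numbers of true positives,
false positives, true negatives, and false negatives, and n
in this case is the total number of samples. The receiver
operating characteristic (ROC) was also calculated to show
the tradeoff between the hit rate and the false alarm rate
over a noisy channel. Several machine learning techniques
were compared, as shown in Figure 3. SVM (Precision = 0.33,
Recall = 0.25, ROC = 0.84) clearly outperforms several other
machine learning techniques, such as the decision tree (J48)
and the neural network (1NN and 10NN). However, the
performance of the Naive Bayes technique is equivalent to
that of SVM.
Figure 3: Comparison of machine learning methods (precision,
accuracy, recall, and ROC, on a scale from 0 to 1, for each
machine learning method).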
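The precision, recall, and accuracy values reported in this section follow the usual definitions in terms of true positives (TP), false positives (FP), and false negatives (FN). The sketch below shows one way to compute them for a multi-class task by treating one letter at a time as the positive class. The function names and the sample labels are made up for illustration and are not the data used in the experiments.

```python
def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    # Return 0.0 when a ratio is undefined (no predictions or no samples of the class).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Fraction of the n samples whose predicted label equals the true label."""
    n = len(y_true)
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / n


# Made-up labels for illustration only.
y_true = ["b", "b", "d", "d", "p", "q"]
y_pred = ["b", "d", "d", "d", "p", "p"]
print(precision_recall(y_true, y_pred, "d"))
print(accuracy(y_true, y_pred))
```

Averaging the per-letter precision and recall over all 26 classes is one common way to summarize a multi-class classifier with a single pair of numbers; whether the reported figures were averaged this way is not stated in the text.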
4. Conclusion
In this paper, a method for recognizing English letters was
proposed and evaluated. The method has a direct application
in teaching children letters and proper handwriting. The
overall accuracy is considered good enough for a multi-class
classification task with 26 different classes.
References
[1] Jähne, B. (2005). Digital Image Processing. 6th Ed. New
York: Springer.
[2] Gonzalez, R. C. and Woods, R. E. (2002). Digital Image
Processing. 2nd Ed. Upper Saddle River, N.J.: Prentice
Hall, Inc.
[3] Nazar Zaki, Safaai Deris and Chin K. K. (2003). A
comparison of quadratic programming solvers in support
vector machines training. Jurnal Teknologi. Vol. 39, pp:
45-56.
[4] Nazar Zaki, Safaai Deris and Rosli Illias (2005).
Application of string kernels in protein sequence
classification. Applied Bioinformatics. Vol. 4, pp: 45-52.
[5] Nazar Zaki, Sanja Lazarova-Molnar, Wassim El-Hajj and
Piers Campbell (2009). Protein-protein interaction based
on pairwise similarity. BMC Bioinformatics. 10:150.
[6] Nazar Zaki, Stefan Wolfsheimer, Gregory Nuel and
Sawsan Khuri (2011). Conotoxin protein classification
using free scores of words and support vector machines.
BMC Bioinformatics. 12:217.