Index Terms: Correlation, wavelet transform, ReliefF, factor analysis, historical documents.
1 INTRODUCTION
The processing of historical documents is a chain of steps that starts with digitization and continues with several stages, essentially pre-processing, segmentation, analysis and recognition. Each step raises several problems, each with its own degree of difficulty [14, 16, 26, 27]. In our research, we address the characterization of images of historical documents using texture analysis. This phase is crucial for several applications such as physical and logical segmentation, Optical Character Recognition, indexing and content-based image retrieval. Indeed, it extracts information that describes the document without prior knowledge of its semantics or structural content. The content of a document image can thus be viewed as a set of different textures: text, background, graphics, title, etc. Texture characterization methods can be divided into four families [21].
A. Kricha is with the National Engineering School of Monastir, University of Monastir, Tunisia, UR: SAGE (Systèmes Avancés en Génie Électrique). N. Essoukri Ben Amara is with the National Engineering School of Sousse, University of Sousse, Tunisia, UR: SAGE (Systèmes Avancés en Génie Électrique).
JOURNAL OF COMPUTING, VOLUME 3, ISSUE 10, OCTOBER 2011, ISSN 2151-9617 HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING WWW.JOURNALOFCOMPUTING.ORG
Fig.1 Images having the same first-order features extracted from the wavelet transform (panels a, b and c).
On the other hand, the majority of features used in the literature are extracted from each sub-band separately, which ignores the correlation that exists between sub-bands at the same level of decomposition. Indeed, different studies show the presence of a relationship between sub-bands of the same level [12]. This relationship has proven essential for the reconstruction of textures and especially for their characterization [5]. To remedy these shortcomings and exploit the dependency between sub-bands of the same level of decomposition, while taking into account the spatial location of frequencies, we decided to exploit the correlation between the approximation and detail sub-bands and the autocorrelation of each sub-band.
2.1 Correlation
Recall that the correlation of two images measures their mutual dependence; the autocorrelation of an image measures its internal dependencies, e.g. a strongly regular image will have a high autocorrelation. The autocorrelation of each sub-band gives an idea of the different patterns present in the analysis window. As there is some dependency between the different sub-bands, we decided to exploit the correlation between the approximation and detail sub-bands at each level of decomposition.
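As an illustration of the correlation and autocorrelation matrices discussed in this section, the following minimal sketch computes a full two-dimensional cross-correlation directly from its spatial definition. It is not the authors' implementation: the absence of normalization and mean removal, and the names `cross_correlation`, `autocorrelation` and `stripes`, are assumptions made for the example.

```python
def cross_correlation(a, b):
    """Full 2-D cross-correlation of two equally sized matrices (lists of rows).

    Assumption: plain sum of products over the overlapping area, with no
    normalization and no mean removal.
    """
    rows, cols = len(a), len(a[0])
    out = []
    for dy in range(-(rows - 1), rows):
        line = []
        for dx in range(-(cols - 1), cols):
            s = 0.0
            for y in range(rows):
                for x in range(cols):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        s += a[y][x] * b[yy][xx]
            line.append(s)
        out.append(line)
    return out


def autocorrelation(a):
    """Autocorrelation is the cross-correlation of a matrix with itself."""
    return cross_correlation(a, a)


# A strongly regular pattern: vertical stripes with a period of two pixels.
stripes = [[1, 0, 1, 0] for _ in range(4)]
ac = autocorrelation(stripes)
center = len(stripes) - 1  # index of the zero-shift term in each direction
```

For this striped pattern the autocorrelation is maximal at zero shift, vanishes for a horizontal shift of one pixel and rises again for a shift of two pixels, which is the kind of regularity the autocorrelation is meant to reveal.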
For each window $j = 1..4$ and each level of decomposition $i = 1..3$, we compute the matrices $X^{k}_{i,j}$, $k = 1..7$:
$X^{1}_{i,j} = (AH)_{i,j}$: correlation between the approximation and the horizontal details.
$X^{2}_{i,j} = (AV)_{i,j}$: correlation between the approximation and the vertical details.
$X^{3}_{i,j} = (AD)_{i,j}$: correlation between the approximation and the diagonal details.
$X^{4}_{i,j} = (AA)_{i,j}$: autocorrelation of the approximation.
$X^{5}_{i,j} = (HH)_{i,j}$: autocorrelation of the horizontal details.
$X^{6}_{i,j} = (VV)_{i,j}$: autocorrelation of the vertical details.
$X^{7}_{i,j} = (DD)_{i,j}$: autocorrelation of the diagonal details.
Thus, each block is associated with four windows, each window with 3 decomposition levels and each level with 7 matrices; 4 features are extracted from each matrix, which gives 4 x 3 x 7 x 4 = 336 features in total. Figure 3 illustrates the proposed features and the methodology adopted to retain the most discriminating primitives. In what follows, we call main features the indices corresponding to one analysis window at a single level of decomposition, i.e. 7 x 4 = 28 features.
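The feature count can be checked with a short sketch. The four statistics are taken here to be the mean, the standard deviation and the moments of order 3 and 4 mentioned in the relevance analysis; treating the moments as central (non-standardized) moments is an assumption, as is the helper name `four_statistics`.

```python
import math


def four_statistics(matrix):
    """Mean, standard deviation and central moments of order 3 and 4.

    Assumption: the paper does not state whether its moments are central
    or standardized; central moments are used here for illustration.
    """
    values = [v for row in matrix for v in row]
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return mean, math.sqrt(m2), m3, m4


N_WINDOWS, N_LEVELS, N_MATRICES, N_STATS = 4, 3, 7, 4
total_features = N_WINDOWS * N_LEVELS * N_MATRICES * N_STATS  # 336
main_features = N_MATRICES * N_STATS                          # 28

stats = four_statistics([[1.0, 2.0], [3.0, 4.0]])
```

In a full pipeline, `four_statistics` would be applied to each of the seven correlation matrices of every window and level.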
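Fig.3 Scheme of the proposed approach. The diagram shows the following stages: wavelet transform of each window $j = 1..4$ into the sub-bands $A_i$, $H_i$, $V_i$, $D_i$ ($i = 1..3$), features extraction, features selection, factor analysis, labels and retained features. The extracted matrices $(X^{k}_{i,j})$, $i = 1..3$, $j = 1..4$, $k = 1..7$, are:
$$X^{1}_{i,j} = \mathrm{Cor}(A_{i,j}, H_{i,j}), \quad X^{2}_{i,j} = \mathrm{Cor}(A_{i,j}, V_{i,j}), \quad X^{3}_{i,j} = \mathrm{Cor}(A_{i,j}, D_{i,j}), \quad X^{4}_{i,j} = \mathrm{Cor}(A_{i,j}, A_{i,j}),$$
$$X^{5}_{i,j} = \mathrm{Cor}(H_{i,j}, H_{i,j}), \quad X^{6}_{i,j} = \mathrm{Cor}(V_{i,j}, V_{i,j}), \quad X^{7}_{i,j} = \mathrm{Cor}(D_{i,j}, D_{i,j}).$$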
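The sub-bands A, H, V and D that feed the scheme can be obtained with any two-dimensional wavelet decomposition. The sketch below uses the Haar wavelet purely as an assumption, since the wavelet family is not specified here; the function names and the convention that H captures differences between rows and V differences between columns are also illustrative choices.

```python
def haar_level(img):
    """One level of an orthonormal 2-D Haar decomposition.

    `img` is a list of rows with even dimensions. Returns (A, H, V, D).
    Assumption: H holds row differences, V column differences.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    A = [[0.0] * cols for _ in range(rows)]
    H = [[0.0] * cols for _ in range(rows)]
    V = [[0.0] * cols for _ in range(rows)]
    D = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            a, b = img[2 * y][2 * x], img[2 * y][2 * x + 1]
            c, d = img[2 * y + 1][2 * x], img[2 * y + 1][2 * x + 1]
            A[y][x] = (a + b + c + d) / 2.0
            H[y][x] = (a + b - c - d) / 2.0
            V[y][x] = (a - b + c - d) / 2.0
            D[y][x] = (a - b - c + d) / 2.0
    return A, H, V, D


def haar_pyramid(img, levels=3):
    """Repeatedly decompose the approximation; returns one (A, H, V, D) per level."""
    bands = []
    approx = img
    for _ in range(levels):
        approx, h, v, d = haar_level(approx)
        bands.append((approx, h, v, d))
    return bands


# A constant 8x8 window: all the energy stays in the approximation sub-band.
window = [[1.0] * 8 for _ in range(8)]
bands = haar_pyramid(window, levels=3)
```

With three levels, a 16x16 window (the smallest one used in this work) still leaves 2x2 sub-bands at the last level.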
3.1
In the literature there are two types of feature selection algorithms: supervised and unsupervised [1, 3, 8]. Since some features may adversely affect the results, we chose a supervised selection method. The principle is to select the subset of features that best discriminates the different classes of data. We chose the ReliefF algorithm [10]. This algorithm does not simply eliminate redundancy but defines a criterion of relevance, which measures the ability of each feature to group data with the same label and to discriminate data with different labels. The weight of a feature is larger when data from the same class have similar values and data from different classes are well separated. Figure 4 illustrates the relevance of the 28 main features in every window and for all levels of decomposition, for an image from our test database.
Fig.4 Relevance (pertinence, from 0 to 100) of the 28 main features at decomposition levels 1, 2 and 3, for the 16x16, 32x32, 64x64 and 128x128 windows.
The analysis of the degrees of relevance of the different features shows that the moments of order 3 and of order 4 are not discriminating for any analysis window at any level of decomposition. The mean and standard deviation of the autocorrelation of the approximation have a high degree of relevance in all the windows and at all levels of decomposition. We can also notice in Figure 4 that the features keep almost the same relevance in all the windows and at each level of decomposition, which shows the invariance and robustness of the proposed features. After this step, we keep only 168 primitives.
3.2
Once the irrelevant features are eliminated, we study the dependence between the remaining features. To this end we conduct a factor analysis of the features using the maximum likelihood estimator [4]. Figure 5 shows that, for a given window, the averages of (AH) are correlated across all levels of decomposition. However, the averages of (AH) are not correlated between different windows. Based on these results (generalized to all the features and applied to multiple images), we retain only one level of decomposition. Following this study, we keep only 56 features.
Fig.5 Factor analysis of the averages of (AH) in the plane (Component 1, Component 2), with $C_j = \mathrm{moy}(AH, j, 16 \times 16)$ for $j = 1..3$ and $C_j = \mathrm{moy}(AH, j, 128 \times 128)$ for $j = 4..6$.
After this study, we consider the following features: the standard deviation of $X^{k}_{i,j}$ and the average of $X^{k}_{i,j}$, for $i = 1$, $j = 1..4$, $k = 1..7$, with: $X^{1}_{i,j} = \mathrm{corr}(A_{i,j}, H_{i,j})$, $X^{2}_{i,j} = \mathrm{corr}(A_{i,j}, V_{i,j})$, $X^{3}_{i,j} = \mathrm{corr}(A_{i,j}, D_{i,j})$, $X^{4}_{i,j} = \mathrm{corr}(A_{i,j}, A_{i,j})$, $X^{5}_{i,j} = \mathrm{corr}(H_{i,j}, H_{i,j})$, $X^{6}_{i,j} = \mathrm{corr}(V_{i,j}, V_{i,j})$, $X^{7}_{i,j} = \mathrm{corr}(D_{i,j}, D_{i,j})$, where $i$ is the level of decomposition and $j = 1, 2, 3, 4$ corresponds to the 16x16, 32x32, 64x64 and 128x128 windows.
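The supervised relevance criterion used for feature selection (ReliefF [10]) can be illustrated with the basic Relief scheme, which ReliefF generalizes. The sketch below is a deliberate simplification and an assumption of this example: it uses a single nearest hit and a single nearest miss per sample, whereas ReliefF averages over several neighbours per class. The data and the name `relief_weights` are hypothetical.

```python
def relief_weights(samples, labels):
    """Basic Relief feature weights (simplified relative to ReliefF).

    A weight grows when the nearest sample of another class (miss) differs
    on that feature and shrinks when the nearest sample of the same class
    (hit) differs on it.
    """
    n, m = len(samples), len(samples[0])
    # Feature ranges, used to scale every difference to [0, 1].
    spans = []
    for f in range(m):
        column = [s[f] for s in samples]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(m))

    weights = [0.0] * m
    for i, (x, y) in enumerate(zip(samples, labels)):
        hits = [s for j, (s, l) in enumerate(zip(samples, labels)) if l == y and j != i]
        misses = [s for s, l in zip(samples, labels) if l != y]
        hit = min(hits, key=lambda s: dist(x, s))
        miss = min(misses, key=lambda s: dist(x, s))
        for f in range(m):
            weights[f] += (diff(f, x, miss) - diff(f, x, hit)) / n
    return weights


# Hypothetical data: feature 0 separates the classes, feature 1 does not.
samples = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
           [0.9, 0.2], [1.0, 0.8], [0.8, 0.5]]
labels = ["text", "text", "text", "graphic", "graphic", "graphic"]
weights = relief_weights(samples, labels)
```

On this toy data the discriminating feature receives a positive weight and the uninformative one a negative weight, which mirrors how low-relevance features such as the higher-order moments can be discarded.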
The characterization step allowed us to select the discriminating primitives that separate text from graphics. A classification applied in the feature space then finds the different classes present in a document. Two types of classifiers can be used: supervised and unsupervised. In our work, we chose an unsupervised classification of the k-means type. We applied it to about twenty historical documents from the National Library of Tunisia, in order to separate text, background and image (Figure 6). Although the classifier used is simple and unsupervised, it separated the three major classes (text, background, image) present in the studied documents, which supports the relevance of the retained features.
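A k-means classification of feature vectors into three classes can be sketched as follows. This is a generic illustration under stated assumptions rather than the authors' code: the farthest-point initialization is chosen only to keep the example deterministic, and the two-dimensional vectors in `blocks` are hypothetical stand-ins for the retained features of each block.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain k-means; returns (cluster index per point, cluster centers)."""
    # Assumption: deterministic farthest-point initialization.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                          for p in points]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers


# Hypothetical feature vectors for nine blocks forming three groups.
blocks = [[0.1, 0.1], [0.2, 0.0], [0.0, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
          [9.9, 0.1], [10.0, 0.0], [10.1, 0.2]]
block_labels, centers = kmeans(blocks, k=3)
```

The cluster indices carry no semantic meaning by themselves; mapping them to text, background and image requires inspecting the resulting regions.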
Fig.6 Original images and classification results.
In order to evaluate our approach on other types of images, we used the retained features to separate textures from the Brodatz database; the results are illustrated in Figure 7.
Fig.7 Texture separation using selected features.
We also verified the relevance of our features for separating different kinds of text. Figure 8 shows the result of the segmentation of a document with mixed text including Latin, Arabic and Hebrew characters.
Fig.8 Performance of the selected features for the separation of Arabic, Latin and Hebrew texts (a. Document; b. Result of classification).
Fig.9 Robustness of the selected features for the separation of Arabic, Latin and Hebrew texts in the presence of Gaussian noise (a. Document with Gaussian noise; b. Result of classification).
Fig.10 Robustness of the selected features for the separation of Arabic, Latin and Hebrew texts in the presence of salt and pepper noise (b. Result of classification).
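The noise robustness experiments of Figures 9 and 10 rely on degrading the document image before classification. The sketch below shows one common way to generate Gaussian and salt and pepper noise on a grey-level image; the noise levels (`sigma`, `density`), the 0 to 255 grey range and the function names are assumptions of this example, as the parameters used in the experiments are not given.

```python
import random


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip to the assumed [0, 255] range."""
    return [[min(255.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in img]


def add_salt_and_pepper(img, density, rng):
    """Replace a fraction `density` of pixels by black (0) or white (255)."""
    noisy = []
    for row in img:
        new_row = []
        for v in row:
            r = rng.random()
            if r < density / 2:
                new_row.append(0.0)      # pepper
            elif r < density:
                new_row.append(255.0)    # salt
            else:
                new_row.append(v)        # pixel left untouched
        noisy.append(new_row)
    return noisy


rng = random.Random(0)  # fixed seed so that the example is repeatable
page = [[128.0] * 16 for _ in range(16)]  # hypothetical uniform grey image
gaussian_page = add_gaussian_noise(page, sigma=20.0, rng=rng)
sp_page = add_salt_and_pepper(page, density=0.05, rng=rng)
```

The noisy images would then go through the same feature extraction and classification chain as the clean document, so that the two segmentations can be compared.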
Figure 8 confirms the good discriminatory power of the proposed features. Indeed, by considering each alphabet as a texture, we could separate the Arabic, Latin and Hebrew texts. Figures 9 and 10 show the robustness of our features to noise. After adding noise (Gaussian and salt and pepper), the classification results remain virtually unchanged.
5 CONCLUSION
In this work, we focused mainly on the characterization of images of historical documents with a view to physical segmentation. First, we proposed features derived from the wavelet transform that make the most of the properties of this technique. Then we studied the relevance and the dependence of the features through the ReliefF algorithm and factor analysis, respectively. A classification stage then finds the different components of a document image using a simple classifier such as k-means. To eliminate classification noise, we proposed a post-processing stage based on operators from mathematical morphology. The proposed method was applied to about twenty images of historical documents from the National Library of Tunisia and the results are considered encouraging. We are currently studying the relevance of these features in other applications, such as font identification in a multi-font OCR context, writer identification and the separation of multi-alphabet text.
ACKNOWLEDGMENT
We would like to thank the National Library of Tunisia for providing images of historical documents.
REFERENCES
[1] A. L. Blum, P. Langley, "Selection of relevant features and examples in machine learning", Artificial Intelligence, special issue on relevance, vol. 97, pp. 245-271, 1997.
[2] P. Gupta, N. Vohra, S. Chaudhury, S. Joshi, "Wavelet based page segmentation", Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), pp. 51-56, 2000.
[3] I. Guyon, A. Elisseeff, "An introduction to feature and variable selection", Journal of Machine Learning Research, vol. 3, pp. 1157-1182, 2003.
[4] H. H. Harman, Modern Factor Analysis, 3rd Edition, University of Chicago Press, Chicago, 1976.
[5] P. S. Hiremath, S. Shivashankar, "Wavelet based co-occurrence histogram features for texture classification with an application to script identification in a document image", Pattern Recognition Letters, vol. 29, issue 9, pp. 1182-1189, 2008.
[6] J. Li, J. Z. Wang, G. Wiederhold, "Classification of textured and non-textured images using region segmentation", International Conference on Image Processing, pp. 754-757, September 2000.
[7] N. Journet, "Analyse d'images de documents anciens : une approche texture", Thèse de doctorat, Université de La Rochelle, 2006.
[8] R. Kohavi, G. H. John, "Wrappers for feature subset selection", Artificial Intelligence, special issue on relevance, vol. 97, issue 1-2, pp. 273-324, December 1997.
[9] H. Li, D. Doermann, O. Kia, "Automatic Text Detection and Tracking in Digital Video", IEEE Transactions on Image Processing, vol. 9, issue 1, pp. 147-156, January 2000.
[10] M. Robnik-Sikonja, I. Kononenko, "Theoretical and Empirical Analysis of ReliefF and RReliefF", Machine Learning, vol. 53, issue 1-2, pp. 23-69, October-November 2003.
[11] M. Tuceryan, A. K. Jain, "Texture Analysis", The Handbook of Pattern Recognition and Computer Vision (2nd Edition), C. H. Chen, L. F. Pau, P. S. P. Wang (eds.), World Scientific Publishing Co., pp. 207-248, 1998.
[12] J. Portilla, E. P. Simoncelli, "A parametric texture model based on joint statistics of complex wavelet coefficients", International Journal of Computer Vision, vol. 40, issue 1, pp. 49-70, 2000.
[13] H. Sahbani Mahersia, K. Hamrouni, "Segmentation d'images texturées par transformée en ondelettes et classification C-moyenne floue", International Conference: Sciences of Electronic, Technologies of Information and Telecommunications (SETIT), March 2005.
[14] M. Kricha, "Contribution à l'indexation des documents anciens", Mastère en Systèmes Intelligents et Communicants, École Nationale d'Ingénieurs de Sousse, 2011.
[15] Y. Liu, "Texture segmentation based on features in wavelet domain for image retrieval", Visual Communications and Image Processing, Lugano, Switzerland, vol. 5150, issue 3, pp. 2026-2034, July 2003.
[16] A. Ghardallou Lasmar, "Prétraitement des documents anciens arabes par ondelettes", Mastère, Faculté des Sciences de Monastir, 2005-2006.
[17] F. W. Campbell, J. G. Robson, "Application of Fourier Analysis to the Visibility of Gratings", Journal of Physiology, pp. 551-566, 1968.
[18] W. Chan, G. Coghill, "Text analysis using local energy", Pattern Recognition, vol. 34, issue 12, pp. 2523-2532, December 2001.
[19] S. S. Raju, P. B. Pati, A. G. Ramakrishnan, "Text localization and extraction from complex color images", ISVC, vol. 380, pp. 486-493, 2005.
[20] J. Li, R. M. Gray, "Context-based multiscale classification of document images using wavelet coefficient distributions", IEEE Transactions on Image Processing, vol. 9, pp. 1604-1616, September 2000.
[21] M. Tuceryan, A. K. Jain, "Texture analysis", The Handbook of Pattern Recognition and Computer Vision (2nd Edition), pp. 207-248, 1998.
[22] B. Julesz, "Textons, the elements of texture perception, and their interaction", Nature, no. 290, pp. 91-97, 12 March 1981.
[23] Y. Dong, J. Ma, "Wavelet-Based Image Texture Classification Using Local Energy Histograms", IEEE Signal Processing Letters, vol. 18, no. 4, pp. 247-250, April 2011.
[24] M. R. Islam, Y. C. Wang, A. Khatun, "Partial iris image recognition using wavelet based texture features", International Conference on Intelligent and Advanced Systems (ICIAS), pp. 1-6, 15-17 June 2010.
[25] L. Xavier, B. M. I. Thusnavis, D. R. W. Newton, "Content based image retrieval using textural features based on pyramid-structure wavelet transform", International Conference on Electronics Computer Technology (ICECT), pp. 79-83, 2011.
[26] S. S. El-etriby, K. M. Amin, "Detection and correction of deformed historical arabic manuscripts", International Conference on Computer and Communication Engineering (ICCCE), 11-12 May 2010.
[27] A. Kricha, A. Ghardallou Lasmar, N. Essoukri Ben Amara, "Exploration des ondelettes en prétraitement des documents anciens", Colloque International Francophone sur l'Ecrit et le Document (CIFED), Fribourg, Switzerland, 18-21 September 2006.
Anis Kricha is a PhD student at the Department of Electrical Engineering of the National Engineering School of Tunis, University El Manar, Tunisia. He received his Electrical Engineer diploma from the National Engineering School of Tunis, University El Manar, Tunisia. Since 2006, he has been working as an assistant professor in the Department of Electrical Engineering of the National Engineering School of Monastir, University of Monastir, Tunisia. His research interests are in the areas of image processing.
Najoua Essoukri Ben Amara received the B.Sc., M.S., Ph.D. and HDR degrees in Electrical Engineering, Signal Processing, System Analysis and Pattern Recognition from the National Engineering School of Tunis, University El Manar, Tunisia, in 1985, 1986, 1999 and 2004 respectively. From 1985 to 1989, she was a researcher at the Regional Institute of Informatics Sciences and Telecommunications, Tunis, Tunisia. In September 1989, she joined the Electrical Engineering Department of the National Engineering School of Monastir, University of Monastir, Tunisia, as an assistant professor. She became a senior lecturer in July 2004 and a Professor in October 2009 in Electrical Engineering at the National School of Engineers of Sousse (ENISo), University of Sousse, Tunisia. Between July 2008 and July 2011, she was the Director of the ENISo. Her research interests include mainly pattern recognition applied to Arabic documents, ancient image processing, compression, watermarking, segmentation, biometrics and the use of stochastic models and hybrid approaches in the above domains. She is the head of the research unit SAGE (Systèmes Avancés en Génie Électrique).