
Li et al. / J Zhejiang Univ SCI 2004 5(11):1392-1397

Journal of Zhejiang University SCIENCE


ISSN 1009-3095
http://www.zju.edu.cn/jzus
E-mail: jzus@zju.edu.cn

An approach to offline handwritten Chinese character recognition based on segment evaluation of adaptive duration*

LI Guo-hong†, SHI Peng-fei


(Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai 200030, China)
†E-mail: lgh0929@sjtu.edu.cn
Received May 7, 2003; revision accepted Aug. 18, 2003

Abstract: This paper presents a methodology for off-line handwritten Chinese character recognition based on merging consecutive segments of adaptive duration. The handwritten Chinese character string is partitioned into a sequence of consecutive segments, which are combined for dissimilarity evaluation within a sliding window whose duration is determined adaptively by integrating shape information with the context of the evaluations. The average stroke width is estimated for the handwritten Chinese character string, and a set of candidate character segmentation boundaries is found by integrating pixel and stroke features. The final decisions on segmentation and recognition are made under minimal arithmetical mean dissimilarity. Experiments showed that the proposed approach of adaptive duration outperforms the method of fixed duration, and is very effective for the recognition of overlapped, broken, touched, and loosely configured Chinese characters.

Key words: Handwritten Chinese character, Segmentation boundary, Segment, Duration


doi:10.1631/jzus.2004.1392 Document code: A CLC number: TP391.4

* Project (No. 60075007) supported by the National Natural Science Foundation of China

INTRODUCTION

Offline handwritten Chinese character reading by computer program is a complicated task. It is essential to separate a given character string correctly into the sequence of characters. Any failure or error in the segmentation step directly produces a negative effect on recognition (Lee and Lee, 1996).

The difficulty of handwritten Chinese character segmentation comes from the great variety of handwritings. In the case of hand-printed script (Fig.1a), segmentation is a relatively simple task. In the case of overlapped script (Fig.1b), broken characters (Fig.1c), touched characters (Fig.1d), loosely configured characters (Fig.1e), and mixed script (Fig.1f), segmentation is difficult. Overlapped, broken, touched and loosely configured characters are major causes of segmentation errors.

Fig.1 Types of handwriting
(a) Hand-printed; (b) Overlapped; (c) Broken; (d) Touched; (e) Loosely configured; (f) Mixed

Previous approaches to Chinese character recognition mainly follow the straight segmentation-then-recognition method, wherein the script is separated into a sequence of characters and each character is then discriminated individually (Kahan et al., 1987). Its limitation is its dependence on highly accurate segmentation. Such accuracy is very difficult to achieve for unconstrained handwritten Chinese characters, so script segmentation and character recognition must be combined. Recent developments in handwritten English character recognition by recognition-based segmentation can be found in (Nafiz and Fatos, 2002; Favata, 2001). In this recognition-based method, a group of candidate segmentation boundaries is found, and the candidate boundaries are confirmed by recognition results. This method is more reasonable than the first one for handwritten character recognition, but it depends heavily on the performance of the recognizer (Liang et al., 1994).

The recognition-based methodology proposed in this paper for handwritten Chinese character segmentation and recognition makes the best use of shape features and recognition results. For the mergence of consecutive segments, a method of variable duration for each character, based on statistics for English characters, is described in (Kim and Govindaraju, 1997). In this paper, the duration for each character is determined adaptively by shape and evaluation.

The set of candidate segmentation boundaries is crucial to handwritten Chinese character recognition. We make the following assumption: the correct segmentation boundaries are embedded in the set of candidate segmentation boundaries.

In order to verify the proposed method, the average stroke width is computed, then a set of candidate segmentation boundaries is explored, and segment-level evaluation based on combination of consecutive components of adaptive duration is described. A reduced directed graph is constructed according to the sequence of evaluation results. The final segmentation and recognition decisions are determined by searching for the path of minimal arithmetical mean dissimilarity.


ESTIMATION OF AVERAGE STROKE WIDTH

It is assumed that stroke width varies locally within a script depending on writing device, paper texture and pressure. Therefore, it is reasonable to obtain the average stroke width and use it adaptively in subsequent image processing procedures. We detect components of small size as noise, positions of small stroke width as candidate segmentation points, and shapes within some range of stroke width as components of characters.

To estimate the average stroke width, $MSW$, of a given handwritten character string (as shown in Fig.2a), the contours of that image are extracted. By tracing contours from the left-most column to the right-most column, the following distances (as shown in Fig.2b) are computed for each column: (1) distance between upper trace and lower trace of the outer contour, (2) distance between lower trace and upper trace of the inner contour, (3) distance between upper trace of the inner contour and upper trace of the outer contour, and (4) distance between lower trace of the inner contour and lower trace of the outer contour.

The histogram in Fig.2c shows the number of occurrences of each distance value for the image of Fig.2b. The distance value at which the histogram reaches its maximum is taken as the estimate of the average stroke width.

Fig.2 Estimation of stroke width
(a) Original image of text line; (b) Contour from image of text line; (c) Histogram of stroke width

We define a set of candidate segmentation points

$$SP_{MSW} = \{\, p \mid msw(p) < a \cdot MSW \,\} \qquad (1)$$

where $msw(p)$ is the stroke width at point $p$, and the coefficient $a$ ($0 < a < 1$) is set empirically.

Generally, the stroke widths at points of a ligature are clearly smaller than the average stroke width. Therefore, some segmentation points between touched characters are likely to be embedded in $SP_{MSW}$.


DETERMINATION OF CANDIDATE SEGMENTATION BOUNDARIES

To split handwritten Chinese characters into a sequence of consecutive segments, we apply a reduced nonlinear segmentation algorithm to the text image. The nonlinear segmentation algorithm (Lee and Lee, 1996; Nafiz and Fatos, 2002) performs the segmentation on the gray-scale image, whereas our reduced algorithm performs segmentation on the binary image only.

First, the potential segmentation regions are explored by analyzing the local variation of the pixel and stroke projection profiles (the numbers of occurrences of black pixels and strokes in each column, respectively). Determination of the potential segmentation regions in a string image is implemented in the following steps:
(1) A column is labeled a non-segmentation-region if its stroke projection value is locally minimal and its pixel projection value is locally maximal.
(2) Among the columns still unlabeled, those between two successive non-segmentation-regions belong to a segmentation region.

The potential segmentation regions carry candidate segmentation boundaries between characters. At most one candidate segmentation boundary is found in each potential segmentation region by the reduced nonlinear segmentation algorithm, which splits the connected characters or components. Fig.3b shows the candidate segmentation boundaries obtained from the reduced nonlinear segmentation algorithm.

Fig.3 Result of pre-segmentation
(a) Original text line image; (b) Candidate boundaries by nonlinear segmentation algorithm


SEGMENT-LEVEL EVALUATION

The candidate segmentation boundaries explored by the above reduced nonlinear segmentation algorithm often partition a single character into several components. In order to confirm the character segmentation boundaries and recognition results, a scheme based on evaluations of merged segments of adaptive durations is undertaken. The evaluations are represented as a directed evaluation graph, and the optimal character segmentation boundaries are confirmed by exploring a minimal mean cost path in the evaluation graph.

The component between two successive candidate segmentation boundaries is defined as a segment. The combinations of consecutive segments are fed to the recognizer, and a series of dissimilarity evaluation results is obtained for a sequence of combinations.

Suppose the set of candidate segmentation boundaries can be denoted as

$$SB = \{sp_0, sp_1, \ldots, sp_N\} \qquad (2)$$

where $N$ is the number of segments in the character string and $sp_k$ is the $k$th candidate segmentation boundary.

Suppose a handwritten Chinese character string can be denoted as

$$CC = \{s_0, s_1, \ldots, s_{N-1}\} \qquad (3)$$

where $s_k$ is the $k$th segment, i.e. the component between candidate boundaries $sp_k$ and $sp_{k+1}$.

Suppose the original text line can be referred to as

$$TL = \{c_0, c_1, \ldots, c_{M-1}\} \qquad (4)$$

where $M$ is the number of characters in the character string and $c_k$ is the $k$th character.

Our aim is to find a subset of $SB$ whose elements separate the text image into $M$ segments such that the arithmetical mean of the evaluations of those segments is minimal.

We define the notion of merged segments as

$$cs(s,e) = \{\, s_s \cup s_{s+1} \cup \cdots \cup s_{e-1} \,\} \qquad (5)$$

where $s$ is the start segmentation boundary of the merged segments, and $e$ is the end boundary of the merged segments.

Generally, an image of merged consecutive segments of width $csw$ and height $csh$ is potentially a character pattern when $csw < Th_1 \cdot csh$. The coefficient $Th_1$ is empirically set, generally the maximal from statistics. Therefore, the mergence of segments from $s$ is terminated when $csw \ge Th_1 \cdot csh$. An image of a segment of mass $csm$ could be thought of as a block of noise when $csm < T_m \cdot MSW^2$. Therefore, an image of a segment of mass $csm$ can be thought of as a character pattern or one component of a character pattern when $csm \ge T_m \cdot MSW^2$. An image of a segment of width $csw$ and height $csh$ is potentially a character pattern when $csw > Th_2 \cdot MSW$ or $csh > Th_3 \cdot MSW$. Therefore, an image of merged consecutive segments of width $csw$ and height $csh$ would not be considered a character pattern when $csw \le Th_2 \cdot MSW$ and $csh \le Th_3 \cdot MSW$. The coefficients $Th_2$ and $Th_3$ are empirically set, generally the minimal from statistics.

We define the evaluation result of merged segments as

$$D_s^e = D(cs(s,e)) \qquad (6)$$

where $D_s^e = \begin{bmatrix} d_s^e \\ r_s^e \end{bmatrix}$, and $r_s^e$ is the resulting character corresponding to the minimal dissimilarity, $d_s^e$.

Generally, an image of merged consecutive segments is potentially a character pattern when $d_s^e < T_a$. The coefficient $T_a$ is empirically set, generally the maximal from statistics. The merge of segments from $s$ to $e$ is not considered a candidate character pattern when $d_s^e \ge T_a$. An image of merged consecutive segments $cs(s,e)$ would not be thought of as a character pattern when $d_s^e > T_{a1} \cdot d_s^{e-1}$ or $d_s^{e+1} < T_{a2} \cdot d_s^e$. The coefficients $T_{a1}$ ($>1$) and $T_{a2}$ ($<1$) are empirically set.

Then we get the set of recognition results of the sequence of merged segments

$$SD = \{D_0^1, D_0^2, \ldots, D_0^{SW_0}, D_1^2, D_1^3, \ldots, D_1^{SW_1+1}, \ldots, D_{N-2}^{N-1}, D_{N-2}^{N}, D_{N-1}^{N}\} \qquad (7)$$

where $SW_i$ denotes the maximum number of consecutive segments, starting from the $i$th segment, to be combined.

Fig.4c represents the evaluation results of merged segments of adaptive duration in the form of a directed graph. The nodes denote candidate segmentation boundaries, the edges signify the merged segments, and the weights indicate the evaluation results, i.e. the dissimilarities. Fig.4b shows the evaluation results using fixed duration. Because the graph of adaptive duration has fewer edges, it accelerates the search for the final segmentation and recognition decision.


CHARACTER SEGMENTATION AND RECOGNITION

The final decisions on segmentation and recognition are made under minimal arithmetical mean dissimilarity, which is more reasonable than minimal accumulated dissimilarity. We define a path as a set of edges such that the ending node can be reached from the starting node by consecutively traversing them. The sequence of possible paths on the evaluation graph can be denoted as

$$P = \{\, p_k \mid k = 0, 1, \ldots, PN-1 \,\} \qquad (8)$$

where $p_k$ is a path and $PN$ is the number of possible paths in the evaluation graph. Let $d_i^j$ be the dissimilarity from node $i$ to node $j$, $E(p_k)$ be the set of edges on the path $p_k$, and $|E(p_k)|$ be the number of edges in $E(p_k)$. The mean dissimilarity of path $p_k$ can be computed as follows:

$$d(p_k) = \frac{\sum_{\langle i,j \rangle \in E(p_k)} d_i^j}{|E(p_k)|} \qquad (9)$$

The optimal character segmentation path can be confirmed by searching for the path of minimal mean dissimilarity,

$$p_m = \arg\min_{p_k} \{ d(p_k) \} \qquad (10)$$

The dashed edges shown in Fig.4b and Fig.4c indicate the optimal character segmentation paths. Based on the minimal mean dissimilarity segmentation path, the segmentation boundaries (i.e. the nodes on the optimal segmentation path) are picked up. The bounding boxes of characters between two successive segmentation boundaries are drawn with straight lines (Fig.4d). The evaluation results on the merged segments in the bounding boxes give the final recognition results.

Fig.4 Decision on segmentation and recognition
(a) Candidate segmentation boundaries; (b) Evaluation graph of fixed duration; (c) Evaluation graph of adaptive duration; (d) Final segmentation decision

Suppose the minimal mean dissimilarity segmentation path can be referred to as:

$$p_m = \{n_0, n_1, \ldots, n_M\} \qquad (11)$$

where $n_i$ is the $i$th node on the path of minimal mean dissimilarity.

The values of dissimilarity $d_{n_i}^{n_{i+1}}$ and character $r_{n_i}^{n_{i+1}}$ are computed as follows:

$$d_{n_i}^{n_{i+1}} = \min_{0 \le k < VN} d\big(V(cs(n_i, n_{i+1})), VT_k\big) \qquad (12)$$

$$r_{n_i}^{n_{i+1}} = \arg\min_{0 \le k < VN} d\big(V(cs(n_i, n_{i+1})), VT_k\big) \qquad (13)$$

where $d(\cdot)$ indicates dissimilarity computation between two feature vectors, operator $V(\cdot)$ performs feature extraction for merged segments, $VT_k$ is the feature vector of the $k$th character template, and $VN$ is the number of character templates.

Fig.4d shows the final decision on segmentation.


EXPERIMENTAL RESULTS AND DISCUSSION

Experiments for handwritten Chinese character string segmentation and recognition were carried out with text line images from 500 Chinese mail address lines. Those lines contain many overlapped, touched, broken, and loosely configured characters. The input images to these experiments are binary, without any smoothing or correction operations.

Given a script line image, the result of the candidate segmentation boundaries is considered to be correct if there exists a path from source to sink in the word graph. There are almost no cases that are not completely partitioned in those text line images. The final character segmentation and recognition decisions are highly inter-related, and the correct recognition accuracy of the top 5 candidate characters for this kind of script line is 93%. The correct recognition accuracy of address words is greater than 90%.

Because of the variety of character shapes arising from handwriting styles, we make decisions about segmentation and recognition based on the evaluation of merged segments, instead of on characterizations of character shape such as aspect ratio and width. The proposed strategy provides a recognition solution not only for touched and overlapped characters (as shown in Fig.5 and Fig.4), but also for combinations of broken and loosely configured characters (as shown in Fig.4). The methodology is effective for segmentation and recognition decisions on the above characters, and greatly improved the recognition performance.

The proposed strategy of adaptive duration requires much less computation than that of fixed duration, and improves the performance of segmentation and recognition because of the optimization rule of minimal mean dissimilarity.

Two main sources of segmentation failure remain: (1) multi-touched character strings, and (2) the accuracy of evaluations. The variation of handwritings decreases the accuracy of evaluation, which negatively affects the final recognition decision.

Fig.5 Experimental results
(a) Text line image; (b) Candidate segmentation boundaries; (c) Final segmentation and recognition result


CONCLUSION

We encountered difficulties from broken, overlapped, touched, and loosely configured characters in handwritten Chinese character segmentation and recognition. In this paper, we proposed a new strategy for handwritten Chinese character string segmentation and recognition that combines consecutive segments within a sliding window of adaptive duration. The segmentation and recognition are highly inter-related. Our experiments indicate that many of the errors from these problems can be tackled by the proposed approach.

There are still some ways to improve the segmentation and recognition: (1) exploiting context information from a lexicon, and (2) segmenting the string into a sequence of finer segments, which will certainly improve the segmentation at the expense of more computation.

References
Favata, J.T., 2001. Offline general handwritten word recognition using an approximate BEAM matching algorithm. IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(9):1009-1021.
Kahan, S., Pavlidis, T., Baird, H.S., 1987. On the recognition of printed characters of any fonts and sizes. IEEE Trans. on Pattern Analysis and Machine Intelligence, 9(2):274-288.
Kim, G., Govindaraju, V., 1997. A lexicon driven approach to handwritten word recognition for real-time applications. IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(4):366-379.
Lee, S.W., Lee, D.J., 1996. A new methodology for gray-scale character segmentation and recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18(10):1045-1050.
Liang, S., Shridhar, M., Ahmadi, M., 1994. Segmentation of touching characters in printed document recognition. Pattern Recognition, 27(6):825-840.
Nafiz, A., Fatos, T.Y., 2002. Optical character recognition for cursive handwriting. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(6):801-813.
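
As an illustration of the section ESTIMATION OF AVERAGE STROKE WIDTH and of Eq.(1), the following minimal Python sketch estimates the average stroke width as the mode of a distance histogram and then collects candidate segmentation points. It is not the authors' implementation: instead of tracing inner and outer contours, it assumes that vertical runs of black pixels in each column are an acceptable stand-in for the four contour distances. The function names, the toy image DEMO, and the value a = 0.6 are hypothetical choices made only for this example.

```python
from collections import Counter

def vertical_runs(image):
    """Yield (column, start_row, length) for each vertical run of black (1) pixels."""
    rows, cols = len(image), len(image[0])
    for x in range(cols):
        y = 0
        while y < rows:
            if image[y][x]:
                start = y
                while y < rows and image[y][x]:
                    y += 1
                yield x, start, y - start
            else:
                y += 1

def estimate_msw(image):
    """Return the most frequent run length, used here as the average stroke width."""
    hist = Counter(length for _, _, length in vertical_runs(image))
    if not hist:
        raise ValueError("image contains no black pixels")
    # Ties are broken toward the smaller width (an arbitrary assumption).
    return min(hist, key=lambda w: (-hist[w], w))

def candidate_points(image, a=0.6):
    """Runs thinner than a * MSW, in the spirit of Eq.(1); a is set empirically."""
    msw = estimate_msw(image)
    return [(x, y) for x, y, length in vertical_runs(image) if length < a * msw]

# Toy binary image: a stroke two pixels thick with a thin ligature in column 2.
DEMO = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
```

On this toy image the thin joint in column 2 is the only position whose width falls below a·MSW, which mirrors the observation that ligatures between touched characters are thinner than the average stroke.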

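
The two labeling steps in the section DETERMINATION OF CANDIDATE SEGMENTATION BOUNDARIES can be sketched in the same spirit. The code below is only an illustration under stated assumptions: "locally minimal" and "locally maximal" are interpreted as a non-strict comparison with the two immediate neighbouring columns, which the paper does not specify, and the function names are hypothetical. The step that places at most one boundary inside each region (the reduced nonlinear segmentation itself) is not shown.

```python
import operator

def projections(image):
    """Return (pixel_projection, stroke_projection) for each column of a binary image."""
    pixel, stroke = [], []
    for x in range(len(image[0])):
        column = [row[x] for row in image]
        pixel.append(sum(column))
        # A stroke is counted wherever a black pixel follows a white one or the top edge.
        stroke.append(sum(1 for y, v in enumerate(column)
                          if v and (y == 0 or not column[y - 1])))
    return pixel, stroke

def is_local(values, i, compare):
    """True if values[i] compares favourably with both immediate neighbours."""
    left = values[i - 1] if i > 0 else values[i]
    right = values[i + 1] if i + 1 < len(values) else values[i]
    return compare(values[i], left) and compare(values[i], right)

def segmentation_regions(pixel, stroke):
    """Return inclusive (first_column, last_column) ranges of potential segmentation regions."""
    # Step (1): non-segmentation columns have few strokes and many black pixels.
    non_seg = [i for i in range(len(pixel))
               if is_local(stroke, i, operator.le) and is_local(pixel, i, operator.ge)]
    # Step (2): unlabeled columns between two successive non-segmentation columns.
    return [(a + 1, b - 1) for a, b in zip(non_seg, non_seg[1:]) if b - a > 1]
```

For example, with pixel projection [1, 5, 2, 1, 2, 6, 1] and stroke projection [1, 1, 2, 1, 2, 1, 1], columns 1 and 5 are labeled non-segmentation columns and the single potential segmentation region is columns 2 to 4.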
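
The path search of Eqs.(9) and (10) can also be sketched briefly. The paper does not state which search procedure was used; the code below assumes a simple dynamic program over (node, number of edges), which is one standard way to minimize a mean over paths in a directed acyclic graph. The function name and the edge dictionary format are hypothetical.

```python
import math

def min_mean_path(n_nodes, edges):
    """Find the path 0 -> n_nodes-1 with minimal mean edge dissimilarity.

    edges maps (i, j) with i < j to the dissimilarity of merging segments i .. j-1.
    Returns (list_of_nodes, mean_dissimilarity).
    """
    last = n_nodes - 1
    # best[j][k]: smallest total dissimilarity of a path 0 -> j using exactly k edges.
    best = [[math.inf] * n_nodes for _ in range(n_nodes)]
    prev = [[None] * n_nodes for _ in range(n_nodes)]
    best[0][0] = 0.0
    # Visiting edges by increasing end node guarantees best[i] is final before it is used.
    for (i, j), d in sorted(edges.items(), key=lambda item: item[0][1]):
        for k in range(1, n_nodes):
            candidate = best[i][k - 1] + d
            if candidate < best[j][k]:
                best[j][k] = candidate
                prev[j][k] = i
    counts = [k for k in range(1, n_nodes) if best[last][k] < math.inf]
    if not counts:
        raise ValueError("no path from the first to the last boundary")
    k_best = min(counts, key=lambda k: best[last][k] / k)   # Eq.(10)
    mean = best[last][k_best] / k_best                      # Eq.(9)
    # Walk the predecessor table backwards to recover the boundary nodes.
    path, j, k = [last], last, k_best
    while k > 0:
        j = prev[j][k]
        path.append(j)
        k -= 1
    return path[::-1], mean
```

With four boundaries and edges {(0,1): 0.25, (1,2): 0.25, (2,3): 0.25, (0,3): 0.5, (0,2): 1.0}, the minimal mean path is 0, 1, 2, 3 with mean 0.25, whereas minimal accumulated dissimilarity would select the single edge (0,3) with total 0.5. This small case shows how the two criteria can disagree.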