Writer Identification Of Handwritten Oriya Script

Barid Baran Nayak

Dept. of ECE
NIT Rourkela, India

Partha Pratim Roy

Dept. of CS
IIT Roorkee, India

Umapada Pal

ISI Kolkata, India

In handwritten writer identification and character recognition, we
perform an image-based analysis: a scanned digital image containing
handwritten script is taken as input, and the system translates it into
a machine-editable digital text format. The Oriya language presents
great challenges because of the large number of letters in its
alphabet, the sophisticated ways in which they combine, and the fact
that many letters are roundish and look alike.
In this project, an attempt is made to recognize writers using
histogram of gradient features of the character images. The features so
obtained are passed to the Hidden Markov Model (HMM) stage, which
outputs the identification result.

Keywords: character recognition, writer identification, histogram of
gradient, Hidden Markov Model (HMM)

INTRODUCTION:

Oriya is one of the many official languages of India; it is the
official language of Odisha and the second official language of
Jharkhand. Because it is an old language, many old documents exist
whose writers are unknown. This project addresses that problem: its
main aim is to identify the writer of a document. The other part of
the project is to recognize each written character.
Due to the presence of complex features such as the headline, vowels,
modifiers, etc., character segmentation in Oriya script is not easy.
Also, the positions of vowels and compound characters make segmenting
words into characters very complex. To handle this problem, we tried a
novel method that combines a zone-wise breakup of words with HMM-based
recognition. In particular, the word image is segmented into three
zones: upper, middle and lower. The components in the middle zone are
modelled using HMMs. This zone segmentation reduces the number of
distinct component classes compared with the total number of classes in
the Oriya character set. Once the middle zone is recognized, HMM-based
forced alignment is applied in this zone to mark the boundaries of
individual components. The segmentation paths are later extended to the
other zones. Next, the residue components, if any, in the upper and
lower zones within their respective boundaries are combined with the
middle-zone result to achieve the final word-level recognition.
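
The zone breakup described above can be illustrated with a minimal sketch. The text does not state how the zone boundaries are located, so this sketch assumes a simple rule: the middle zone is the band of rows whose ink count reaches a fraction of the densest row. The function name split_zones, the parameter frac and the toy image are hypothetical and are not taken from the actual system.

```python
def row_profile(img):
    """Number of ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in img]


def split_zones(img, frac=0.5):
    """Split a binary word image into (upper, middle, lower) row blocks.

    Assumption (not from the paper): the middle zone is the band between
    the first and last rows whose ink count reaches `frac` of the densest
    row. Rows above that band form the upper zone, rows below the lower.
    """
    profile = row_profile(img)
    peak = max(profile)
    if peak == 0:
        raise ValueError("image contains no ink")
    dense = [i for i, count in enumerate(profile) if count >= frac * peak]
    top, bottom = dense[0], dense[-1] + 1
    return img[:top], img[top:bottom], img[bottom:]


# Toy 6x6 word image: a thin stroke above and below a dense core.
word = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
]
upper, middle, lower = split_zones(word)
```

In this toy case the sparse rows at the top and bottom fall into the upper and lower zones, and only the dense core would be passed to the HMM stage.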
Earlier, a template-based approach was followed for recognition. In
this approach, an unknown pattern was superimposed on the ideal
template, and the degree of correlation between the two was used for
classification. This approach became ineffective because of noise and
variations in handwriting. Hence, nowadays a feature-based approach is
used.
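
As an illustration of a feature-based description, the sketch below computes a simplified gradient orientation histogram for a single image cell, in the spirit of the histogram of gradient features named in the abstract. It is an assumption-laden simplification: the function name hog_cell, the nine unsigned orientation bins and the central-difference gradients are choices made here for illustration, and the cell and block layout of the actual system is not specified in the text.

```python
import math


def hog_cell(img, bins=9):
    """Gradient orientation histogram for one grayscale cell.

    Simplified sketch: central differences on interior pixels, unsigned
    orientations in [0, 180) degrees, and each pixel votes its gradient
    magnitude into a single bin (no interpolation between bins).
    The histogram is L2-normalised before it is returned.
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# Toy cell containing a single vertical edge (dark left, bright right).
cell = [[0, 0, 1, 1] for _ in range(4)]
features = hog_cell(cell)
```

For this vertical edge, every interior gradient points horizontally, so all of the weight lands in the first orientation bin. Unlike a template, such a histogram changes only gradually under small shifts of the stroke.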


Figure 1: Original Oriya script

Line segmentation:

Word segmentation:

Zone segmentation:

Figure 4: (a) Original Word. (b) Zone segmented word


Figure 5: Character segmentation from words.

Writer identification was successfully carried out, and significant
results were obtained. A scheme for segmentation of unconstrained Oriya
handwritten text into lines, words and characters is proposed in this
paper. First, the text image is segmented into lines, and then the
lines are segmented into individual words. Next, for character
segmentation from words, isolated and connected (touching) characters
in a word are first detected. Using structural, topological and water
reservoir concept-based features, the touching characters of the word
are then segmented into isolated characters. To the best of our
knowledge, this is the first work of its kind on Oriya text. The
proposed water reservoir-based approach can also be used for other
Indian scripts where touching patterns show similar behavior.
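
The line and word steps of this scheme can be sketched with projection profiles. This is only an assumed illustration: the text does not say which method is used to find lines and words, the helper names runs, segment_lines and segment_words and the min_gap parameter are hypothetical, and the water reservoir step for touching characters is not covered here.

```python
def runs(profile, min_gap=1):
    """Return (start, end) pairs of ink runs in a projection profile.

    `end` is exclusive. Runs separated by fewer than `min_gap` empty
    positions are merged into one segment.
    """
    segments = []
    start = None
    gap = 0
    for i, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                segments.append((start, i - gap + 1))
                start = None
    if start is not None:
        segments.append((start, len(profile) - gap))
    return segments


def segment_lines(page):
    """Row ranges of text lines, from the horizontal projection profile."""
    return runs([sum(row) for row in page])


def segment_words(line, min_gap=2):
    """Column ranges of words, from the vertical projection profile."""
    return runs([sum(col) for col in zip(*line)], min_gap)


# Toy binary page: two text lines separated by one empty row.
page = [
    [1, 1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
]
lines = segment_lines(page)
words = [segment_words(page[top:bottom]) for top, bottom in lines]
```

In the toy page the first line splits into two words because the gap between them is two columns wide, while the one-column gap in the second line is treated as spacing inside a single word.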







