
Online Handwritten Cursive Word Recognition

Abstract

This paper describes an approach to online handwritten cursive word recognition that combines
segmentation-free and segmentation-based methods. To find the optimal segmentation and
recognition path as the recognition result, we apply two methods, segmentation-free and
segmentation-based, and expand the search space using a character-synchronous beam
search strategy. Probable search paths are evaluated by integrating character recognition
scores with geometric characteristics of the character patterns in a Conditional Random Field
(CRF) model. We compare online handwritten cursive word recognition using the
segmentation-free method with that using the segmentation-based method, and then attempt to
combine the two methods to improve performance. Our methods restrict the search paths using
the trie lexicon of words and the preceding paths during path search. We report this comparison
on a publicly available data set (IAM-OnDB).

Chapter I

Introduction

The development of pen-based or touch-based input devices such as tablets and smartphones has
led to a push towards more fluid interaction with these electronics. Realizing online handwritten
character recognition with high performance is vital, especially for applications such as the
natural input of text on smartphones, to provide a satisfactory user experience. Without
character recognition cues, characters cannot be segmented unambiguously, because the
segmentation points between characters are not obvious. A feasible way to overcome this
ambiguity is integrated segmentation and recognition, which is classified into segmentation-free
and segmentation-based methods. The segmentation-based method attempts to split cursive
words into character patterns at their true boundaries and to label the split character patterns. A
word pattern is over-segmented into primitive segments such that each segment comprises a
single character or part of a character. The segments are then combined to generate candidate
character patterns (forming a candidate lattice), which are evaluated using character recognition
together with geometric and linguistic contexts.
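The over-segmentation step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; representing segments as strings and limiting each candidate character to `maxLen` segments are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class CandidateLattice {
    // Combine runs of consecutive primitive segments into candidate
    // character patterns; each entry records the segment span it covers.
    public static List<String> candidates(String[] segments, int maxLen) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < segments.length; i++) {
            StringBuilder pattern = new StringBuilder();
            for (int j = i; j < segments.length && j - i < maxLen; j++) {
                pattern.append(segments[j]);
                out.add(i + "-" + j + ":" + pattern);  // span "i-j" and its merged pattern
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Three primitive segments, at most two segments per character.
        for (String c : candidates(new String[]{"a", "b", "c"}, 2)) {
            System.out.println(c);
        }
    }
}
```

In a full recognizer, each of these candidate patterns would then be scored by the character classifier and the geometric/linguistic context models.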

Chapter II

System Analysis

System analysis is the overall analysis of a system before implementation, carried out to arrive at
a precise solution. Careful analysis of a system before implementation prevents post-
implementation problems that might arise from a poor analysis of the problem statement.

Analysis is thus the first crucial step: a detailed study of the various operations performed by a
system and their relationships within and outside of the system. Analysis also defines the
boundaries of the system, and is followed by design and implementation.

Existing System

The segmentation-free method, on the other hand, avoids the problems associated with
segmentation. It exploits one-dimensional structural models such as HMMs or MRFs,
concatenating character models to construct word models from the given lexicon of words
during recognition, so that segmentation points are selected or discarded implicitly by the
character models. Online handwritten word recognition using a recurrent neural network
performs word recognition by continuously moving the input window of the network across the
frame-sequence representation of a word, thus generating activation traces at the output of the
network. These output traces are subsequently examined to determine the ASCII string(s) best
representing the word image. Which of the segmentation-free and segmentation-based methods
is better for handwritten string recognition has attracted considerable attention and discussion.

Proposed System

In this paper, we compare the two methods for online handwritten cursive word recognition, and
then attempt to combine them to improve performance. The combined method searches for the
optimal path as the recognition result. We expand the search space using a character-synchronous
beam search strategy, and the probable search paths are evaluated by a path evaluation criterion
in a CRF model. To evaluate character patterns, we combine an MRF model with a
P2DBMN-MQDF (pseudo 2D bi-moment normalization - modified quadratic discriminant
function) recognizer. The rest of this paper is organized as follows: Section 2 begins with a
description of the preprocessing steps, Section 3 describes the CRFs for our word recognition,
Section 4 presents the two search methods (segmentation-free and segmentation-based) and their
combination, Section 5 presents the experimental results, and Section 6 presents our conclusions.
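A character-synchronous beam search keeps only the best-scoring partial paths at each expansion step. The pruning step can be sketched as below; using plain numeric path scores (rather than the full CRF evaluation criterion) and a fixed beam width are simplifications for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BeamPruning {
    // Keep only the top-k scoring partial paths (higher score = better).
    public static List<Double> prune(List<Double> pathScores, int beamWidth) {
        List<Double> sorted = new ArrayList<>(pathScores);
        sorted.sort(Collections.reverseOrder());
        return sorted.subList(0, Math.min(beamWidth, sorted.size()));
    }

    public static void main(String[] args) {
        // Five partial paths pruned to a beam of width 2.
        System.out.println(prune(List.of(0.1, 0.9, 0.5, 0.3, 0.7), 2));
    }
}
```

Pruning bounds the search space at each character position, which is what makes the combined segmentation/recognition search tractable.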

Proposed System Algorithm

The simulated annealing (SA) approach is a generic, probabilistic meta-heuristic for solving
difficult optimization problems. It can tackle combinatorial optimization problems with large
search spaces; that is, it can find good approximate solutions to the global optimum by
randomized heuristic moves. In this section, we develop an SA-based algorithm to solve the
proposed mathematical optimization formulation (IP2) for the GAS problem.
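As a generic illustration of the SA loop (not the IP2/GAS formulation itself, which is not specified here), the sketch below minimizes a simple one-dimensional cost. The cost function, step size, initial temperature and cooling rate are all assumptions chosen for the example.

```java
import java.util.Random;

public class SimulatedAnnealingSketch {
    // Toy objective to minimize; (x - 3)^2 stands in for the real cost.
    static double cost(double x) { return (x - 3) * (x - 3); }

    // Generic SA loop: accept worse moves with probability exp(-delta / T).
    public static double anneal(long seed) {
        Random rng = new Random(seed);
        double x = 0.0, best = x;
        double t = 1.0;                                      // initial temperature
        for (int i = 0; i < 10_000; i++) {
            double candidate = x + (rng.nextDouble() - 0.5); // random neighbor
            double delta = cost(candidate) - cost(x);
            if (delta < 0 || rng.nextDouble() < Math.exp(-delta / t)) {
                x = candidate;                               // accept the move
            }
            if (cost(x) < cost(best)) best = x;              // track best-so-far
            t *= 0.999;                                      // geometric cooling
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("best x found: " + anneal(42));
    }
}
```

For the GAS problem, `cost` would be the IP2 objective and the neighbor move would perturb a candidate assignment instead of a scalar.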

Contrast enhancement

Contrast enhancement techniques have been widely used in many applications of image
processing where the subjective quality of images is important for human interpretation. Contrast
is the difference in visual properties that makes an object (or its representation in an image)
distinguishable from other objects and from the background. In visual perception of the real
world, contrast is determined by the difference in the color and brightness of an object and other
objects within the same field of view. In other words, it is the difference between the darker and
lighter pixels of the image: if this difference is large the image has high contrast, and otherwise
the image has low contrast.

Contrast is an important factor in any subjective evaluation of image quality. We apply this fact
to design a contrast enhancement method that improves the local image contrast by controlling
the local image gradient. We pose contrast enhancement as an optimization problem that
maximizes the average local contrast of an image. The optimization formulation includes a
perceptual constraint derived directly from the human suprathreshold contrast sensitivity
function.
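The method above poses enhancement as a constrained optimization. As a much simpler point of comparison (an assumption for illustration, not the optimization-based method itself), a global min-max stretch raises contrast by mapping the observed gray range onto [0, 255]:

```java
public class ContrastStretch {
    // Global min-max stretch over a flat array of 8-bit gray values.
    public static int[] stretch(int[] gray) {
        int min = 255, max = 0;
        for (int p : gray) { min = Math.min(min, p); max = Math.max(max, p); }
        if (max == min) return gray.clone();             // flat image: nothing to do
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = (gray[i] - min) * 255 / (max - min); // map [min,max] -> [0,255]
        }
        return out;
    }

    public static void main(String[] args) {
        for (int p : stretch(new int[]{50, 100, 150})) System.out.println(p);
    }
}
```

Unlike the optimization formulation, this global stretch cannot adapt to local gradients, which is precisely what the perceptual constraint in the proposed method addresses.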

Chapter III

Feasibility Study

The preliminary investigation examines project feasibility: the likelihood that the system will be
useful to the organization. The main objective of the feasibility study is to test the technical,
operational and economic feasibility of adding new modules and debugging the old running
system. All systems are feasible if they are given unlimited resources and infinite time. Three
aspects are considered in the feasibility study portion of the preliminary investigation:

Operational Feasibility

The smart audit application does not require additional manual involvement or labor for
maintenance of the system. The cost of training is minimized by the user-friendliness of the
developed application, and recurring expenditure on consumables and materials is minimized.

Technical Feasibility

Keeping in mind the existing network, software and hardware already available, the audit
application, developed in Java, is provided as an executable that requires only Tomcat, giving
compatibility from Windows 98 onward without having to install additional Java software. No
additional hardware or software is required, which makes smart audit technically feasible.

Economic Feasibility

The system is economically feasible keeping in mind:

 Lesser investment towards training.
 One-time investment towards development.
 Minimized recurring expenditure towards training, facilities offered and consumables.

The system as a whole is economically feasible over a period of time.

Chapter IV
System Design

System design concentrates on moving from the problem domain to the solution domain. This
important phase is composed of several steps. It provides the understanding and procedural
details necessary for implementing the system recommended in the feasibility study. Emphasis
is on translating the performance requirements into design specifications.

The design of any software involves mapping the software requirements into functional
modules. Developing a real-time application or any system utility involves two processes: the
first is to design the system, and the second is to construct the executable code.

Software design has evolved from an intuitive art dependent on experience to a science that
provides systematic techniques for software definition. Software design is the first step in the
development phase of the software life cycle.

Before designing the system, user requirements were identified and information was gathered
to verify the problem and evaluate the existing system. A feasibility study was conducted to
review alternative solutions and provide cost and benefit justification, on the basis of which the
proposed system is recommended. At this point the design phase begins.

The process of design involves conceiving a solution, planning it out, and recording it. In
software design there are three distinct activities: external design, architectural design and
detailed design. Architectural design and detailed design are collectively referred to as internal
design. External design involves conceiving, planning out and specifying the externally
observable characteristics of a software product.

INPUT DESIGN:

Systems design is the process of defining the architecture, components, modules, interfaces, and
data for a system to satisfy specified requirements. Systems design can be seen as the
application of systems theory to product development. There is some overlap with the disciplines
of systems analysis, systems architecture and systems engineering.

Input Design is the process of converting a user oriented description of the inputs to a computer-
based business system into a programmer-oriented specification.

• Input data were found to be available for establishing and maintaining master and
transaction files and for creating output records.

• The most suitable types of input media, for either off-line or on-line devices, were
selected after a study of alternative data capture techniques.

INPUT DESIGN CONSIDERATIONS

• The field length must be documented.

• The sequence of fields should match the sequence of the fields on the source document.

• The data format must be identified to the data entry operator.

Design input requirements must be comprehensive; product complexity and the risk associated
with its use dictate the amount of detail.

• These specify what the product does, focusing on its operational capabilities and the
processing of inputs and resultant outputs.

• These specify how much or how well the product must perform, addressing such issues
as speed, strength, response times, accuracy, limits of operation, etc.

OUTPUT DESIGN:

A quality output is one which meets the requirements of the end user and presents the
information clearly. In any system, the results of processing are communicated to the users and
to other systems through outputs.

In output design it is determined how the information is to be displayed for immediate need, as
well as the hard copy output. Output is the most important and most direct source of information
for the user.

Efficient and intelligent output design improves the system’s relationship with the user and
supports decision-making.

1. Designing computer output should proceed in an organized, well thought out manner;
the right output must be developed while ensuring that each output element is designed so
that people will find the system easy to use and effective. When analysts design
computer output, they should identify the specific output that is needed to meet the
requirements.

2. Select methods for presenting information.

3. Create document, report, or other formats that contain information produced by the
system.

The output form of an information system should accomplish one or more of the following
objectives:

• Convey information about past activities, current status or projections of the future.

• Signal important events, opportunities, problems, or warnings.

• Trigger an action.

• Confirm an action.

Architecture Diagram

System Architecture

We apply a segmentation-free MRF model of one-dimensional structure for online handwritten
English cursive word recognition. It extracts feature points along the pen-tip trace from
pen-down to pen-up. It uses the feature point coordinates as unary features and the differences in
coordinates between neighboring feature points as binary features. Each character is modeled
as an MRF, and word MRFs are constructed by concatenating character MRFs according to a trie
lexicon of words during recognition. The method expands the search space using a
character-synchronous beam search strategy to search the segmentation and recognition paths,
restricting the search paths by the trie lexicon of words, the preceding paths, and the lengths of
feature-point sequences during path search.
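The unary and binary features described above can be sketched as follows. Representing each feature point as an (x, y) pair in an int array is an assumption for the example; the points themselves serve as the unary features.

```java
public class StrokeFeatures {
    // Binary features of the MRF model: differences in coordinates between
    // neighbouring feature points along the pen-tip trace.
    public static int[][] binaryFeatures(int[][] points) {
        int[][] diffs = new int[points.length - 1][2];
        for (int i = 0; i < points.length - 1; i++) {
            diffs[i][0] = points[i + 1][0] - points[i][0];  // dx
            diffs[i][1] = points[i + 1][1] - points[i][1];  // dy
        }
        return diffs;
    }

    public static void main(String[] args) {
        int[][] d = binaryFeatures(new int[][]{{0, 0}, {3, 4}, {5, 4}});
        for (int[] v : d) System.out.println(v[0] + "," + v[1]);
    }
}
```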

Dataflow Diagram

The Data Flow Diagram (DFD) is a graphical model showing the inputs, processes, storage and
outputs of a system procedure in structured analysis. A DFD is also known as a bubble chart.

The data flow diagram provides additional information that is used during the analysis of the
information domain, and serves as a basis for the modeling of functions. The description of each
function presented in the DFD is contained in a process specification called a PSPEC.

DFD Symbols

Data Flow
Arrows marking the movement of data through the system indicate data flows. A data flow is the
pipeline carrying packets of data from an identified point of origin to a specific destination.

Process
Bubbles or circles are used to indicate where incoming data flows are processed and
transformed into outgoing data flows. The processes are numbered and named to indicate their
occurrence in the system flow.

External Entity
A rectangle indicates any source or destination of data. The entity can be a class of people, an
organization or even another system. The function of the external entity is to supply data to, or
receive data from, the system. External entities have no interest in how the data are transformed.

Data Store
A data store is denoted by an open rectangle and is used to identify holding points. Programs and
subsystems have complex interdependencies, including flow of data, flow of control and
interaction with data stores.

Start
↓
Read image
↓
Segment the image into different partitions
↓
Find the text region
↓
Invert process on the image
↓
Boundary of the image
↓
Stop

Dataflow diagram

Chapter V

Literature Survey

1. HMM-based on-line recognition of handwritten whiteboard notes (2006)
Methodology: For classification, Hidden Markov Models are used together with a statistical
language model.
Limitation: The on-line recognition system is presented together with an off-line recognizer;
an improved recognition performance can be expected from such a combination.

2. Character recognition systems (2007)
Methodology: OCR technology: scanning, recognition, and reading text. Initially, a printed
document is scanned by a camera.
Limitation: Current generation OCR systems provide very good accuracy and formatting
capabilities, but at prices that are up to ten times.

3. A novel approach to on-line handwriting recognition based on bidirectional long
short-term memory networks (2007)
Methodology: Bidirectional recurrent neural network with the long short-term memory
architecture.
Limitation: The problem of out-of-vocabulary (OOV) words; this could be addressed by using
the network likelihood and the edit distance to the nearest vocabulary word.

4. On-line handwritten Japanese character recognition (2011)
Methodology: Markov random field (MRF) model with conditional random field (CRF).
Limitation: Remaining issues include incorporating more effective unary and binary features,
exploiting better optimization of the CRF weighting parameters, and speeding up recognition.

5. An approach for real-time recognition of online Chinese handwritten sentences (2012)
Methodology: Multiple contexts including character classification, linguistic context and
geometric context.
Limitation: The nearest prototype classifier may not be a good choice; implementation of a
language association module would vastly improve text input efficiency, because many
characters could be input automatically without writing.

Chapter VI

System Requirements

The hardware and software specification gives the minimum hardware and software required to
run the project. The hardware configuration specified below is not by any means the optimal
hardware requirement. The software specification is likewise just the minimum requirement, and
the performance of the system may be slow on such a system.

Hardware Requirements

 Machine: Intel Core i3


 RAM: 256 MB

Software Requirements

 Operating System: Windows


 Front End: NetBeans

 Back End: MySQL

Chapter VII

System Implementation
Implementation is the stage of the project in which the theoretical design is turned into a working
system. The implementation phase constructs, installs and operates the new system. The most
crucial requirement for a successful new system is that it works efficiently and effectively.
Several activities are involved in implementing a new project:
• End user training
• End user education
• Training on the application software

Modules

The “Online handwritten Cursive Word Recognition” consists of five main modules.

• Upload dataset
• Gray scale conversion
• Find the Text Region
• Invert of the image
• Identify the boundary of the image

Modules Description

Upload dataset

First, the dataset is uploaded. Digitization is the process of converting a paper-based handwritten
document into electronic form. The conversion is accomplished by scanning the document to
produce an electronic representation of the original, in the form of a bitmap image. Digitization
produces the digital image, which is fed to the preprocessing phase.

Gray scale conversion

The grayscale word images extracted from the database are first converted into their respective
binarized images, as mentioned earlier, using an adaptive thresholding technique. The adaptive
threshold value is taken as the average of the maximum and minimum gray-scale values of the
respective image. The design of a feature set has always been a challenging issue in the pattern
recognition domain, and the challenge increases when the documents are handwritten.
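The threshold rule above (the average of the maximum and minimum gray values) can be sketched as below. Representing the image as a flat array of 8-bit gray values, and mapping lighter-than-threshold pixels to 1 (background) and darker pixels to 0 (ink), are assumptions for the example.

```java
public class AdaptiveBinarizer {
    // Threshold = average of the max and min gray values of the image.
    public static int threshold(int[] gray) {
        int min = 255, max = 0;
        for (int p : gray) { min = Math.min(min, p); max = Math.max(max, p); }
        return (min + max) / 2;
    }

    // Binarize: 1 = lighter than threshold (background), 0 = darker (ink).
    public static int[] binarize(int[] gray) {
        int t = threshold(gray);
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) out[i] = gray[i] > t ? 1 : 0;
        return out;
    }

    public static void main(String[] args) {
        for (int b : binarize(new int[]{10, 200, 100})) System.out.println(b);
    }
}
```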

Find the Text Region

Each of the isolated words is subdivided into eight horizontal imaginary regions. The selection of
these regions is based on the script's basic characteristics. Experimental analysis reveals that in
most of the words, character components or modified shapes extend above the headline or lie
below the actual characters. The components that extend over the headline are called ascenders,
and the character components or modified shapes that lie below the actual characters are called
descenders.

Invert of the image

We applied a series of invert operations to enhance the text image; these operations include noise
reduction, skew detection, line extraction and normalization. The most important step is then
feature extraction: each line image is transformed into a sequence of feature vectors using
vertical sliding windows moved along the line image. The training procedure starts with the
sequence of feature vectors and the corresponding transcriptions of handwritten text, and in the
recognition procedure the Viterbi algorithm is used to output the recognized text lines.
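The basic invert operation itself is a per-pixel complement; for 8-bit gray values (an assumption for the example) it can be sketched as:

```java
public class ImageInvert {
    // Invert an 8-bit grayscale image: each pixel p becomes 255 - p,
    // so dark ink becomes light and the light background becomes dark.
    public static int[] invert(int[] gray) {
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) out[i] = 255 - gray[i];
        return out;
    }

    public static void main(String[] args) {
        for (int p : invert(new int[]{0, 255, 100})) System.out.println(p);
    }
}
```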

Identify the boundary of the image

The segmentation path is the shortest path that passes through white pixels without cutting any
character boundary, if possible. The intensity value of each gray-level pixel is weighted by its
y coordinate: between two pixels with the same intensity value, the pixel whose coordinate is
lower is given a higher weight. If the path is required to cut any stroke in the segmented region, it
cuts the stroke that is closest to the bottom. The character contours in the boundary image are
represented by black pixels and weighted by the estimated stroke width. The weight given to
each boundary pixel forces the path to cut the minimum number of strokes. Therefore, the
segmentation path is optimal when it goes through the common stroke, thus separating two
joined characters.
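The weighted shortest-path idea can be sketched with a small dynamic program over a weight grid: white pixels get low weights and stroke pixels high weights, so the cheapest top-to-bottom path cuts as few strokes as possible. The 3-way moves (down, down-left, down-right) and the integer weight grid are assumptions for the example.

```java
public class SegmentationPath {
    // Minimal-cost top-to-bottom path through a weight grid; on each step
    // down the path may stay in the same column or shift one column
    // left or right. Returns the cost of the cheapest complete path.
    public static int minPathCost(int[][] w) {
        int rows = w.length, cols = w[0].length;
        int[] prev = w[0].clone();                 // best cost to reach row 0
        for (int r = 1; r < rows; r++) {
            int[] cur = new int[cols];
            for (int c = 0; c < cols; c++) {
                int best = prev[c];
                if (c > 0) best = Math.min(best, prev[c - 1]);
                if (c < cols - 1) best = Math.min(best, prev[c + 1]);
                cur[c] = w[r][c] + best;           // extend cheapest predecessor
            }
            prev = cur;
        }
        int ans = Integer.MAX_VALUE;
        for (int v : prev) ans = Math.min(ans, v); // best exit column
        return ans;
    }

    public static void main(String[] args) {
        // Cheap "white" corridor down the middle column.
        System.out.println(minPathCost(new int[][]{{5, 1, 5}, {5, 1, 5}, {5, 1, 5}}));
    }
}
```

Backtracking the argmin choices would recover the actual cut path between two joined characters.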

Chapter VIII
Software Description
Introduction

JAVA

The language was initially called “Oak”, but it was renamed “Java” in 1995. The primary
motivation for this language was the need for a platform-independent (i.e., architecture-neutral)
language that could be used to create software to be embedded in various consumer electronic
devices.

 Java is a programmer’s language.

 Java is cohesive and consistent.

 Except for the constraints imposed by the Internet environment, Java gives
the programmer full control.

Finally, Java is to Internet programming what C was to systems programming.

Importance of JAVA to the Internet


Java has had a profound effect on the Internet because it expands the universe of objects that can
move about freely in cyberspace. In a network, two categories of objects are transmitted between
the server and the personal computer: passive information and dynamic, active programs.
Dynamic, self-executing programs cause serious problems in the areas of security and
portability, but Java addresses those concerns and, by doing so, has opened the door to an
exciting new form of program called the applet.

JAVA can be used to Create Two Types of Programs


Application and Applet:

An application is a program that runs on our computer under that computer's operating system,
more or less like one created using C or C++. Java's ability to create applets makes it important.
An applet is an application designed to be transmitted over the Internet and executed by a
Java-compatible web browser. An applet is actually a tiny Java program, dynamically
downloaded across the network, just like an image. But the difference is that it is an intelligent
program, not just a media file: it can react to user input and change dynamically.

Features of JAVA

Security
Every time you download a “normal” program, you are risking a viral infection. Prior to
Java, most users did not download executable programs frequently, and those who did scanned
them for viruses prior to execution. Even so, most users still worried about the possibility of
infecting their systems with a virus. In addition, another type of malicious program must be
guarded against: one that can gather private information, such as credit card numbers, bank
account balances, and passwords. Java answers both of these concerns by providing a “firewall”
between a network application and your computer. When you use a Java-compatible web
browser, you can safely download Java applets without fear of viral infection or malicious
intent.

Portability

For programs to be dynamically downloaded to all the various types of platforms connected to
the Internet, some means of generating portable executable code is needed. As you will see, the
same mechanism that helps ensure security also helps create portability. Indeed, Java’s solution
to these two problems is both elegant and efficient.

The Byte Code

The key that allows Java to solve the security and portability problems is that the output of the
Java compiler is byte code: a highly optimized set of instructions designed to be executed by the
Java run-time system, which is called the Java Virtual Machine (JVM). In its standard form, the
JVM is an interpreter for byte code.

Translating a Java program into byte code makes it much easier to run the program in a wide
variety of environments: once the run-time package exists for a given system, any Java program
can run on it.

Although Java was designed for interpretation, there is technically nothing about Java that
prevents on-the-fly compilation of byte code into native code. Sun completed its Just In
Time (JIT) compiler for byte code; when the JIT compiler is part of the JVM, it compiles byte
code into executable code in real time, on a piece-by-piece, demand basis. It is not possible to
compile an entire Java program into executable code all at once, because Java performs various
run-time checks that can be done only at run time. Instead, the JIT compiles code as it is needed,
during execution.

JAVA Virtual Machine (JVM)

Beyond the language, there is the Java Virtual Machine, an important element of the Java
technology. The virtual machine can be embedded within a web browser or an operating system.
Once a piece of Java code is loaded onto a machine, it is verified: as part of the loading process,
a class loader is invoked and performs byte code verification, which makes sure that the code
generated by the compiler will not corrupt the machine it is loaded on. Byte code verification
also takes place at the end of the compilation process to make sure everything is accurate and
correct. Byte code verification is thus integral to the compiling and executing of Java code.

JAVA Architecture

Java architecture provides a portable, robust, high-performance environment for development.
Java provides portability by compiling to byte codes for the Java Virtual Machine, which are
then interpreted on each platform by the run-time environment. Java is a dynamic system, able to
load code when needed from a machine in the same room or across the planet.

Compilation of Code

When you compile the code, the Java compiler creates machine code (called byte code) for a
hypothetical machine called Java Virtual Machine (JVM). The JVM is supposed to execute the
byte code. The JVM is created for overcoming the issue of portability. The code is written and
compiled for one machine and interpreted on all machines. This machine is called Java Virtual
Machine.

JAVA Platform

One design goal of Java is portability, which means that programs written for the Java platform
must run similarly on any combination of hardware and operating system with adequate runtime
support. This is achieved by compiling the Java language code to an intermediate representation
called Java byte code, instead of directly to architecture-specific machine code. Java byte code
instructions are analogous to machine code, but they are intended to be executed by a virtual
machine (VM) written specifically for the host hardware. End users commonly use a Java
Runtime Environment (JRE) installed on their own machine for standalone Java applications, or
in a web browser for Java applets.

Standard libraries provide a generic way to access host-specific features such as graphics,
threading, and networking.

The use of universal byte code makes porting simple. However, the overhead of interpreting byte
code into machine instructions makes interpreted programs almost always run more slowly than
native executables. For this reason, just-in-time (JIT) compilers that compile byte codes to
machine code during runtime were introduced from an early stage. Java itself is
platform-independent and is adapted to the particular platform it runs on by a Java Virtual
Machine, which translates the Java byte code into the platform's machine language.

Implementations

The Oracle implementation is packaged into two different distributions: the Java Runtime
Environment (JRE), which contains the parts of the Java SE platform required to run Java
programs and is intended for end users, and the Java Development Kit (JDK), which is intended
for software developers and includes development tools such as the Java compiler, Javadoc, jar,
and a debugger.

OpenJDK is another notable Java SE implementation that is licensed under the GNU GPL. The
implementation started when Sun began releasing the Java source code under the GPL. As of
Java SE 7, OpenJDK is the official Java reference implementation.

The goal of Java is to make all implementations of Java compatible. Historically, Sun's
trademark license for use of the Java brand has insisted that all implementations be "compatible".
Platform-independent Java is essential to Java EE, and an even more rigorous validation is
required to certify an implementation. This environment enables portable server-side
applications.

Performance

Programs written in Java have a reputation for being slower and requiring more memory than
those written in C++.

Some platforms offer direct hardware support for Java; there are microcontrollers that can run
Java in hardware instead of a software Java Virtual Machine, and ARM-based processors can
have hardware support for executing Java byte code through their Jazelle option (though this
support is mostly dropped in current implementations of ARM).

Automatic memory management

Java uses an automatic garbage collector to manage memory in the object lifecycle. The
programmer determines when objects are created, and the Java runtime is responsible for
recovering the memory once objects are no longer in use. Once no references to an object
remain, the unreachable memory becomes eligible to be freed automatically by the garbage
collector. Something similar to a memory leak may still occur if a programmer's code holds a
reference to an object that is no longer needed, typically when objects that are no longer needed
are stored in containers that are still in use. If methods for a nonexistent object are called, a "null
pointer exception" is thrown.
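The two behaviours described above (an unreachable object becoming eligible for collection, and a "null pointer exception" on a null dereference) can be sketched as below; the class and method names are illustrative only.

```java
public class MemoryDemo {
    // After sb is set to null, no references to the StringBuilder remain,
    // so the garbage collector may reclaim it at any time. Calling a method
    // through the null reference then throws NullPointerException.
    public static boolean triggersNpe() {
        StringBuilder sb = new StringBuilder("data");
        sb = null;               // object is now unreachable: eligible for GC
        try {
            sb.length();         // dereferencing a null reference
            return false;
        } catch (NullPointerException e) {
            return true;         // the runtime throws, as the text describes
        }
    }

    public static void main(String[] args) {
        System.out.println("NullPointerException thrown: " + triggersNpe());
    }
}
```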

One of the ideas behind Java's automatic memory management model is that programmers can
be spared the burden of having to perform manual memory management. In some languages,
memory for the creation of objects is implicitly allocated on the stack, or explicitly allocated and
de-allocated from the heap. In the latter case the responsibility of managing memory resides with
the programmer. If the program does not de-allocate an object, a memory leak occurs. If the
program attempts to access or de-allocate memory that has already been de-allocated, the result
is undefined and difficult to predict, and the program is likely to become unstable and/or crash.
This can be partially remedied by the use of smart pointers, but these add overhead and
complexity. Note that garbage collection does not prevent "logical" memory leaks, i.e., those
where the memory is still referenced but never used.

Garbage collection may happen at any time. Ideally, it will occur when a program is idle. It is
guaranteed to be triggered if there is insufficient free memory on the heap to allocate a new
object; this can cause a program to stall momentarily. Explicit memory management is not
possible in Java.

Java does not support C/C++ style pointer arithmetic, where object addresses and unsigned
integers (usually long integers) can be used interchangeably. This allows the garbage collector to
relocate referenced objects and ensures type safety and security.

As in C++ and some other object-oriented languages, variables of Java's primitive data types are
either stored directly in fields (for objects) or on the stack (for methods) rather than on the heap,
as is commonly true for non-primitive data types (but see escape analysis). This was a conscious
decision by Java's designers for performance reasons.

MYSQL Description

Definition - What does MySQL mean?

MySQL is a full-featured relational database management system (RDBMS) that competes with
the likes of Oracle Database and Microsoft’s SQL Server. MySQL was originally developed by
the Swedish company MySQL AB and is now owned by Oracle Corp. The MySQL source code
remains freely available because it is released as open source software. MySQL is written in C
and C++ and is compatible with all major operating systems.

MySQL is an open source relational database management system (RDBMS) based on
Structured Query Language (SQL).

MySQL runs on virtually all platforms, including Linux, UNIX, and Windows. Although it can
be used in a wide range of applications, MySQL is most often associated with web-based
applications and online publishing and is an important component of an open source enterprise
stack called LAMP. LAMP is a Web development platform that uses Linux as the operating
system, Apache as the Web server, MySQL as the relational database management system and
PHP as the object-oriented scripting language. (Sometimes Perl or Python is used instead of
PHP.)

MySQL

MySQL is very popular for Web-hosting applications because of its plethora of Web-optimized
features like HTML data types, and because it's available for free. It is part of the Linux, Apache,
MySQL, PHP (LAMP) architecture, a combination of platforms that is frequently used to deliver
and support advanced Web applications. MySQL runs the back-end databases of some famous
websites, including Wikipedia, Google and Facebook, a testament to its stability and robustness.
Although MySQL is technically considered a competitor of Oracle Database, Oracle Database is
mainly used by large enterprises, while MySQL is used by smaller, more Web-oriented
databases. In addition, MySQL differs from Oracle's product because it is open source rather
than proprietary.

Features of MySQL

Scalability and Flexibility

The MySQL database server provides exceptional scalability, with the capacity to handle
deeply embedded applications with a footprint of only 1MB as well as massive data warehouses
holding terabytes of information. Platform flexibility is a stalwart feature of MySQL, with all
flavors of Linux, UNIX, and Windows being supported. And, of course, the open source nature
of MySQL allows complete customization for those wanting to add unique requirements to the
database server.

High Performance

A unique storage-engine architecture allows database professionals to configure the MySQL
database server specifically for particular applications, with the end result being outstanding
performance. Whether the intended application is a high-speed transactional processing
system or a high-volume web site that services a billion queries a day, MySQL can meet the
most demanding performance expectations of any system. With high-speed load utilities,
distinctive memory caches, full text indexes, and other performance-enhancing mechanisms,
MySQL offers all the right ammunition for today's critical business systems.

High Availability

Rock-solid reliability and constant availability are hallmarks of MySQL, with customers relying
on MySQL to guarantee around-the-clock uptime. MySQL offers a variety of high-availability
options from high-speed master/slave replication configurations, to specialized Cluster servers
offering instant failover, to third party vendors offering unique high-availability solutions for the
MySQL database server.

Robust Transactional Support

MySQL offers one of the most powerful transactional database engines on the market. Features
include complete ACID (atomic, consistent, isolated, durable) transaction support, unlimited
row-level locking, distributed transaction capability, and multi-version transaction support where
readers never block writers and vice-versa. Full data integrity is also assured through server-
enforced referential integrity, specialized transaction isolation levels, and instant deadlock
detection.

Web and Data Warehouse Strengths

MySQL is the de-facto standard for high-traffic web sites because of its high-performance
query engine, tremendously fast data insert capability, and strong support for specialized web
functions like fast full text searches. These same strengths also apply to data warehousing
environments where MySQL scales up into the terabyte range for either single servers or scale-
out architectures. Other features like main memory tables, B-tree and hash indexes, and
compressed archive tables that reduce storage requirements by up to eighty percent make
MySQL a strong standout for both web and business intelligence applications.

Strong Data Protection

Because guarding the data assets of corporations is the number one job of database professionals,
MySQL offers exceptional security features that ensure strong data protection. In terms of
database authentication, MySQL provides powerful mechanisms for ensuring only authorized
users have entry to the database server, with the ability to block users down to the client
machine level. SSH and SSL support are also provided to ensure safe and secure
connections. A granular object privilege framework is present so that users only see the data they
should, and powerful data encryption and decryption functions ensure that sensitive data is
protected from unauthorized viewing. Finally, backup and recovery utilities provided through
MySQL and third party software vendors allow for complete logical and physical backup as well
as full and point-in-time recovery.

Comprehensive Application Development

One of the reasons MySQL is the world's most popular open source database is that it provides
comprehensive support for every application development need. Within the database, support
can be found for stored procedures, triggers, functions, views, cursors, ANSI-standard SQL, and
more. For embedded applications, plug-in libraries are available to embed MySQL database
support into nearly any application. MySQL also provides connectors and drivers (ODBC,
JDBC, etc.) that allow all forms of applications to make use of MySQL as a preferred data
management server. It doesn't matter if it's PHP, Perl, Java, Visual Basic, or .NET, MySQL
offers application developers everything they need to be successful in building database-driven
information systems.

Management Ease

MySQL offers exceptional quick-start capability with the average time from software download
to installation completion being less than fifteen minutes. This rule holds true whether the
platform is Microsoft Windows, Linux, Macintosh, or UNIX. Once installed, self-management
features like automatic space expansion, auto-restart, and dynamic configuration changes take
much of the burden off already overworked database administrators. MySQL also provides a
complete suite of graphical management and migration tools that allow a DBA to manage,
troubleshoot, and control the operation of many MySQL servers from a single workstation.
Many third party software vendor tools are also available for MySQL that handle tasks ranging
from data design and ETL, to complete database administration, job management, and
performance monitoring.

Open Source Freedom and 24 x 7 Support

Many corporations are hesitant to fully commit to open source software because they believe
they can't get the type of support or professional service safety nets they currently rely on with
proprietary software to ensure the overall success of their key applications. The questions of
indemnification come up often as well. These worries can be put to rest with MySQL as
complete around-the-clock support as well as indemnification is available through MySQL
Enterprise. MySQL is not a typical open source project, as all the software is owned and
supported by Oracle; because of this, a cost and support model is available that provides a
unique combination of open source freedom and trusted software with support.

Lowest Total Cost of Ownership

By migrating current database-driven applications to MySQL, or using MySQL for new
development projects, corporations are realizing cost savings that many times stretch into seven
figures. Accomplished through the use of the MySQL database server and scale-out architectures
that utilize low-cost commodity hardware, corporations are finding that they can achieve
amazing levels of scalability and performance, all at a cost that is far less than those offered by
proprietary and scale-up software vendors. In addition, the reliability and easy maintainability of
MySQL means that database administrators don't waste time troubleshooting performance or
downtime issues, but instead can concentrate on making a positive impact on higher level tasks
that involve the business side of data.

Chapter IX

System Testing

Software Testing

Software testing is an investigation conducted to provide stakeholders with information about
the quality of the product or service under test. Software testing can also provide an objective,
independent view of the software to allow the business to appreciate and understand the risks of
software implementation. Test techniques include, but are not limited to, the process of executing
a program or application with the intent of finding software bugs (errors or other defects). The
purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of tests, and
each test type addresses a specific testing requirement.

Software testing is the process of evaluating a software item to detect differences between given
input and expected output, and to assess the features of the software item. Testing assesses the
quality of the product. Software testing is a process that should be carried out during the
development process. In other words, software testing is a verification and validation process.

Types of testing

There are different levels during the process of testing. Levels of testing include the different
methodologies that can be used while conducting software testing. Following are the main
levels of software testing:

 Functional Testing.

 Non-Functional Testing.

Steps Description

I. The determination of the functionality that the intended application is meant to perform.

II. The creation of test data based on the specifications of the application.

III. The output based on the test data and the specifications of the application.

IV. The writing of test scenarios and the execution of test cases.

V. The comparison of actual and expected results based on the executed test cases.

Functional Testing

Functional Testing of the software is conducted on a complete, integrated system to evaluate the
system's compliance with its specified requirements. There are five steps that are involved when
testing an application for functionality.

An effective testing practice will see the above steps applied to the testing policies of every
organization and hence it will make sure that the organization maintains the strictest of standards
when it comes to software quality.

Unit Testing

This type of testing is performed by the developers before the setup is handed over to the testing
team to formally execute the test cases. Unit testing is performed by the respective developers on
the individual units of source code in their assigned areas. The developers use test data that is separate
from the test data of the quality assurance team. The goal of unit testing is to isolate each part of
the program and show that individual parts are correct in terms of requirements and
functionality.
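A unit test in this spirit can be sketched in plain Java (hypothetical unit under test and check helper of our own; no test framework such as JUnit is assumed):

```java
// Sketch: one unit isolated and checked against its requirements.
public class PriceCalculatorTest {
    // Unit under test: applies a percentage discount to a price.
    static double applyDiscount(double price, double percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent out of range");
        }
        return price * (100 - percent) / 100;
    }

    // Minimal stand-in for a framework's assertEquals.
    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        // Each check exercises one requirement of the unit in isolation.
        check(applyDiscount(200.0, 10.0) == 180.0, "10% off 200 should be 180");
        check(applyDiscount(50.0, 0.0) == 50.0, "0% discount leaves the price unchanged");
        boolean rejected = false;
        try {
            applyDiscount(50.0, 150.0);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        check(rejected, "out-of-range percent must be rejected");
        System.out.println("all unit checks passed");
    }
}
```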

Limitations of Unit Testing

Testing cannot catch each and every bug in an application. It is impossible to evaluate every
execution path in every software application. The same is the case with unit testing.

There is a limit to the number of scenarios and test data that the developer can use to verify the
source code. After exhausting all options, there is no choice but to stop unit testing and
merge the code segment with other units.

Integration Testing

The testing of combined parts of an application to determine whether they function correctly
together is integration testing. There are two methods of doing integration testing: bottom-up
integration testing and top-down integration testing.

S.N. Integration Testing Method

1. Bottom-up integration
This testing begins with unit testing, followed by tests of progressively higher-
level combinations of units called modules or builds.

2. Top-down integration
In this testing, the highest-level modules are tested first and progressively lower-
level modules are tested after that.

In a comprehensive software development environment, bottom-up testing is usually done first,
followed by top-down testing. The process concludes with multiple tests of the complete
application, preferably in scenarios designed to mimic those it will encounter in customers'
computers, systems and networks.

System Testing

This is the next level in the testing and tests the system as a whole. Once all the components are
integrated, the application as a whole is tested rigorously to see that it meets Quality Standards.
This type of testing is performed by a specialized testing team. System testing is so important
because of the following reasons:

 System Testing is the first level of testing where the application is tested as a whole.

 The application is tested thoroughly to verify that it meets the functional and technical
specifications.
 The application is tested in an environment which is very close to the production
environment where the application will be deployed.

 System Testing enables us to test, verify and validate both the business requirements as
well as the Applications Architecture.

Regression Testing

Whenever a change in a software application is made, it is quite possible that other areas within
the application have been affected by this change. Verifying that a fixed bug has not resulted in
another functionality or business rule violation is regression testing. The intent of regression
testing is to ensure that a change, such as a bug fix, does not result in another fault being
uncovered in the application. Regression testing is so important because of the following reasons:

 Minimize the gaps in testing when an application with changes made has to be tested.

 Testing the new changes to verify that the change made did not affect any other area of
the application.

 Mitigates Risks when regression testing is performed on the application.

 Test coverage is increased without compromising timelines.

 Increase speed to market the product.

Acceptance Testing

This is arguably the most important type of testing, as it is conducted by the Quality Assurance
Team, who will gauge whether the application meets the intended specifications and satisfies the
client's requirements. The QA team will have a set of pre-written scenarios and test cases that
will be used to test the application.

More ideas will be shared about the application and more tests can be performed on it to gauge
its accuracy and the reasons why the project was initiated. Acceptance tests are not only intended
to point out simple spelling mistakes, cosmetic errors or interface gaps, but also to point out any
bugs in the application that will result in system crashes or major errors in the application.

By performing acceptance tests on an application the testing team will deduce how the
application will perform in production. There are also legal and contractual requirements for
acceptance of the system.

Alpha Testing

This test is the first stage of testing and will be performed amongst the teams (developer and QA
teams). Unit testing, integration testing and system testing when combined are known as alpha
testing. During this phase, the following will be tested in the application:

 Spelling Mistakes

 Broken Links

 Unclear Directions

 The Application will be tested on machines with the lowest specification to test loading
times and any latency problems.

Beta Testing

This test is performed after Alpha testing has been successfully performed. In beta testing a
sample of the intended audience tests the application. Beta testing is also known as pre-release
testing. Beta test versions of software are ideally distributed to a wide audience on the Web,
partly to give the program a "real-world" test and partly to provide a preview of the next release.
In this phase the audience will be testing the following:

 Users will install, run the application and send their feedback to the project team.

 Typographical errors, confusing application flow, and even crashes.

 Getting the feedback, the project team can fix the problems before releasing the software
to the actual users.

 The more issues you fix that solve real user problems, the higher the quality of your
application will be.

 Having a higher-quality application when you release to the general public will increase
customer satisfaction.

Non-Functional Testing

This section is based upon testing the application through its non-functional attributes. Non-
functional testing of software involves testing the software against requirements which are
non-functional in nature but equally important, such as performance, security, and user
interface. Some of the important and commonly used non-functional testing types are
mentioned as follows:

Performance Testing

It is mostly used to identify any bottlenecks or performance issues rather than finding bugs in the
software. There are different causes that contribute to lowering the performance of software:

 Network delay.

 Client side processing.

 Database transaction processing.

 Load balancing between servers.

 Data rendering.

Performance testing is considered one of the important and mandatory testing types in terms of
the following aspects:

 Speed (i.e. Response Time, data rendering and accessing)

 Capacity

 Stability

 Scalability

It can be either qualitative or quantitative testing activity and can be divided into different sub
types such as Load testing and Stress testing.

Load Testing

Load testing is the process of testing the behavior of the software by applying maximum load in
terms of the software accessing and manipulating large input data. It can be done at both normal
and peak load conditions. This type of testing identifies the maximum capacity of the software
and its behavior at peak time. Most of the time, load testing is performed with the help of
automated tools such as LoadRunner, AppLoader, IBM Rational Performance Tester, Apache
JMeter, Silk Performer, Visual Studio Load Test, etc. Virtual users (VUsers) are defined in the
automated testing tool, and the script is executed to verify the load testing for the software. The
number of users can be increased or decreased concurrently or incrementally based upon the
requirements.
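As a rough illustration (not a substitute for dedicated tools such as Apache JMeter), the core idea of observing response time as the simulated request count grows can be sketched in plain Java with a hypothetical workload method:

```java
// Sketch: measure average per-request time as the number of requests grows.
public class LoadSketch {
    // Hypothetical stand-in for the operation under test.
    static long operation(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    // Average time in nanoseconds over `requests` invocations.
    static double averageNanos(int requests) {
        long start = System.nanoTime();
        for (int r = 0; r < requests; r++) operation(10_000);
        return (System.nanoTime() - start) / (double) requests;
    }

    public static void main(String[] args) {
        // Increase the simulated load and watch how average latency behaves.
        for (int requests : new int[] {100, 1_000, 10_000}) {
            System.out.printf("%6d requests: avg %.0f ns%n", requests, averageNanos(requests));
        }
    }
}
```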

Stress Testing

This testing type includes the testing of software behavior under abnormal conditions. Taking
away the resources or applying load beyond the actual load limit is stress testing.

The main intent is to test the Software by applying the load to the system and taking over the
resources used by the Software to identify the breaking point. This testing can be performed by
testing different scenarios such as:

 Shutdown or restart of Network ports randomly.

 Turning the database on or off.

 Running different processes that consume resources such as CPU, Memory, server etc.

Usability Testing

This section includes different concepts and definitions of Usability testing from Software point
of view. It is a black box technique and is used to identify any error(s) and improvements in the
Software by observing the users through their usage and operation.

According to Nielsen, usability can be defined in terms of five factors: efficiency of use,
learnability, memorability, errors/safety, and satisfaction. According to him, the usability of the
product will be good and the system will be usable if it possesses the above factors.

Nigel Bevan and Macleod considered that Usability is the quality requirement which can be
measured as the outcome of interactions with a computer system. This requirement can be
fulfilled and the end user will be satisfied if the intended goals are achieved effectively with the
use of proper resources.

Molich in 2000 stated that a user-friendly system should fulfill the following five goals: easy to
learn, easy to remember, efficient to use, satisfactory to use and easy to understand.

In addition to different definitions of usability, there are some standards, quality models and
methods which define usability in the form of attributes and sub-attributes, such as ISO 9126,
ISO 9241-11, ISO 13407 and IEEE Std 610.12.

UI vs. Usability Testing

UI testing involves the testing of the Graphical User Interface of the software. This testing
ensures that the GUI conforms to requirements in terms of color, alignment, size and other
properties.

On the other hand Usability testing ensures that a good and user friendly GUI is designed and is
easy to use for the end user. UI testing can be considered as a sub part of Usability testing.

Security Testing

Security testing involves the testing of software in order to identify any flaws and gaps from a
security and vulnerability point of view. Following are the main aspects which security testing
should ensure:

 Confidentiality.

 Integrity.

 Authentication.

 Availability.

 Authorization.

 Non-repudiation.

Portability Testing

Portability testing includes the testing of software with the intent that it should be reusable and
can be moved to another environment as well. Following are the strategies that can be used for
portability testing:

 Transferring installed software from one computer to another.

 Building an executable (.exe) to run the software on different platforms.

Portability testing can be considered as one of the sub parts of System testing, as this testing type
includes the overall testing of Software with respect to its usage over different environments.

Chapter X

Conclusion

This paper presented a method for online handwritten English cursive word recognition using a
segmentation-free MRF model. We restricted the search paths from the trie lexicon of words and
preceding paths, as well as the lengths of feature points during path search, by the character-
synchronous beam search strategy. We combined it with a P2DBMN-MQDF recognizer, which is
widely used for Chinese and Japanese character recognition.

Future Work

Experimental results demonstrate a significant improvement in recognition accuracy by
combining our segmentation-free MRF model with a P2DBMN-MQDF recognizer, and by
restricting the searched paths considering the length range. We have already shown that
combining the MRF model with P2DBMN-MQDF achieves a very high recognition rate for
online handwritten Japanese text, and we have shown that this combination is also very effective
for online English word recognition. In the future, the work can be extended to a larger database
and to other scripts and writing styles.

Reference

[1] S. Jaeger, S. Manke, J. Reichert and A. Waibel, “Online handwriting recognition: the
Npen++ recognizer,” IJDAR,3(1), pp.69-180, 2001.

[2] M. Liwicki and H. Bunke, “HMM-based on-line recognition of handwritten whiteboard
notes,” Proc. 10th IWFHR, pp. 595-599, 2006.

[3] M. Cheriet, N. Kharma, C.-L. Liu and C. Y. Suen, “Character recognition systems: A Guide
for Students and Practitioners,” John Wiley & Sons, Inc., Hoboken, New Jersey, 2007.

[4] S. Z. Li, Markov random field modeling in image analysis, Springer, Tokyo, 2001.

[5] J. Zeng and Z.-Q. Liu, “Markov random field-based statistical character structure modeling
for handwritten Chinese character recognition,” IEEE Trans. PAMI, 30(5), 2008.

[6] B. Zhu and M. Nakagawa, “On-line handwritten Japanese character recognition using a
MRF model with parameter optimization by CRF,” Proc. 11th ICDAR, pp. 603-607, 2011.

[7] G. Seni, “Large vocabulary recognition of on-line handwritten cursive words,” PhD thesis,
Department of Computer Science of the State University of New York at Buffalo, N.Y., USA,
1995.

[8] G. Kim and V. Govindaraju, “A lexicon driven approach to handwritten word recognition for
real-time applications,” IEEE Trans. PAMI, 19(4), pp. 366-379, 1997.

[9] C.-L. Liu, M. Koga and H. Fujisawa, “Lexicon-driven segmentation and recognition of
handwritten character strings for Japanese address reading,” IEEE Trans. PAMI, 24(11), pp.
1425-1437, 2002.

[10] M. Liwicki, A. Graves, S. Fernández, H. Bunke, and J. Schmidhuber, “A novel approach
to on-line handwriting recognition based on bidirectional long short-term memory networks,”
Proc. 9th ICDAR, pp. 367-371, 2007.

[11] C.-L. Liu and X.-D. Zhou, “Online Japanese character recognition using trajectory-based
normalization and direction feature extraction,” Proc. 10th IWFHR, pp.217-222, 2006.

[12] C.-L. Liu and K. Marukawa, “Pseudo two-dimensional shape normalization methods for
handwritten Chinese character recognition,” Pattern Recognition, 38(12), pp. 2242-2255, 2005.

[13] F. Kimura, “Modified quadratic discriminant function and the application to Chinese
characters,” IEEE Trans.PAMI, 9 (1), pp.149-153, 1987.

[14] U. Ramer, “An iterative procedure for the polygonal approximation of plane curves,”
Computer Graphics and Image Processing, 1(3), pp. 244-256, 1972.

[15] I. Guyon, L. Schomaker, R. Plamondon, M. Liberman, and S. Janet, “Unipen project of on-
line data exchange and recognizer benchmarks,” Proc. 12th ICPR, pp. 29-33, 1994.

[16] D.-H. Wang, C.-L. Liu, and X.-D. Zhou, “An approach for real-time recognition of online
Chinese handwritten sentences,” Pattern Recognition, 45, pp. 3661-3675, 2012.

[17] B. Zhu, J. Gao and M. Nakagawa, “Objective function design for MCE-based combination
of on-line and off-line character recognizers for online handwritten Japanese text recognition,”
Proc. 11th ICDAR, pp.594-599, 2011.
