
Major Project Report

On

Real Time Form Filling

Department of Computer Science & Engineering Technology


National Institute of Technology Karnataka, Surathkal.

Date: 31/10/2017

Group Members:
Name                   Roll No   Email
Amit Anand             14CO102   Amitanand2307@gmail.com
Pashupati Nath Verma   14CO130   Verma2471995@gmail.com
Ravindra K Jonwal      14CO135   Ravindra.jonwal007@gmail.com

TABLE OF CONTENTS

1. Abstract

2. Introduction

3. Implementation

3.1 Implementation of proposed model

3.2 Algorithm (Speech to text and text to speech)

3.3 Box Detection Algorithm (Contour Approximation Algorithm)

4. Result

5. Conclusion and Future Work

6. References
1. Abstract

In this major project we propose a framework model for real-time form filling. The framework
aims to reduce the time a user spends filling out a form by collecting the values of the form's
fields from the user and using this information to fill the form accurately. Filling out forms
for different services is a very difficult task for users such as illiterate and visually
impaired persons. In this project we implemented a solution for real-time form filling.
Existing auto-filling approaches do not exploit all of the available resources that could
facilitate the auto-filling task. We propose an intelligent auto-filling framework which
propagates the user's inputs across several processing stages: the form is captured as an
image, the image is converted to text, and a text-to-speech API reads the text of the form
(the questions to be filled) aloud in human-understandable language. The user then answers
the questions by voice; the answers are recognized as text and automatically filled in at the
appropriate place in the soft copy of the form, which can be saved in PDF or text format.

2. Introduction

There are many application forms that users are required to fill in manually. Users differ in
language and literacy: some can speak a language but cannot write it, so filling in a form is
a problem for them. There should be a platform that removes this problem. The problem we
address here is filling in application forms automatically through speech. In fact, a lot of
data can be gathered from a person filling in a form. However, no platform to date provides a
facility for filling paper forms through speech. Our project therefore aims at filling the
form through text-to-speech and speech-to-text conversion. Forms contain different kinds of
inputs, such as checkboxes, boxes and pictures, which still need to be supported to make the
approach suitable for general forms. For now we are trying to make our model work on specific
forms which have specific fields to be filled. Our approach consists of capturing a photo of
the form using a high-quality camera; the photo is then processed by OCR (Optical Character
Recognition). To optimize it further, a digit recognizer was also developed to recognize date
of birth, age and other numeric parameters efficiently. The extracted text is then converted
to speech so that users can understand what is in the form and respond accordingly; each
response is filled into the corresponding field of the form.
3. Implementation
We have built an application which takes an image of the form and converts it to text format.
It then asks the questions in the form in human-understandable language (English), listens to
the answers spoken by the user, converts them to text and prints the form with the questions
along with the answers. Our current application works only with the English language.
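
The flow described above can be sketched as a small driver in which the OCR, speaking and
listening steps are passed in as functions. This is only an illustrative sketch, not the
project's actual code: the names extract_field_labels, fill_form and render_form are
hypothetical, and the rule that a field label is a line ending with a colon is an assumption
made for the example.

```python
def extract_field_labels(ocr_text):
    """Pick out the questions of the form from OCR output.

    Assumption for this sketch: a field label is any non-empty line
    that ends with a colon, e.g. "Name:".
    """
    labels = []
    for line in ocr_text.splitlines():
        line = line.strip()
        if len(line) > 1 and line.endswith(":"):
            labels.append(line[:-1].strip())
    return labels


def fill_form(field_labels, speak, listen):
    """Ask each question aloud and collect the spoken answer as text.

    speak(text) and listen() are supplied by the caller, so any
    text-to-speech and speech-to-text backend can be plugged in.
    """
    answers = []
    for label in field_labels:
        speak("Please tell me your " + label)
        answers.append((label, listen()))
    return answers


def render_form(answers):
    """Print-ready text: every question followed by its answer."""
    return "\n".join("{}: {}".format(label, answer) for label, answer in answers)
```

Passing the speech functions as arguments keeps the driver independent of any particular
library and makes it possible to try the flow with typed input instead of a microphone.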

3.1 Implementation of proposed model


We first tried implementing the model from scratch and successfully coded an image-to-digit
converter using a neural network with 784 input neurons, 100 hidden-layer neurons and 10
output neurons. However, it reached a maximum accuracy of 96.42% on the MNIST data set, so,
keeping in mind the time frame and complexity of our project, we decided to use Python
libraries. For image-to-text conversion we used the pytesseract library, for text-to-speech
conversion the pyttsx library, and for speech-to-text the speech recognition library. To
detect the boxes and spaces where the text should actually be filled in, we used the OpenCV
library (cv2) in Python with the help of the Contour Approximation algorithm.
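
A minimal sketch of how the three libraries named above might be called is given below. It is
not the project's code. The function names image_to_text, speak and listen are hypothetical,
the use of Pillow to load the image and of the Google recognizer backend are assumptions, and
the imports are placed inside the functions so that each step can be tried on its own.

```python
def image_to_text(image_path):
    """OCR step: read the photographed form and return its text."""
    from PIL import Image          # Pillow, assumed here for image loading
    import pytesseract
    return pytesseract.image_to_string(Image.open(image_path), lang="eng")


def speak(text):
    """Text-to-speech step: read one question of the form aloud."""
    import pyttsx
    engine = pyttsx.init()
    engine.say(text)
    engine.runAndWait()            # blocks until the sentence has been spoken


def listen():
    """Speech-to-text step: record one answer and return it as text."""
    import speech_recognition as sr
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        # Assumption: the online Google recognizer; other backends exist.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""                  # the speech could not be understood
```

Returning an empty string for unintelligible speech lets the caller decide whether to repeat
the question.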

3.2 Algorithm (Speech to text and text to speech)


3.3 Box Detection Algorithm (Contour Approximation Algorithm)

Contour approximation approximates a contour shape by another shape with fewer vertices,
depending on the precision we specify. It is an implementation of the Douglas-Peucker
algorithm. The Ramer-Douglas-Peucker algorithm, also known as the Douglas-Peucker algorithm
and the iterative end-point fit algorithm, takes a curve composed of line segments (also
called a polyline in some contexts) and finds a similar curve with fewer points. The
algorithm defines 'dissimilar' based on the maximum distance between the original curve and
the simplified curve (i.e., the Hausdorff distance between the curves). The simplified curve
consists of a subset of the points that defined the original curve.

The starting curve is an ordered set of points or lines, and the distance dimension is ε > 0.
The algorithm recursively divides the line. Initially it is given all the points between the
first and last point. It automatically marks the first and last point to be kept. It then
finds the point that is furthest from the line segment with the first and last points as end
points; this point is the furthest on the curve from the approximating line segment between
the end points. If the point is closer than ε to the line segment, then any points not
currently marked to be kept can be discarded without the simplified curve being worse than ε.
If the point furthest from the line segment is further than ε from the approximation, then
that point must be kept. The algorithm recursively calls itself with the first point and the
furthest point, and then with the furthest point and the last point, which includes the
furthest point being marked as kept.
When the recursion is completed, a new output curve can be generated consisting of all and
only those points that have been marked as kept.
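
The recursion described above can be written out in a few lines of plain Python. The sketch
below is an illustration of the Douglas-Peucker procedure, not the OpenCV implementation the
project relies on. The function names are chosen for this example, and points are assumed to
be (x, y) tuples.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the straight line through start and end."""
    (px, py), (sx, sy), (ex, ey) = point, start, end
    dx, dy = ex - sx, ey - sy
    if dx == 0 and dy == 0:
        # start and end coincide: fall back to point-to-point distance
        return math.hypot(px - sx, py - sy)
    return abs(dy * px - dx * py + ex * sy - ey * sx) / math.hypot(dx, dy)


def douglas_peucker(points, epsilon):
    """Return the subset of points kept for tolerance epsilon > 0."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point furthest from the line first-last.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], first, last)
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist > epsilon:
        # Keep the furthest point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right   # the split point appears in both halves
    # Every interior point is within epsilon: keep only the end points.
    return [first, last]
```

For example, an L-shaped polyline with collinear intermediate points keeps only its corner
and its two end points when ε is small, and collapses to the two end points when ε is larger
than the distance of the corner from the line joining them.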
Given our approximated contour, we can now detect the shape of boxes such as circles, squares
and rectangles.

    # approx: vertices of the approximated contour
    # ar: aspect ratio of its bounding box, ar = width / float(height)
    if len(approx) == 3:
        shape = "triangle"
    elif len(approx) == 4:
        shape = "square" if 0.95 <= ar <= 1.05 else "rectangle"
    else:
        shape = "circle"
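
The rule above can be combined with OpenCV's contour functions roughly as follows. This is a
sketch under stated assumptions rather than the project's code: classify_shape and
detect_boxes are hypothetical names, the Otsu threshold is one possible way to binarize the
photo, and the precision of 4% of the contour perimeter is an assumed value that would need
tuning for real forms.

```python
def classify_shape(num_vertices, width, height):
    """Apply the vertex-count rule to one approximated contour."""
    if num_vertices == 3:
        return "triangle"
    if num_vertices == 4:
        ar = width / float(height)          # aspect ratio of the bounding box
        return "square" if 0.95 <= ar <= 1.05 else "rectangle"
    return "circle"


def detect_boxes(image_path, precision=0.04):
    """Return (shape, (x, y, w, h)) for every contour found in the image."""
    import cv2                              # imported here so the rule above runs without OpenCV
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Dark ink on light paper becomes white on black for contour finding.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # [-2] selects the contour list in both OpenCV 3.x and 4.x return layouts.
    contours = cv2.findContours(binary, cv2.RETR_LIST,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    boxes = []
    for contour in contours:
        perimeter = cv2.arcLength(contour, True)
        # Contour approximation with epsilon = precision * perimeter.
        approx = cv2.approxPolyDP(contour, precision * perimeter, True)
        x, y, w, h = cv2.boundingRect(approx)
        boxes.append((classify_shape(len(approx), w, h), (x, y, w, h)))
    return boxes
```

The bounding rectangle returned for each detected box gives the position at which a
recognized answer could later be written into the form.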
3.4 Flowchart
4. RESULTS
We have successfully implemented the form-filling-by-speech application for a very limited
domain of forms. Our application works for forms which don't have many boxes or tables and
which consist mostly of text. Right now it works only for the English language. The working
of the whole process is illustrated below:

The figure below shows that we take an image of the form as input (a), which is converted
into a text file (b). The text is then converted to speech using the speech libraries, subject
to several constraints (c). The user then speaks, and that speech is converted into text (d).
Finally we get our output form filled with the details (Fig. 2).

1. Detection of Boxes

[Figure: input form in JPG format and output with detected boxes]


2. Filling Of Application Form

5. CONCLUSION AND FUTURE WORK


In this project we present an application framework which can fill in the details of a form
through speech recognition. We would like to extend our application to work for a wider range
of forms and to let users select the language of their choice for filling the form. Our
current application doesn't work well with boxes and tables in the form; we would like to
improve on that. We will also try to improve the speech recognition ability through machine
learning and deep learning techniques.

6. References

[1] Ying Zou, Iman Keivanloo, Bipin Upadhyaya, An Intelligent Framework for Auto-filling Web
Forms from Different Web Application.
[2] Alnur Ali (Microsoft Corporation, Redmond, WA 98052, alnurali@microsoft.com) and
Christopher Meek (Microsoft Research, Redmond, WA 98052, meek@microsoft.com), Predictive
Models of Form Filling.
[3] Enrico Rukzio, Chie Noda, Alexander De Luca, John Hamard, Fatih Coskun, Automatic Form
Filling on Mobile Devices.
[4] T. Chusho, K. Fujiwara and K. Minamitani, Automatic Filling in a Form by an Agent for
Web Applications, Asia-Pacific Software Engineering Conference 2002, IEEE Computer
Society, pp. 239-247, 2002.
[5] iOpus Internet Macros, http://www.iopus.com
[6] A Probabilistic Approach for Automatically Filling Form-Based Web Interfaces.
[7] Luciano Barbosa, Diamantino Caseiro, Giuseppe Di Fabbrizio, Amanda Stent, SpeechForms:
From Web to Speech and Back.
[8] S. Parthasarathy (AT&T Labs Research, Florham Park, NJ, USA) and A. Moreno-Daniel (Center
for Signal and Image Processing, Georgia Institute of Technology, Atlanta, GA, USA),
Directory Retrieval Using Voice Form-Filling.
