AUTO-TEXT SUMMARIZER

Developed by
Atia Akram [1841-FBAS/BSCS/F09]
Sara Waheed [1842-FBAS/BSCS/F09]

Supervised by
Ms. Maria Ashraf

Department of Computer Science and Software Engineering


Faculty of Basic and Applied Sciences
International Islamic University, Islamabad
2013

Believe in God's Wisdom
&
Believe in His Words
For
No Prayer is Unanswered
&
No Prayer is Unheard

Final Approval

Dated: _________

It is certified that we have read the project report submitted by Atia Akram (1841-FBAS/BSCS/F09) and Sara Waheed (1842-FBAS/BSCS/F09) and it is our judgment that this project is of sufficient standard to warrant its acceptance by the International Islamic University, Islamabad for the BS Degree in Computer Science.
Committee

External Examiner
Ms. Talat Ambreen
Lecturer,
International Islamic University, Islamabad.
____________________

Internal Examiner
Ms. Saleha
Lecturer,
International Islamic University, Islamabad.
____________________

Supervisor
Ms. Maria Ashraf
Lecturer,
International Islamic University, Islamabad.
____________________

DEDICATION
We dedicate our project to
Our Dear parents,
Family,
Respected teachers,
and especially to Our Department of Computer Science
And all those who prayed, supported and encouraged us to achieve our
goals.

A dissertation submitted to the
Department of Computer Science and Software Engineering,
International Islamic University, Islamabad
in partial fulfillment of the requirements
for the award of the degree of
BS in Computer Science

DECLARATION
We hereby declare that this software, neither as a whole nor in part, has been copied from any source. We further declare that we developed this software entirely through our personal efforts, under the sincere guidance of our teachers and supervisor.
No portion of the work presented in this report has been submitted in support of any application for any other degree or qualification of this or any other university or institute of learning.

Atia Akram
(1841-FBAS/BSCS/F09)
Sara Waheed
(1842-FBAS/BSCS/F09)

ACKNOWLEDGEMENT
First of all, praise be to Allah Almighty, whose guidance led us to follow none but the right path and to accomplish the task in limited time. We thank our prestigious institution, IIUI (International Islamic University, Islamabad), which gave us another opportunity to prove ourselves capable of doing such a significant task.
We thank our supervisor, Ms. Maria Ashraf, whose friendly attitude and noteworthy supervision enabled us to work in harmony and in a peaceful environment. Our families always supported us in the hour of need. Thanks also to all our friends and colleagues, whose untiring efforts made our project not a hypothetical but a practical one.

PROJECT IN BRIEF
Project Title: Auto-Text Summarizer

Objective: The main objective of the project is to generate summaries of digital documents.

Undertaken By: Atia Akram, Sara Waheed

Project Idea By: Ms. Maria Ashraf

Supervised By: Ms. Maria Ashraf, Lecturer

Date Started: February 2013

Date Completed: August 2013

Tools Used: Microsoft Visual Studio 2010

System Used: Windows 7, Core i3 CPU, 2.40 GHz processor

ABBREVIATIONS
NLP: Natural Language Processing
LSA: Latent Semantic Analysis
SVD: Singular Value Decomposition
HTML: Hypertext Markup Language
URL: Uniform Resource Locator
DFD: Data Flow Diagram

ABSTRACT
In NLP, auto-text summarization is the process of summarizing large documents by extracting their most important points using a computer's computational power. Producing a summary that retains all the important points of a text document is one of the most critical requirements of an auto-text summarizer.
This system generates summaries of electronic documents while retaining the important points of each document. The Sentence Selection and Cross Method algorithms are used to calculate the priority of sentences and to summarize the text. The system replaces the difficult vocabulary words in a summary with their synonyms by maintaining a dictionary.

Table of Contents

1. Introduction
1.2. Literature Review
1.3. Problem Statement
1.6. Proposed System
1.8. Outcome of Work
1.9. Methodology Used
1.10. Algorithms
1.11. Tools
1.11.1. Microsoft Visual Studio 2010
1.11.2. Extreme Optimization Numerical Libraries for .NET
1.11.3. Microsoft Word Thesaurus
1.12. Language Used
2. Requirement Analysis
2.1. Requirement
2.2. Overall Description of Auto-Text Summarizer
2.2.1. Product Perspective
2.2.1.1. System Interface
2.2.1.2. Hardware Interface
2.2.1.3. Software Interface
2.2.1.4. Operational Environment
2.2.1.5. Environmental Requirements
2.3. Product Functions
2.3.1. Text Extraction
2.3.2. Text Reduction
2.3.3. Replacement of Words with Synonyms
2.5. Supplementary Requirements
2.5.1. System Reliability
2.5.2. System Usability
2.5.3. Efficiency
3. System Design
3.1.1. Input
3.1.2. Text Pre-Processing
3.1.3. Sentence Separator
3.1.4. Replacement of Special Characters with Spaces
3.1.5. List of Words
3.1.6. Stop Words Eliminator
3.1.7. List of Unique Words
3.1.8. Latent Semantic Analysis
3.1.8.1. Input Matrix Creation
3.1.8.2. SVD Value
3.1.8.3. Sentence Selection
3.1.9. Text Summarization
3.1.9.1. Cross Method
3.1.10. Saving Extracted Summary
3.2. Data Flow Diagram
3.2.1. Process
3.2.2. Context Level DFD for Auto-Text Summarizer
3.2.3. Level 1 DFD for Auto-Text Summarizer
3.2.4. Level 2 DFD for Auto-Text Summarizer
3.2.5. Level 2 DFD for Auto-Text Summarizer
3.3. Data Flow Diagram for Auto-Text Summarizer
3.3.1. Flow Chart Components
3.3.3. Calculate Sentence Priority
3.3.4. Sentence Selection
3.3.5. Word Replacement
4. Implementation
4.2. Extreme Optimization Numerical Libraries for .NET
5. Testing
5.1. Testing of Plan
5.2. Test Case 01
5.3. Test Case 02
5.4. Test Case 03
5.5. Test Case 04
5.6. Test Case 05
5.7. Test Case 06
5.8. Test Case 07
5.9. Test Case 08
5.10. Test Case 09
6. Conclusion
6.1. Good Features
6.2. Future Enhancements
User Manual
Working of System
Path
Export Summary
References

Table of Tables
Table 1.1. Problem of Unavailability of Summarizers with Word Replacement Feature
Table 3.1. Product Positioning Statement
Table 5.1. Test Case for Browse Word File
Table 5.2. Test Case for Extract Text from Word File
Table 5.3. Test Case for URL Input
Table 5.4. Test Case for Extract Text from URL
Table 5.5. Test Case for Reading of User Text
Table 5.6. Test Case for Error Message
Table 5.7. Test Case for Error Message
Table 5.8. Test Case for Summary Generation
Table 5.9. Test Case for Synonym Replacement

Table of Figures
Figure 3.1. Example of Cross Method Approach
Figure 3.2. Context DFD for Auto-Text Summarizer
Figure 3.3. Level 1 DFD for Summary Generation and Synonym Replacement
Figure 3.4. Level 2 DFD of Internal Working
Figure 3.5. DFD for Synonym Replacement
Figure 3.6. Flow Chart of Auto-Text Summarizer

CHAPTER 1
INTRODUCTION

1. INTRODUCTION
One of the important NLP applications is text summarization, which helps users manage the vast amount of information available by condensing the content of documents and extracting the most relevant facts or topics[1]. Summarizing a text means giving a brief statement of the main points of a paragraph or document while dispensing with needless details or formalities.
Before going deeper into the system, we first describe the need for a tool capable of summarizing large amounts of data while retaining the main points and concepts of a document. In today's Internet age, the amount of information is overwhelming. The Internet stores a large amount of data about everything. A reader may not want to go through an entire page to find the information he needs. Summarizing a document helps the user pick the information of interest out of a huge amount of data.
Summaries can take two forms:

Extractive summaries
Abstractive summaries

Extractive summaries take important sentences from the original document to generate the summary. Abstractive summaries do not take sentences from the original document; instead they capture the important concepts in the original document and generate new sentences. The abstractive approach is similar to the way human summarizers work.
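
To make the extractive approach concrete, the following is a minimal sketch in Python (the project itself is written in C#). It scores each sentence by the frequency of its words and keeps the top-scoring sentences in their original order. The word-frequency score is a simple stand-in chosen for brevity, not the LSA and Cross Method scoring used by this system; the function name extractive_summary and the naive full-stop sentence split are assumptions made for illustration.

```python
import re
from collections import Counter


def extractive_summary(text, num_sentences):
    """Return the num_sentences highest-scoring sentences in original order."""
    # Naive split on full stops; real text needs a more careful separator.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    # Count how often each lowercase word occurs in the whole text.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        # A sentence scores the sum of the frequencies of its words.
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    # Rank sentence indices by score, highest first; ties keep earlier sentences.
    ranked = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))
    # Restore document order so the extract reads like the original.
    chosen = sorted(ranked[:num_sentences])
    return [sentences[i] for i in chosen]
```

Every sentence in the result is copied unchanged from the input, which is what distinguishes an extractive summary from an abstractive one.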
Summarization methods are also categorized by the number of documents they summarize:

Single document summarization
Multi-document summarization

A summary generated from a single document is known as single document summarization; a summary generated from multiple documents is called multi-document summarization.
Summaries can also be categorized as:

Generic summaries
Query-based summaries

Generic summarization systems generate summaries containing the main topics of documents. In query-based summarization, the generated summaries contain the sentences that are related to the given queries[2].

Summarization tools let a user submit a document and receive a summary of it, so that the user saves time by grasping the concepts described in the document from the generated summary instead of reading the entire document.
The proposed project is a summarization tool, built as a desktop application, that spares the user from reading large documents. The tool takes a single document from the user, extracts the important sentences of the document and presents them as a summary. The generated summary is made more readable by replacing words in the summary with their synonyms. Only those words that are present in a dictionary maintained by the system are replaced.

1.2. LITERATURE REVIEW


The Internet has become the biggest source of information all over the world in the past few years. Every individual, be it a student, a businessman or anyone else from any field, turns to the World Wide Web for information of every type, because the information is just a click away. The Internet contains information from every field, and the volume of data is so large that it is often mixed up, which makes it very difficult for the reader to pick out the right material. One has to go through a whole document to find the information needed. There is no auto summarizer that generates summaries of digital documents with difficult words replaced by easier synonyms for the user's convenience.

1.3. PROBLEM STATEMENT


As the problem of information overload has grown, and as the quantity of data has increased, so has interest in automatic summarization. All computer users, whether professionals or novices, are affected by this predicament. In the present situation, the need for an auto summarizer has increased.

The problems that motivated building this tool, the Auto-Text Summarizer, are described below:

The problem of: No summarizer tool helps people with a poor understanding of English.
Affects: Computer users, specifically those who use computers for reading and searching.
The impact of which is: People are unwilling to use such tools.
A successful solution would be: Provide users with a tool that generates summaries in a more interactive way.

Table 1.1. Problem of Unavailability of Word Replacement Feature in Auto-Text Summarizers

1.4. DRAWBACKS IN EXISTING SYSTEM


A number of Auto-Text Summarizers are available, including online text summarizers. Their drawbacks are:

The basic problem with the existing systems is the lack of any mechanism to help computer users with a poor understanding of the English language.
The layout is unattractive and unappealing.
The interfaces are traditional, not user-friendly and hard to use.

1.5. USER REQUIREMENTS


Users want a tool that benefits not only those with a good command of the English language but also computer users with a poor English vocabulary. Users want:

An easy-to-use interface where the user can get the desired output in a relatively easy way.
A system where the user can get a summary of a document or text in a customized manner (with the length of the summary defined by the user).
A system where the user can get an easy-to-understand summary of a document in simple English vocabulary.

1.6. PROPOSED SYSTEM


The Auto-Text Summarizer relieves the computer user of the burden of reading each and every word of a large document to find the material of interest. The system lets the user generate summaries of large documents. The tool operates on a single document and provides a summary of that document. The default length of the summary is fixed, but the user can increase or decrease the number of lines in the summary, depending on how many lines the user needs to understand the topic and concept of the document. In addition, the system replaces difficult vocabulary words (those present in the dictionary maintained by the system) with their synonyms. In this project the technique is applied to a very limited vocabulary (at most 50 words), which can be extended in the future.

Features
The system summarizes:

An HTML document directly through its URL.
A Word document on the system.
Text entered in the text area provided.
An Adobe (PDF) document, after first converting it to a Word file using existing software.

The system focuses only on the English text in the document.

1.7. SCOPE OF NEW SYSTEM


This project serves computer users who frequently have to go through large amounts of information. The system helps them find the information of interest and saves time by eliminating the need to move through an entire document. The tool also caters to people who have trouble understanding difficult English vocabulary. The replacement of words (those present in the dictionary maintained by the system) with their synonyms is a distinctive attribute of this software tool.

1.8. OUTCOME OF WORK

The system generates summaries of documents of different types (HTML documents, Word documents, user-specified text, and Adobe documents after conversion to Word documents using existing software), with the difficult vocabulary words (those present in the dictionary maintained by the system) replaced by their synonyms so that the extracted summary is relatively easier to understand.

1.9. METHODOLOGY USED


The following methodologies are used for the system:

Text Extraction
Text Summarization
Replacement of words with synonyms

1.10. ALGORITHMS

Latent Semantic Analysis (LSA)


Cross Method Approach

1.11. TOOLS
Following are the tools which are used to build the system:

Microsoft Visual Studio 2010


Extreme Optimization Numerical Libraries for .NET
Microsoft Word Thesaurus

1.11.1. MICROSOFT VISUAL STUDIO 2010


Microsoft Visual Studio is an integrated development environment (IDE) from Microsoft. It is used to develop console and graphical user interface applications along with Windows Forms or WPF applications, web sites, web applications, and web services in both native code and managed code for all platforms supported by Microsoft Windows, Windows Mobile, Windows CE, .NET Framework, .NET Compact Framework and Microsoft Silverlight[3].

Visual Studio includes a code editor supporting IntelliSense as well as code refactoring. The integrated debugger works both as a source-level debugger and a machine-level debugger. Other built-in tools include a forms designer for building GUI applications, a web designer, a class designer, and a database schema designer. It accepts plug-ins that enhance the functionality at almost every level, including adding support for source-control systems (like Subversion and Visual SourceSafe) and adding new toolsets like editors and visual designers for domain-specific languages or toolsets for other aspects of the software development lifecycle.

1.11.2. EXTREME OPTIMIZATION NUMERICAL LIBRARIES FOR .NET

The Extreme Optimization Numerical Libraries for .NET are a collection of general-purpose mathematical and statistical classes built for the Microsoft .NET Framework. They provide a complete platform for technical and statistical computing built on and for the Microsoft .NET platform, versions 4.5, 4.0 and 3.5. The package combines a math library, a vector and matrix library, and a statistics library.

1.11.3. MICROSOFT WORD THESAURUS


In Microsoft Office Word, you can look up a word quickly by right-clicking it and selecting Synonyms from the shortcut menu. Word then displays multiple synonyms for that word to help the user. This feature is known as the Microsoft Word Thesaurus.

1.12. LANGUAGE USED


The programming language used to develop this application is C#.

CHAPTER 2
SYSTEM ANALYSIS

2. REQUIREMENT ANALYSIS
The initial step of the software process is requirement analysis, which provides a representation of the information, function and behavior of the software.
The primary objective of the requirement analysis phase is to understand and document the business needs and the processing requirements of the new system. Analysis is essentially a discovery process to gain knowledge, and gathering information is a basic and important part of it. Studies reveal that inadequate attention to software requirement analysis at the beginning of a project is the most common cause of critically vulnerable projects that often do not deliver even on the basic tasks for which they were designed.
The principal objective of the systems analysis phase is the specification of what the system needs to do to meet the requirements of end users. Systems analysis studies a system in order to define its goals or purposes and to discover operations and procedures for accomplishing them most efficiently.
In the systems design phase, system specifications are converted to a hierarchy of charts that define the data required and the processes to be carried out on the data, so that they can be expressed as instructions of a computer program. Many information systems are implemented with generic software rather than with such custom-built programs.

2.1. REQUIREMENT
A requirement is a characteristic of the system or a description of something the system is capable of doing in order to fulfill the system's purpose.
Requirements analysis is done iteratively together with functional analysis, to optimize the performance requirements for the identified functions and to verify that synthesized solutions can satisfy customer requirements.

2.2. OVERALL DESCRIPTION OF AUTO-TEXT SUMMARIZER


The description is given below:

2.2.1. PRODUCT PERSPECTIVE


NLP is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language[4]. It is also called Computational Linguistics, which deals with how computational methods can aid the understanding of human language.
Summaries are an important tool for familiarizing oneself with a subject area. Text summaries are essential when deciding whether reading a whole document is necessary for acquiring further knowledge. A summarizer is thus a system whose goal is to produce a condensed representation of the content of its input for human consumption[6].
Automatic summarization is the process of reducing a text document electronically (using computer programs).

2.2.1.1. SYSTEM INTERFACE


The Auto-Text Summarizer is a complete software tool that extracts summaries of electronic text documents and provides synonyms for the difficult words present in the summary.

2.2.1.2. HARDWARE INTERFACE


All the features of this application must be able to execute on a personal computer.

2.2.1.3. SOFTWARE INTERFACE


The software interface is responsive and produces the summary of a text document in minimal time.

2.2.1.4. OPERATIONAL ENVIRONMENT


The Auto-Text Summarizer requires Windows 7 to run.

2.2.1.5. ENVIRONMENTAL REQUIREMENTS


No specific environmental requirements.

2.3. PRODUCT FUNCTIONS


Functions of Auto Text Summarizer are defined below:

2.3.1. TEXT EXTRACTION


Extracts the text from the input document and converts it to plain text.
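
As an illustration of text extraction for one of the supported input types, the sketch below converts an HTML string to plain text using Python's standard html.parser module. It is a hedged example, not the project's C# implementation: the names PlainTextExtractor and html_to_plain_text, the choice to skip script and style content, and the set of block-level tags treated as word boundaries are all assumptions. Fetching the page from a URL and reading Word files are not shown.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    SKIPPED = {"script", "style"}
    # Assumed set of tags that should separate words in the output.
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_data(self, data):
        # Keep text only when we are outside script/style elements.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_plain_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

The plain text returned here is what the later steps (sentence separation and summarization) would operate on.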

2.3.2. TEXT REDUCTION


Text reduction involves summarizing the input document. By default the system generates a summary whose length is 1/5 of the length of the original text. The user can change the length of the summary.
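
The default-length rule can be sketched as follows in Python (the project itself uses C#). The text above does not say in what unit length is measured or how fractions are rounded, so this sketch assumes the unit is sentences, rounds up, and clamps a user-supplied value to the size of the document; the function name summary_length is also hypothetical.

```python
import math


def summary_length(total_sentences, requested=None):
    """Number of sentences to keep in the summary."""
    if total_sentences <= 0:
        return 0
    if requested is None:
        # Default: one fifth of the original, rounded up, at least one sentence.
        return max(1, math.ceil(total_sentences / 5))
    # Clamp a user-supplied length to the valid range 1..total_sentences.
    return max(1, min(requested, total_sentences))
```

For a 20-sentence document the default is 4 sentences; a user request for 7 is honored, and a request for 50 is capped at 20.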

2.3.3. REPLACEMENT OF WORDS WITH SYNONYMS


The system is supplied with a dictionary, which it uses to replace the difficult words in the summary with their synonyms for the user's convenience.
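
A minimal sketch of dictionary-based replacement, written in Python rather than the project's C#, is shown below. The dictionary entries are invented for illustration and are not the project's actual word list; the function name replace_with_synonyms and the case-preserving behavior are likewise assumptions. Only exact word matches are replaced, so inflected forms that are not in the dictionary are left untouched.

```python
import re

# Hypothetical entries; the project's actual dictionary is not reproduced here.
SYNONYMS = {
    "predicament": "problem",
    "condense": "shorten",
    "vast": "large",
}


def replace_with_synonyms(summary, synonyms=SYNONYMS):
    """Replace each dictionary word in the summary with its simpler synonym."""

    def substitute(match):
        word = match.group(0)
        simple = synonyms.get(word.lower())
        if simple is None:
            return word  # Not in the dictionary: leave the word unchanged.
        # Preserve a leading capital, e.g. at the start of a sentence.
        return simple.capitalize() if word[0].isupper() else simple

    # Visit every alphabetic word; punctuation and spacing are left as they are.
    return re.sub(r"[A-Za-z]+", substitute, summary)
```

Because words outside the dictionary pass through unchanged, a small dictionary affects only the words it lists.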

2.4. PRODUCT POSITIONING STATEMENT


The product positioning statement is as follows:

For: Users of an auto-text summarization tool
Who: Need to get an idea of a complete document by reading just a few lines in simple English, as most people demand.
The Auto-Text Summarizer: Is an English text summarization tool
That: Provides the facility to generate summaries of huge documents in comparatively easier English.
Unlike: Traditional, non-user-friendly summarization tools.
Our product: Summarizes a document by extracting its most important points and presenting them in simple, easy-to-understand English.

Table 3.1. Product Positioning Statement

2.5. SUPPLEMENTARY REQUIREMENTS


The supplementary requirements are the part of the software system specification that covers the
non-functional requirements that the software system needs to address. The supplementary
requirements of the system are:

2.5.1. SYSTEM RELIABILITY


The system is reliable because it generates the summary according to the user's requirements.

2.5.2. SYSTEM USABILITY


The system is easy to use and is designed with all types of users in mind. The interface is user
friendly.

2.5.3. EFFICIENCY
The application is more efficient than previously available summarizers.

CHAPTER 3
SYSTEM DESIGN

3. SYSTEM DESIGN
Systems design is the process of defining the architecture, structural design, mechanisms,
modules, interfaces, and data for a system to satisfy specified requirements. The objective of the
design phase is to design the solution system based on the requirements defined and the decisions
made during analysis. Systems design can be seen as the application of systems
theory to product development. It shares characteristics with the disciplines of systems
analysis, systems architecture and systems engineering. This chapter presents the design of the
system that solves the problem.

3.1. ALGORITHM
3.1.1. INPUT
The input is the electronic document for which a summary is to be generated. The input document
can be an MS Word file (*.doc or *.docx), an HTML document (web page), plain text entered by the
user, or an Adobe file (*.pdf) after it has been converted to a Word file using existing software.
A file is supplied by browsing to it on the personal computer.

3.1.2. TEXT PRE PROCESSING


Text preprocessing is the process of converting text of different formats into plain text for
further processing. In this step the text of a document is extracted, stored, and passed on to the
summary generation process.

3.1.3. SENTENCE SEPARATOR


The sentence separator is the step in which the sentences of the input text are separated.
Sentences are split at the full stop (.); the regular expression used in the implementation also
treats '!' and '?' as sentence ends. The matched sentences are stored in a list, and a counter
records the number of sentences. This counter is later used to set the length limit of the summary.
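
The sentence separation step can be sketched as follows. This is a minimal illustration, not the project's actual module: the regular expression is the one that appears in the implementation chapter, while the class name, the method names and the rounding rule for the default 1/5 limit (rounded up) are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SentenceSeparatorSketch
{
    // A sentence starts at a non-space character and ends at '.', '!' or '?'
    // that is followed by whitespace or by the end of the input.
    private static readonly Regex SentencePattern =
        new Regex(@"(\S.+?[.!?])(?=\s+|$)");

    public static List<string> SplitSentences(string text)
    {
        return SentencePattern.Matches(text)
                              .Cast<Match>()
                              .Select(m => m.Value)
                              .ToList();
    }

    // Assumed rounding rule: one fifth of the sentence count, rounded up.
    public static int DefaultSummaryLimit(int sentenceCount)
    {
        if (sentenceCount <= 0) return 0;
        return (int)Math.Ceiling(sentenceCount / 5.0);
    }

    public static void Main()
    {
        List<string> sentences = SplitSentences("LSA finds topics. Is it fast? Yes!");
        Console.WriteLine(sentences.Count);                      // 3
        Console.WriteLine(DefaultSummaryLimit(sentences.Count)); // 1
    }
}
```

Because the pattern only looks at punctuation, abbreviations such as "Dr." are also treated as sentence ends, and since '.' in the pattern does not match a line break, a sentence that spans several lines is not captured whole.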

3.1.4. REPLACEMENT OF SPECIAL CHARACTERS WITH SPACES


After the sentences have been separated, each of them is passed through a filter that replaces
every strip character (special character) in the sentence with a space. The resulting sentence
contains spaces in place of the special characters.

3.1.5. LIST OF WORDS


A list of words in a document is generated. It includes all the words in a document. This list is
refined further.

3.1.6. STOP WORDS ELIMINATOR


Stop words are eliminated from the extracted text to reduce noise and increase processing
speed. Extensive lists of English stop words are available online. The list of stop words is
stored, and whenever the program finds one of these words in the input text, it is eliminated.

3.1.7. LIST OF UNIQUE WORDS


After the stop words have been eliminated from the input text, the remaining content words of the
document are available for further processing. A list of unique words is generated to reduce the
processing overhead.
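
The steps from special-character replacement to the unique word list can be sketched together as below. The strip characters and stop words shown are illustrative placeholders (the report does not list the actual ones), and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordListSketch
{
    // Placeholder values; the real lists used by the system are not given in the report.
    private static readonly char[] StripChars = { '.', ',', ';', ':', '!', '?', '(', ')', '"' };
    private static readonly HashSet<string> StopWords =
        new HashSet<string> { "the", "a", "an", "is", "of", "and", "to", "in" };

    public static List<string> UniqueContentWords(IEnumerable<string> sentences)
    {
        var unique = new List<string>();
        var seen = new HashSet<string>();
        foreach (string sentence in sentences)
        {
            // Replace every strip character with a space.
            string cleaned = sentence;
            foreach (char c in StripChars)
                cleaned = cleaned.Replace(c, ' ');

            string[] words = cleaned.ToLowerInvariant()
                                    .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                if (StopWords.Contains(word)) continue;   // stop word eliminator
                if (seen.Add(word)) unique.Add(word);     // keep the first occurrence only
            }
        }
        return unique;
    }

    public static void Main()
    {
        var words = UniqueContentWords(new[] { "The cat sat.", "A cat is an animal!" });
        Console.WriteLine(string.Join(", ", words)); // cat, sat, animal
    }
}
```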

3.1.8. LATENT SEMANTIC ANALYSIS


Latent Semantic Analysis (LSA) is an algebraic-statistical method that extracts the meaning of
words and the similarity of sentences using information about the usage of the words in context.
It keeps information about which words are used in a sentence, while preserving information about
the words that sentences have in common. The more words two sentences have in common, the more
semantically related they are. All summarization methods based on the LSA technique
use three main steps:

Input Matrix Creation

Singular Value Decomposition (SVD)

Sentence Selection

3.1.8.1. INPUT MATRIX CREATION


An input matrix is created using unique words list and sentence list. The rows constitute the
words and columns constitute the sentences. The matrix is filled with the frequency of a word in
each sentence.
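
As a small worked illustration of the input matrix, the sketch below counts how often each unique word occurs in each sentence. It assumes the sentences are already lower-cased and free of special characters; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InputMatrixSketch
{
    // Rows are unique words, columns are sentences, cells are raw word counts.
    public static int[,] Build(IList<string> uniqueWords, IList<string> sentences)
    {
        var matrix = new int[uniqueWords.Count, sentences.Count];
        for (int j = 0; j < sentences.Count; j++)
        {
            string[] tokens = sentences[j].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            for (int i = 0; i < uniqueWords.Count; i++)
                matrix[i, j] = tokens.Count(t => t == uniqueWords[i]);
        }
        return matrix;
    }

    public static void Main()
    {
        var words = new[] { "cat", "sat", "animal" };
        var sentences = new[] { "the cat sat", "a cat is an animal cat" };
        int[,] a = Build(words, sentences);
        Console.WriteLine(a[0, 0]); // 1  ("cat" occurs once in the first sentence)
        Console.WriteLine(a[0, 1]); // 2  ("cat" occurs twice in the second sentence)
        Console.WriteLine(a[1, 1]); // 0  ("sat" does not occur in the second sentence)
    }
}
```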

3.1.8.2. SVD VALUE


The SVD (Singular Value Decomposition) method models the relationships among words and sentences.
It is capable of noise reduction, which improves accuracy. It decomposes the input
matrix into three other matrices as follows:
A = U Σ V^T
where A is the input matrix with dimensions m x n (m words, n sentences), U is an m x n matrix
which describes the original rows of the input matrix as vectors of extracted concepts, Σ is an
n x n diagonal matrix containing the scaling values sorted in descending order, and V is an n x n
matrix which describes the original columns of the input matrix as vectors of the
extracted concepts.

3.1.8.3. SENTENCE SELECTION


For the input to sentence selection, the Number of Occurrence method is used. In this method each
cell of the input matrix is filled with the frequency of the word in the sentence.

3.1.9. TEXT SUMMARIZATION


Text Summarization is the process of reducing a text document while preserving the important
points in a document. Different methods are proposed for summarization process. For this
system, Cross Method approach is used.

3.1.9.1. CROSS METHOD


The proposed approach preprocesses the V^T matrix before selecting the sentences. First, an
average sentence score is calculated for each concept, which is represented by a row of the V^T
matrix. If the value of a cell in that row is less than the calculated average score of that row,
the score in the cell is set to zero. The main idea is that there can be sentences that are not
the core sentences representing the topic but are related to the topic in some way. The
preprocessing step removes the overall effect of such sentences.
In the Cross approach, the cell values are then multiplied by the values in the Σ matrix, and the
total lengths of the sentence vectors, which are represented by the columns of the V^T matrix, are
calculated. Then, the longest sentence vectors are collected as part of the resulting summary.

Example
In the figure below, an example V^T matrix is given. First, the average score of each concept is
calculated, and the cells whose values are less than the average value of their row are set to
zero. The values marked with an asterisk (*) are below their row average, and they are set to zero
before the length scores of the sentences are calculated. Then, the length score of each sentence
is calculated by adding up the concept scores of that sentence in the updated matrix. In the end,
the sentence Sent1 is chosen as the first sentence of the summary, since it has the highest length
score.

          Sent0     Sent1     Sent2     Sent3     Average
Con0      0.557     0.691     0.241*    0.110*    0.399
Con1      0.345*    0.674     0.742     0.212*    0.493
Con2      0.732     0.232*    0.435     0.157*    0.389
Con3      0.628*    0.436*    0.783     0.865     0.678
Con4      0.557     0.691     0.241*    0.710     0.549
Length    1.846     2.056     1.960     1.575

Figure 3.1. Example Cross Method Approach
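
The worked example can be reproduced with the short sketch below. It follows the example as stated (cells below their row average are dropped, and the remaining concept scores are added up per sentence), so it does not include the additional multiplication by the singular values that the implementation performs. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class CrossMethodSketch
{
    // vt[concept, sentence]; returns one length score per sentence.
    public static double[] LengthScores(double[,] vt)
    {
        int concepts = vt.GetLength(0), sentences = vt.GetLength(1);
        var scores = new double[sentences];
        for (int r = 0; r < concepts; r++)
        {
            double avg = 0;
            for (int c = 0; c < sentences; c++) avg += vt[r, c];
            avg /= sentences;

            // Cells below the row average count as zero; the others are added up.
            for (int c = 0; c < sentences; c++)
                if (vt[r, c] >= avg) scores[c] += vt[r, c];
        }
        return scores;
    }

    // Sentence indices ordered from the highest to the lowest length score.
    public static int[] Rank(double[] scores)
    {
        return Enumerable.Range(0, scores.Length)
                         .OrderByDescending(i => scores[i])
                         .ToArray();
    }

    public static void Main()
    {
        double[,] vt =
        {
            { 0.557, 0.691, 0.241, 0.110 },
            { 0.345, 0.674, 0.742, 0.212 },
            { 0.732, 0.232, 0.435, 0.157 },
            { 0.628, 0.436, 0.783, 0.865 },
            { 0.557, 0.691, 0.241, 0.710 }
        };
        double[] scores = LengthScores(vt);
        Console.WriteLine(string.Join(" ",
            scores.Select(s => s.ToString("F3", CultureInfo.InvariantCulture)))); // 1.846 2.056 1.960 1.575
        Console.WriteLine(string.Join(" ", Rank(scores)));                        // 1 2 0 3
    }
}
```

The ranking 1, 2, 0, 3 means that Sent1 enters the summary first, followed by Sent2, Sent0 and Sent3 as the requested summary length grows.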

3.1.10. SAVING EXTRACTED SUMMARY


The extracted summary can be saved in different formats. The user can store the summary in a Word
file (*.docx) or in an Adobe file (*.pdf), and can choose the location in which to save it.

3.2. DATA FLOW DIAGRAM

A data flow diagram (DFD) shows how the system processes data. It is used to represent the
relationships between the components of the program. It is a technique for modeling a system at a
high level of detail by presenting how input information is transformed into useful results
through a series of functional transformations.
DFDs are easy to understand for both technical and non-technical audiences. A DFD can
present a high-level overview of a system, complete with its boundaries and its relationships to
other systems. It also gives a detailed picture of the system's components.

3.2.1. PROCESS
A process transforms information by performing computations, making
decisions (logic flow), or directing data flows based on business rules. In other words, a
process receives input and generates output.
The Auto-Text Summarizer also performs processes that are described in the form of a DFD.
The following processes show how data is transformed and computed:

Browsing input file

Extracting Text from input file

Extracting Summary

Checking Dictionary

Replacing words with synonyms

The data flow diagram of the Auto-Text Summarizer is a simple but powerful graphical method
that is easy to understand. It gives a view of the data movements, consisting
of the inputs and outputs that represent the flow of information. The ability to
represent the system at different levels of detail is an added advantage.

Processing starts by taking a document or user text as input and ends by generating the summary of
the document. After the summary is generated, it is checked for difficult words that are
also present in the dictionary, and those words are replaced with their synonyms.

3.2.2. CONTEXT LEVEL DFD FOR AUTO-TEXT SUMMARIZER


The context level data flow diagram represents the complete system. The further working of the
system is represented in the level 1 and level 2 diagrams. The context level DFD, also referred to
as the Level 0 DFD, is a data flow diagram of the scope of a system that shows the
system boundaries, the external entities that interact with the system, and the main information
flows between the entities and the system. The Level 0 diagram is the highest-level view of a
system and is similar to a block diagram. System context diagrams help in better understanding the
context of the system.
Figure 3.2. Context DFD for Auto-Text Summarizer

3.2.3. LEVEL 1 DFD FOR AUTO-TEXT SUMMARIZER


Figure 3.1. Level 1 DFD for Summary Generation and Synonym Replacement

The system has two main modules. The first major module is to extract summary of a given input
document and the second module is the replacement of difficult words in a summary with their
synonyms which are also present in a dictionary.

3.2.4. LEVEL 2 DFD FOR AUTO-TEXT SUMMARIZER


The Level 2 DFD breaks the system down further and represents its inner working.
A data flow diagram can be used to show clearly how data progresses through
a system.

Figure 3.3. Level 2 DFD of Internal Working

Initially, the user browses to a file. The input can be a Word file, the URL of a web page, or
user-specified text. The text is then extracted from the input for further processing. A regular
expression separates the sentences of the extracted text, and the separated sentences are stored
in a list. The system calculates the priority of each sentence; the priority determines the
importance of the sentence. The sentences are then selected for inclusion in the summary.

3.2.5. LEVEL 2 DFD FOR AUTO-TEXT SUMMARIZER


This data flow diagram shows the procedure by which the system replaces a word in the summary with
its synonym when the word is also present in the dictionary.

Figure 3.4. DFD for Synonym Replacement

3.3. FLOW CHART FOR AUTO-TEXT SUMMARIZER


A flow chart is a graphical or symbolic representation of a process. Each step in the process is
represented by a different symbol and contains a short description of the process step. The flow
chart symbols are linked together with arrows showing the process flow direction. [5]
The flow chart of the Auto-Text Summarizer shows how the steps in the process fit together. It is
a tool for communicating how the process works and for clearly documenting how a particular
job is done. The flow chart format helps to clarify the understanding of the
process.
The system flow chart is used to:

Build a step-by-step picture for extracting the summary of an electronic document

Help better understanding of the whole process of summarization.

3.3.1. FLOW CHART COMPONENTS


Auto-Text Summarizer flow chart has:

Rectangles, which show instructions or actions i.e.

Browse File or URL/ Enter Text

Extract Text

Create Sentence List

Count Sentences

Replace Strip Characters with spaces

Create Word List

Create Input Matrix

Calculate SVD

Calculate Sentence Priority

Sort Sentences

Lookup Dictionary

Replace Word

Diamonds, which show decisions, i.e.

Whether the browsed file is open or closed

Whether the file contains any data

Whether a word is present in the summary as well as in the dictionary

Symbols are linked one to the other by arrows, showing the flow of the process.

Figure 3.4. Flow Chart of Auto-Text Summarizer

3.3.2. TEXT EXTRACTION


For the system to work properly, the first step is to extract the text from the input for further
processing. Four kinds of input can be given to the system:

MS Word document

URL of a web page

User text

Adobe file, after converting it to a Word document using existing software

3.3.3. CALCULATE SENTENCE PRIORITY


Calculating sentence priority determines the importance of a particular sentence in a passage.
This calculation makes it easy to extract sentences to be included in a summary. For this system,
LSA technique is used in which an input matrix is created and its SVD is calculated to determine
priority of each sentence.

3.3.4. SENTENCE SELECTION


The next step in LSA is to select sentences to be included in a summary. In the system, Cross
Method approach is used to pick out sentences to be included in summary. The user is given
opportunity to generate summary of a length of his choice. The default length of a summary is
set to 1/5 of length of original document.

3.3.5. WORD REPLACEMENT


The system uses the extraction method of summarization, which reduces the size of a document by
extracting only its important sentences. The text of the document itself is not changed.
Consequently, a user who struggles with difficult English words may find it hard to understand the
language or to get the main idea of the passage.
To meet the needs of such users, the summarizer tool includes a word replacement feature. The
extracted summary is checked against a dictionary. If the summary contains a word that is also
present in the dictionary, the word is replaced with its synonym. The synonym is highlighted to
enhance readability, and the original word is kept in brackets so that the user is aware of which
word has been replaced.
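
The replacement rule can be sketched as follows. The dictionary content, the class name and the method name are illustrative assumptions; highlighting is a user interface concern and is left out of the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SynonymReplacementSketch
{
    // Replaces each word found in the dictionary with "synonym (original)".
    // Words that are not in the dictionary are left untouched.
    public static string Replace(string summary, IDictionary<string, string> synonyms)
    {
        return Regex.Replace(summary, @"[A-Za-z]+", match =>
        {
            string synonym;
            return synonyms.TryGetValue(match.Value.ToLowerInvariant(), out synonym)
                ? synonym + " (" + match.Value + ")"
                : match.Value;
        });
    }

    public static void Main()
    {
        // Hypothetical dictionary entry, keyed by the lower-cased difficult word.
        var synonyms = new Dictionary<string, string> { { "arduous", "hard" } };
        Console.WriteLine(Replace("The task was arduous.", synonyms));
        // The task was hard (arduous).
    }
}
```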

Chapter 4
Implementation
4. IMPLEMENTATION
The system was built with Microsoft Visual Studio and the Extreme Optimization Numerical Libraries
for .NET, which are described below.

4.1. MICROSOFT VISUAL STUDIO 2012


Microsoft Visual Studio is an integrated development environment (IDE) from Microsoft. It is
used to develop console and graphical user interface applications along with Windows
Forms or WPF applications, web sites, web applications, and web services in both native code
together with managed code for all platforms supported by Microsoft Windows, Windows
Mobile, Windows CE, .NET Framework, .NET Compact Framework and Microsoft Silverlight.

4.2. EXTREME OPTIMIZATION NUMERICAL LIBRARIES FOR .NET


The Extreme Optimization Numerical Libraries for .NET are libraries for solving complex
mathematical problems on the .NET Framework. In this system they are used to calculate the SVD.

4.3. ALGORITHM:
// Creates the sentence list from the input document.
// The text extracted from the document is the input parameter.
void CreateSentenceList(string gExtractedText)
{
    Regex objRegex = new Regex(@"(\S.+?[.!?])(?=\s+|$)");
    foreach (Match match in objRegex.Matches(gExtractedText))
    {
        gSntncList.Add(match.Value);
    }
}

// Replaces special characters with spaces. The list of sentences is the input parameter.
void ReplaceSpecialCharacters(List<string> gSntncList)
{
    for (int i = 0; i < gSntncList.Count; i++)
    {
        foreach (string character in gStripChars)
        {
            // Strings are immutable, so the result of Replace is stored back.
            gSntncList[i] = gSntncList[i].Replace(character, " ");
        }
    }
}

// Removes stop words from the word list created from the sentence list.
// The list of stop words and the list of words are the input parameters.
void RemoveStopWordsFromWordList(List<string> gStopWords, List<string> gWordList)
{
    foreach (string word in gStopWords)
    {
        // Remove deletes one occurrence at a time, so loop until none is left.
        while (gWordList.Contains(word))
        {
            gWordList.Remove(word);
        }
    }
}

// Creates the input matrix (rows: unique words, columns: sentences).
// The list of unique words and the list of sentences are the input parameters.
void CreateInputMatrix(List<string> gUniqueWordList, List<string> gSntncList)
{
    for (int i = 0; i < gUniqueWordList.Count; i++)
    {
        for (int j = 0; j < gSntncList.Count; j++)
        {
            int _cot = 0;
            string _splitSntnc = gSntncList[j];
            foreach (string character in gStripChars)
            {
                _splitSntnc = _splitSntnc.Replace(character, " ");
            }
            _splitSntnc = _splitSntnc.ToLower();
            List<string> wordList = _splitSntnc.Split(' ').ToList();
            for (int k = 0; k < wordList.Count; k++)
            {
                if (gUniqueWordList[i] == wordList[k])
                {
                    _cot++;
                }
            }
            // Frequency of word i in sentence j.
            CountMatrix[i, j] = _cot;
        }
    }
}

// Calculates the SVD of the input matrix and the length score of each sentence.
// The input matrix is the input parameter (its type is assumed from the declarations below).
void CalcSVD(Extreme.Mathematics.Matrix CountMatrix)
{
    SingularValueDecomposition SVD = CountMatrix.GetSingularValueDecomposition();
    Extreme.Mathematics.Matrix VT = SVD.RightSingularVectors.Transpose();
    // Average score of each concept (row of V^T).
    Extreme.Mathematics.Vector AVG = (VT.GetRowSums().ToDenseVector() /
        VT.ColumnCount);
    for (int r = 0; r < VT.RowCount; r++)
    {
        for (int y = 0; y < VT.ColumnCount; y++)
        {
            // Cells below the row average are set to zero (Cross Method preprocessing).
            if (VT[r, y] < AVG[r])
            {
                VT[r, y] = 0;
            }
        }
    }
    // Scale by the singular values, then sum each column to score each sentence.
    Extreme.Mathematics.Matrix MUL = SVD.SingularValueMatrix * VT;
    SntncLen = MUL.GetColumnSums();
}

// Calculates the priority of the sentences, on the basis of which the summary is generated.
void CalcSentencePriority()
{
    List<double> lstSntncLen = SntncLen.ToList();
    // Pair each score with its sentence index and sort from highest to lowest score.
    var _sorted = lstSntncLen.Select((x, i) => new KeyValuePair<double, int>(x, i))
                             .OrderByDescending(x => x.Key)
                             .ToList();
    gIdxList = _sorted.Select(x => x.Value).ToList();
}

Chapter 5
Testing

5. TESTING
Software testing is the process of executing a program or system with the goal of finding
errors. The aim of this activity is to evaluate whether the system fulfils the desired goal.
Software is unlike physical processes in which input is received and output is generated:
software differs in the manner in which it fails. Most physical systems fail in a fixed set of
ways. By contrast, software can fail in many unexpected ways. Detecting all of the different
failure modes of software is generally infeasible.
Software testing validates that the completed software package functions according to the
expectations defined by the requirements/specifications. Overall, the purpose is to uncover
situations that could negatively impact the customer, usability and/or maintainability.
The software is tested against the requirements/specifications using a combination of testing
methodologies. One of the most commonly ignored areas of testing is failure testing and
error-tolerance testing.

5.1. TEST PLAN


The plan for testing the software is to check:

Is the design complete?

Does the design meet the system requirements?

Is the design user friendly?

Does the system detect errors?

Is the system compatible?

5.2. TEST CASE 01:


Test Case Administrator    Sara
Functional Area            Browse Word file
Test Name                  Browse Word file from system
Description (Purpose)      The purpose of the test case is to check whether the system
                           browses a Word file correctly.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system browses a Word file from the system
                           correctly or not.
Expected Results           The system browses a Word file from the system.
Status                     Test case failed.

Table 5.5. Test Case for Browse Word file

5.3. TEST CASE 02:


Test Case Administrator    Atia
Functional Area            Extract text
Test Name                  Extract text from Word document
Description (Purpose)      The purpose of the test case is to check whether the system
                           extracts text from the Word file correctly.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system extracts the text content from the
                           Word file correctly or not.
Expected Results           The system extracts the text content from the Word file.
Status                     Test case failed. (System Proceed)

Table 5.6. Test Case for Extract Text from Word file

5.4. TEST CASE 03:

Test Case Administrator    Sara
Functional Area            Input URL
Test Name                  Input URL to system
Description (Purpose)      The purpose of the test case is to check whether the system
                           takes URL input.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system takes URL input correctly or not.
Expected Results           The system takes URL input.
Status                     Test case failed. (System Proceed)

Table 5.7. Test Case for URL input

5.5. TEST CASE 04:

Test Case Administrator    Atia
Functional Area            Extract Text
Test Name                  Extract Text from URL
Description (Purpose)      The purpose of this test case is to check whether the system
                           extracts text from a URL.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system extracts text from the URL correctly
                           or not.
Expected Results           The system successfully extracts text from the input URL.
Status                     Test case failed. (System Proceed)

Table 5.8. Test case for Extract Text from URL

5.6. TEST CASE 05:

Test Case Administrator    Sara
Functional Area            User Text
Test Name                  Take user input text
Description (Purpose)      The purpose of this test case is to check whether the system
                           reads user text.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system reads user text correctly or not.
Expected Results           The system successfully reads the user text.
Status                     Test case failed. (System Proceed)

Table 5.5. Test Case for Reading of User Text

5.7. TEST CASE 06:

Test Case Administrator    Atia
Functional Area            Error Message
Test Name                  Error Message for file not closed
Description (Purpose)      The purpose of this test case is to check whether the system
                           displays a proper error message when the file is open (not
                           closed) at the time of browsing.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system displays a proper error message when
                           the file is open at the time of browsing or not.
Expected Results           The system successfully displays a proper error message when
                           the file is open at the time of browsing.
Status                     Test case failed. (System Proceed)

Table 5.6. Test Case for Error Message

5.8. TEST CASE 07:

Test Case Administrator    Sara
Functional Area            Error Message
Test Name                  Error Message when no text found
Description (Purpose)      The purpose of this test case is to check whether the system
                           displays a proper error message when there is no data to
                           summarize.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system displays a proper error message when
                           there is no data to summarize or not.
Expected Results           The system successfully displays a proper error message when
                           there is no data to summarize.
Status                     Test case failed. (System Proceed)

Table 5.7. Test Case for Error Message

5.9. TEST CASE 08:

Test Case Administrator    Atia
Functional Area            Generate summary
Test Name                  Summary Generation
Description (Purpose)      The purpose of this test case is to check whether the system
                           generates a summary.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system generates a summary or not.
Expected Results           The system successfully generates a summary.
Status                     Test case failed. (System Proceed)

Table 5.8. Test Case for Summary Generation

5.10. TEST CASE 09:

Test Case Administrator    Sara
Functional Area            Word replacement
Test Name                  Replacement of word in summary
Description (Purpose)      The purpose of this test case is to check whether the system
                           replaces a word in the summary with its synonym.
Precondition               The system should be ready to run.
Actions to execute         Check whether the system replaces a word with its synonym when
                           the word is found in the dictionary or not.
Expected Results           The system successfully replaces a word in the summary that is
                           also present in the dictionary with its synonym.
Status                     Test case failed. (System Proceed)

Table 5.9. Test Case for Synonym Replacement

5.11. TESTING AND ANALYSIS:


Testing showed that the system gives good results to the user, as it provides complete
information. It is time and memory efficient, as it produces results in a minimal amount of time.

Chapter 6
Conclusion

6. CONCLUSION
It is hard to imagine life without some form of summarization. Newspaper headlines are
summaries. A preview or trailer of a show is a summary. Abstracts of scientific articles are a
traditional form of summary, written by the authors, or else by a professional abstractor
following certain guidelines. A table showing baseball statistics for a player over a season is
very much a summary. Other varieties of summaries include reviews (of books and movies), digests
such as TV guides, minutes of a meeting, a program for a conference, a stock market bulletin, a
resume, an obituary, an abridgment of a book, a library catalog of abstracts of articles in new
journals, a table of contents for a book or magazine, and the summary that appears on the back
cover of a book. Almost any retrospective account of events could be a summary.
So, a summarizer is a system whose goal is to produce a condensed representation of the content
of its input for human consumption. A text summarizer does the same for text input only.
The goal of this auto-text summarizer tool is to take text information from a source, extract
content from it, and present the most important content to the user in a condensed form and in a
manner sensitive to the user's or application's needs. The concept of SVD is used to determine the
priority of sentences in a document. Extreme Optimization Numerical Library is used for
complex mathematical calculations. The system works only for text input.
The system maintains a dictionary. When a summary is generated, it checks for the words in a
summary and if it finds a word which is also present in a dictionary, it replaces that word with its
synonym.

6.1. GOOD FEATURES


The Auto-Text Summarizer has a user-friendly interface. It is a reliable and efficient system that
provides proper information to the user. The system is flexible to change and is easily
maintainable for further improvement.

6.2. FUTURE ENHANCEMENTS


In the future, the system can be made web based, so that the tool is available on the World Wide
Web for greater ease of access.
The current system works only for the English language. The same system can be made to work for
other languages.
The word replacement feature can be made more meaningful by categorizing each word as a noun,
verb, adjective, adverb, etc. and then replacing it accordingly.

APPENDIX

USER MANUAL
WORKING OF SYSTEM
The system working is described below.

APPLICATION MAIN INTERFACE

SUMMARIZE WORD FILE


Click the File option and go to the Summarize Word File option. A dialog box will appear. Browse
to the required Word file. The text of the input document will appear on the Original Text tab and
the extracted summary will appear on the Summary tab.
OR
Click the Summarize Word File option on the tool strip. A dialog box will appear. Open the desired
Word file. The text of the input document will appear on the Original Text tab and the extracted
summary will appear on the Summary tab.

SUMMARIZE WEB PAGE


Click the File option and go to the Summarize Web Page option. A dialog box will appear. Enter the
URL of the web page to summarize and click OK. The text of the input document will appear on the
Original Text tab and the extracted summary will appear on the Summary tab.
OR
Click the Summarize Web Page option on the tool strip. A dialog box will appear. Open the desired
web page. The text of the input document will appear on the Original Text tab and the extracted
summary will appear on the Summary tab.

SUMMARIZE USER TEXT


Click the File option and go to the Summarize User Text option. The interface will change. Enter
the text and click the Summarize button when you have finished. The extracted
summary will appear on the Summary tab.

SUMMARY LENGTH
By default, the summary length is set to 1/5 of the length of the original document. After the
default summary is generated, the user can increase the length of the summary. The maximum length
of the summary is equal to the length of the original document. The length of the summary as a
percentage is also displayed for the user's convenience. When the user increases the summary
length, a sentence is added to the summary and highlighted.

SUMMARY WORDS
Summary words shows the number of words in the summary. It changes when the user changes the
summary length.

MODE
System provides two modes of summary extraction.

Sorted
Unsorted

Sorted Summary Mode:


The default setting for mode is sorted. Sorted summary mode selects the sentences to be added in
summary by checking priority of sentence as well as the position of sentence in original
document.

Unsorted Summary Mode:


Unsorted summary mode selects the sentences to be added in summary by checking priority of
sentences.
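
One way to read the two modes is sketched below, under the assumption that both modes pick the same highest-priority sentences and that the sorted mode additionally restores their order of appearance in the original document. The class name, method name and parameters are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SummaryModeSketch
{
    // priorityOrder holds sentence indices from most to least important.
    public static List<string> Select(IList<string> sentences, IList<int> priorityOrder,
                                      int limit, bool sorted)
    {
        IEnumerable<int> chosen = priorityOrder.Take(limit);
        if (sorted)
        {
            // Assumed meaning of "sorted": present the chosen sentences in document order.
            chosen = chosen.OrderBy(i => i);
        }
        return chosen.Select(i => sentences[i]).ToList();
    }

    public static void Main()
    {
        var sentences = new[] { "S0", "S1", "S2", "S3" };
        var priority = new[] { 2, 0, 3, 1 };
        Console.WriteLine(string.Join(" ", Select(sentences, priority, 2, true)));  // S0 S2
        Console.WriteLine(string.Join(" ", Select(sentences, priority, 2, false))); // S2 S0
    }
}
```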

WORDS REPLACED
When the word replacement parser finds a word in the summary that is also present in the
dictionary, it replaces the word with its synonym and displays the original word in round brackets
to increase readability.

Each replaced word is also shown in a side panel under the heading "Words Replaced", which gives
information about which word was replaced and with what. The color of a
synonym in the summary is also changed to improve readability.

RELATED CONCEPTS
Related concepts are the words that give the user an idea of the topics discussed in the content.
They are listed separately for the user's convenience and appear in a panel on the left side,
below the heading "Related Concepts".

PATH
A panel on the interface displays the path of the browsed file. If a summary of a web page is
requested, it displays the URL of that web page, and it shows User Text when Summarize User Text
is selected. When no file is selected, it shows No Document, telling the user that no document
has been given as input to summarize.
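
The four cases of the path panel can be written as a small lookup. This is only a sketch: the
source tags "file", "web", and "text" and the function name path_label are hypothetical, chosen
here to stand for the three input options.

```python
def path_label(source=None, value=""):
    """Return the text shown in the path panel.

    `source` is a hypothetical tag: "file", "web", "text", or None.
    """
    if source == "file":
        return value        # path of the browsed file
    if source == "web":
        return value        # URL of the web page
    if source == "text":
        return "User Text"  # text typed or pasted by the user
    return "No Document"    # nothing has been given as input


print(path_label())                              # No Document
print(path_label("text"))                        # User Text
print(path_label("web", "http://example.com/"))  # http://example.com/
```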

EXPORT SUMMARY
When the user has finished generating a summary, he/she may want to save it on his/her system
for future use. The Auto-Text Summarizer provides a facility for saving the summary. In the menu
bar, click File and go to the Export Summary option. The system provides two options here: the
user can save the summary either as a Word document or as an Adobe document. A dialog box appears
asking the user to name the file and select the path where the summary document will be saved.
The option is disabled until a summary is generated.

EDIT SUMMARY
In the menu bar, the Edit option is used to enable or disable editing of the summary. It is
enabled when the summary has been generated and the summary tab is selected.
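
The enabling rules for the Export Summary and Edit menu options can be summarized in a short
sketch. The UiState class and both function names are hypothetical; only the two conditions are
taken from the descriptions of these options.

```python
from dataclasses import dataclass


@dataclass
class UiState:
    summary_generated: bool = False
    summary_tab_selected: bool = False


def export_enabled(state: UiState) -> bool:
    # Export Summary is disabled until a summary is generated.
    return state.summary_generated


def edit_enabled(state: UiState) -> bool:
    # Edit needs a generated summary and the summary tab to be selected.
    return state.summary_generated and state.summary_tab_selected


state = UiState(summary_generated=True, summary_tab_selected=False)
print(export_enabled(state), edit_enabled(state))  # True False
```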
