AUTO-TEXT SUMMARIZER
Developed by
Atia Akram [1841-FBAS/BSCS/F09]
Sara Waheed [1842-FBAS/BSCS/F09]
Supervised by
Ms. Maria Ashraf
2013
No Prayer is Unanswered
&
No Prayer is Unheard
Auto-Text Summarizer
Final Approval
Dated: _________
It is certified that we have read the project report submitted by Atia Akram (1841-FBAS/BSCS/F09)
and Sara Waheed (1842-FBAS/BSCS/F09), and it is our judgment that this project is of sufficient
standard to warrant its acceptance by the International Islamic University, Islamabad for the
BS Degree in Computer Science.
Committee
External Examiner
____________________
Internal Examiner
Ms. Saleha
____________________
Lecturer,
International Islamic University, Islamabad.
Supervisor
Ms. Maria Ashraf
____________________
Lecturer,
International Islamic University, Islamabad.
DEDICATION
We dedicate our project to
Our Dear parents,
Family,
Respected teachers,
and especially to Our Department of Computer Science
And all those who prayed, supported and encouraged us to achieve our
goals.
Dissertation
DECLARATION
We hereby declare that this Software, neither as a whole nor as a part thereof has been copied out
from any source. It is further declared that we have developed this software entirely on the basis
of our personal efforts made under the sincere guidance of our teachers and supervisor.
No portion of the work presented in this report has been submitted in support of any application
for any other degree or qualification of this or any other university or institute of learning.
Atia Akram
(1841-FBAS/BSCS/F09)
Sara Waheed
(1842-FBAS/BSCS/F09)
ACKNOWLEDGEMENT
First of all, praise be to Allah Almighty, whose guidance led us to follow none but the right path
and to accomplish the task in limited time. We thank our prestigious institution, IIUI
(International Islamic University Islamabad), which gave us another opportunity to prove ourselves
capable of doing such a significant task.
We thank our supervisor, Ms. Maria Ashraf, whose friendly attitude and noteworthy supervision
enabled us to work in harmony and in a peaceful environment. Our families always supported us in
the hour of need. Thanks also to all our friends and colleagues, whose untiring efforts made our
project not a hypothetical but a practical one.
PROJECT IN BRIEF
Project Title: Auto-Text Summarizer
Objective:
Undertaken By: Atia Akram, Sara Waheed
Supervised By: Ms. Maria Ashraf
Date Started: February, 2013
Date Completed: August, 2013
Tools Used:
System Used:
ABBREVIATIONS
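NLP     Natural Language Processing
LSA     Latent Semantic Analysis
SVD     Singular Value Decomposition
HTML    HyperText Markup Language
URL     Uniform Resource Locator
DFD     Data Flow Diagram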
ABSTRACT
In NLP, automatic text summarization is the process of condensing large documents by extracting
their most important points using the computational power of computers. Generating a summary that
retains all the important points of a document is the most critical requirement for an automatic
text summarizer.
This system generates summaries of electronic documents while retaining the important points of
each document. The Sentence Selection and Cross Method algorithms are used to calculate the
priority of sentences and to summarize the text. The system replaces difficult vocabulary words in
a summary with their synonyms by maintaining a dictionary.
Table of Contents
1. Introduction
2.5.1. System Reliability
2.5.2. System Usability
2.5.3. Efficiency
3. System Design
3.1.1. Input
3.1.2. Text Pre Processing
3.1.3. Sentence Separator
3.1.4. Replacement Of Special Characters With Spaces
3.1.5. List Of Words
3.1.6. Stop Words Eliminator
3.1.7. List Of Unique Words
3.1.8. Latent Semantic Analysis
3.1.8.1. Input Matrix Creation
3.1.8.2. SVD Value
3.1.8.3. Sentence Selection
3.1.9. Text Summarization
3.1.9.1. Cross Method
3.1.10. Saving Extracted Summary
3.2. Data Flow Diagram
3.2.1. Process
3.2.2. Context Level DFD For Auto-Text Summarizer
3.2.3. Level 1 DFD For Auto-Text Summarizer
3.2.4. Level 2 DFD For Auto-Text Summarizer
3.2.5. Level 2 DFD For Auto-Text Summarizer
3.3. Data Flow Diagram For Auto-Text Summarizer
3.3.1. Flow Chart Components
3.3.3. Calculate Sentence Priority
3.3.4. Sentence Selection
3.3.5. Word Replacement
4. Implementation
4.2. Extreme Optimization Numerical Libraries For .Net
5. Testing
5.1. Testing Of Plan
5.2. Test Case 01
5.3. Test Case 02
5.4. Test Case 03
5.5. Test Case 04
5.6. Test Case 05
5.7. Test Case 06
5.8. Test Case 07
5.9. Test Case 08
5.10. Test Case 09
6. Conclusion
6.1. Good Features
6.2. Future Enhancements
User Manual
Working Of System
Path
Export Summary
References
Table of Tables
Table 1.1. Problem of Unavailability of summarizers with word replacement feature
Table 3.1. Product Positioning Statement
Table 5.1. Test Case for Browse Word file
Table 5.2. Test Case for Extract Text from Word file
Table 5.3. Test Case for URL input
Table 5.4. Test Case for Extract Text from URL
Table 5.5. Test Case for Reading of User Text
Table 5.6. Test Case for Error Message
Table 5.7. Test Case for Error Message
Table 5.8. Test Case for Summary Generation
Table 5.9. Test Case for Synonym Replacement
Table of Figures
Figure 3.1. Example of Cross Method Approach
Figure 3.2. Context DFD for Auto-Text Summarizer
Figure 3.3. Level 1 DFD for Summary Generation and Synonym Replacement
Figure 3.4. Level 2 DFD of Internal Working
Figure 3.5. DFD for Synonym Replacement
Figure 3.6. Flow Chart of Auto-Text Summarizer
CHAPTER 1
INTRODUCTION
1. INTRODUCTION
One of the important NLP applications is text summarization, which helps users manage the vast
amount of information available by condensing the content of documents and extracting the most
relevant facts or topics [1]. Summarizing a text means giving a brief statement of the main points
of a paragraph or document and dispensing with needless details or formalities.
Before going deep into the system, we first describe the need for a tool capable of summarizing a
large amount of data while retaining the main points and concepts of a document. In today's
Internet age, the amount of information is overwhelming. The Internet stores large amounts of data
about everything. A user may not want to read an entire page to find the information he needs.
Summarizing a document helps the user find the information of interest within a huge amount of
data.
Summaries can take two forms:
Extractive summaries
Abstractive summaries
Extractive summaries take important sentences from the original document to generate the summary.
Abstractive summaries do not take sentences from the original document; instead, they capture the
important concepts in the original document and generate new sentences. The abstractive approach
is similar to the way human summarizers work.
Summarization methods are categorized according to what they generate and how they generate it.
A summary can be generated from a single document, known as single-document summarization, or
from multiple documents, known as multi-document summarization.
Summaries can also be categorized as:
Generic summaries
Query-based summaries
A summarization tool lets the user supply a document and receive a summary of it, so that the user
saves time by grasping the concepts described in the document from the generated summary instead
of reading the entire document.
The proposed project is a summarization tool in the form of a desktop application that spares the
user the need to read large documents. The tool takes a single document from the user, extracts
the important sentences of the document and presents a summary to the user. In the proposed
project, the generated summary is made more readable by replacing words in the summary with their
synonyms. The system replaces only those words in the summary that are present in the dictionary,
which is maintained by the system.
The problems that motivated building the Auto-Text Summarizer are described below:
The problem of    understanding of English
Affects           Computer users, specifically those using computers for reading and searching
                  purposes.
The basic problem with the existing system is unavailability of any mechanism to
An easy to use interface where user can get desired output in a relatively easier way.
A system where a user can get a summary of a document or text in a customized
Features
Summarize
The system generates summaries of documents of different types (HTML documents, Word documents,
user-specified text, and Adobe PDF documents after they have been converted to Word documents
using existing software). Difficult vocabulary words that are present in the dictionary maintained
by the system are replaced with their synonyms, so that the extracted summary is easier to
understand.
Text Extraction
Text Summarization
Replacement of words with synonyms
1.10. ALGORITHMS
1.11. TOOLS
The following tools were used to build the system:
Microsoft Visual Studio is used to develop console and graphical user interface applications along
with Windows Forms or WPF applications, web sites, web applications, and web services, in both
native code and managed code, for all platforms supported by Microsoft Windows, Windows Mobile,
Windows CE, .NET Framework, .NET Compact Framework and Microsoft Silverlight [3].
Visual Studio includes a code editor supporting IntelliSense as well as code refactoring. The
integrated debugger works both as a source-level debugger and a machine-level debugger. Other
built-in tools include a forms designer for building GUI applications, a web designer, a class
designer, and a database schema designer. It accepts plug-ins that enhance the functionality at
almost every level, including adding support for source control systems (such as Visual
SourceSafe) and adding new toolsets like editors and visual designers for domain-specific
languages, or toolsets for other aspects of the software development lifecycle.
CHAPTER 2
SYSTEM ANALYSIS
2. REQUIREMENT ANALYSIS
The initial step of the software process is requirements analysis, which provides a representation
of the information, function and behavior of the software.
The primary objective of the requirements analysis phase is to understand and document the
business needs and the processing requirements of the new system. Analysis is essentially a
discovery process to gain knowledge, and gathering information is a fundamental part of it.
Studies reveal that inadequate attention to software requirements analysis at the beginning of a
project is the most common cause of critically vulnerable projects, which often fail to deliver
even the basic tasks for which they were designed.
The principal objective of the systems analysis phase is the specification of what the system
needs to do to meet the requirements of end users.
In the systems design phase, system specifications are converted to a hierarchy of charts that
define the data required and the processes to be carried out on the data, so that they can be
expressed as instructions of a computer program. Many information systems are implemented with
generic software rather than with such custom-built programs. Systems analysis studies an activity
in order to define its goals or purposes and to discover operations and procedures for
accomplishing them most efficiently.
2.1. REQUIREMENT
A requirement is a characteristic of the system, or a description of something the system is
capable of doing, in order to fulfill the system's purpose.
Requirements analysis is carried out iteratively with functional analysis to optimize performance
requirements for the identified functions and to verify that synthesized solutions can satisfy
customer requirements.
For
Who
The Auto-Text Summarizer    is an English text summarization tool
That                        in comparatively easier English language.
Unlike                      Traditional and non-user-friendly summarization tools.
Our product
Page 11
Chapter 3
System Analysis
2.5.3. EFFICIENCY
The application is more efficient than previously available summarizers.
CHAPTER 3
SYSTEM DESIGN
3. SYSTEM DESIGN
Systems design is the process of defining the architecture, structural design, mechanisms,
modules, interfaces, and data for a system so that it satisfies specified requirements. The
objective of the design phase is to design the solution system based on the requirements defined
and the decisions made during analysis. Systems design can be seen as the application of systems
theory to product development. It shares characteristics with the disciplines of systems analysis,
systems architecture and systems engineering. This chapter presents the design of the system that
defines the solution to the problem.
3.1. ALGORITHM
3.1.1. INPUT
The input is the electronic document to be summarized. It can be an MS Word file (*.doc or
*.docx), an HTML document (web page), plain text entered by the user, or an Adobe PDF file (*.pdf)
after it has been converted to a Word file using existing software. File input is selected by
browsing for a file on the user's computer.
the core sentences representing the topic, but they are related to the topic in some way. The
preprocessing step removes the overall effect of such sentences.
In the Cross approach, the cell values of the V^T matrix are first multiplied by the values in the
Σ matrix, and the total lengths of the sentence vectors, which are represented by the columns of
the V^T matrix, are calculated. Then, the longest sentence vectors are collected as part of the
resulting summary.
Example
In the figure below, an example V^T matrix is given, laid out with one row per sentence and one
column per concept. First, the average score of each concept is calculated, and the cells whose
values are less than the average of their concept are set to zero. In the original figure, the
values below their concept's average are shown in boldface; they are set to zero before the length
scores of the sentences are calculated. Then, the length score of each sentence is calculated by
adding up its remaining concept scores in the updated matrix. In the end, sentence Sent1 is chosen
as the first sentence of the summary, since it has the highest length score (2.056).
          Con0    Con1    Con2    Con3    Con4    Length
Sent0     0.557   0.345   0.732   0.628   0.557   1.846
Sent1     0.691   0.674   0.232   0.436   0.691   2.056
Sent2     0.241   0.742   0.435   0.783   0.241   1.960
Sent3     0.110   0.212   0.157   0.865   0.710   1.575
Average   0.399   0.493   0.389   0.678   0.549
Figure 3.1. Example of Cross Method Approach
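The length scores in this example can be reproduced with a short C# sketch. It assumes the layout of the example (one row per sentence, one column per concept), so the threshold is the average taken down each concept column. The class and method names are illustrative and are not part of the project's code, and the Σ weighting is left out because the example does not list singular values.

```csharp
using System;

static class CrossMethodExample
{
    // Returns one length score per sentence (row). Cells that fall below the
    // average of their concept (column) are treated as zero before summing.
    public static double[] LengthScores(double[,] scores)
    {
        int sentences = scores.GetLength(0);
        int concepts = scores.GetLength(1);
        double[] lengths = new double[sentences];

        for (int c = 0; c < concepts; c++)
        {
            double sum = 0;
            for (int s = 0; s < sentences; s++) sum += scores[s, c];
            double average = sum / sentences;

            for (int s = 0; s < sentences; s++)
            {
                if (scores[s, c] >= average) lengths[s] += scores[s, c];
            }
        }
        return lengths;
    }

    static void Main()
    {
        double[,] scores =
        {
            { 0.557, 0.345, 0.732, 0.628, 0.557 }, // Sent0
            { 0.691, 0.674, 0.232, 0.436, 0.691 }, // Sent1
            { 0.241, 0.742, 0.435, 0.783, 0.241 }, // Sent2
            { 0.110, 0.212, 0.157, 0.865, 0.710 }  // Sent3
        };
        double[] lengths = LengthScores(scores);
        for (int s = 0; s < lengths.Length; s++)
            Console.WriteLine("Sent{0}: {1:F3}", s, lengths[s]);
    }
}
```

With the example values, the computed scores agree with the Length column, and Sent1 has the largest score.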
A data flow diagram (DFD) shows how the system processes data. It represents the relationships
between the components of a program. It is a technique for modeling a system at a high level of
detail by presenting how input data is transformed into output results through a series of
functional transformations.
DFDs are easy to understand for both technical and non-technical audiences. A DFD can present a
high-level overview of a system, complete with its boundaries and its connections to other
systems. It can also give a detailed picture of the system's components.
3.2.1. PROCESS
A process is an activity that transforms data by performing computations, making decisions (logic
flow), or directing data flows based on business rules. In other words, a process receives input
and generates output.
The Auto-Text Summarizer performs processes that are described in the form of DFDs. The following
processes show how data is transformed and computed:
Extracting Summary
Checking Dictionary
The data flow diagram of the Auto-Text Summarizer provides a simple but powerful graphical
technique that is easily understood. It gives a view of data movement, consisting of the inputs
and outputs that represent the flow of information. The ability to represent the system at
different levels of detail is an added advantage.
Processing starts by taking a document or user text as input and ends by generating a summary of
the document. After the summary is generated, it is checked for difficult words that are also
present in the dictionary, and those words are replaced with their synonyms.
Figure 3.3. Level 1 DFD for Summary Generation and Synonym Replacement
The system has two main modules. The first major module is to extract summary of a given input
document and the second module is the replacement of difficult words in a summary with their
synonyms which are also present in a dictionary.
Initially, the user supplies the input, which can be any Word file selected by browsing, the URL
of a web page, or user-specified text. The text is then extracted from the input for further
processing. A regular expression separates the extracted text into sentences, which are stored in
a list. The system calculates the priority of each sentence; the priority determines the
importance of the sentence. The highest-priority sentences are then selected for the summary.
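As a rough illustration of the sentence separation step, the sketch below applies the regular expression from the report's implementation listing to a short string. The class name and the sample text are made up for the example. A pattern of this kind ends a sentence at any '.', '!' or '?' that is followed by whitespace or the end of the text, so an abbreviation followed by a space would also be treated as a sentence end.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SentenceSplitExample
{
    // Splits text into sentences: each match starts at a non-space character
    // and runs to the first terminator followed by whitespace or end of text.
    public static List<string> Split(string text)
    {
        var sentences = new List<string>();
        var regex = new Regex(@"(\S.+?[.!?])(?=\s+|$)");
        foreach (Match match in regex.Matches(text))
        {
            sentences.Add(match.Value);
        }
        return sentences;
    }

    static void Main()
    {
        foreach (string sentence in Split("Summaries save time. Do they help? Yes!"))
        {
            Console.WriteLine(sentence);
        }
    }
}
```

For the sample string this yields three sentences, one per terminator.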
Extract Text
Count Sentences
Calculate SVD
Sort Sentences
Lookup Dictionary
Replace Word
Symbols are linked one to the other by arrows, showing the flow of the process.
MS Word Document
User Text
CHAPTER 4
IMPLEMENTATION
4. IMPLEMENTATION
The system was built with Microsoft Visual Studio 2010.
4.3. ALGORITHM:
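// Requires: using System.Text.RegularExpressions;
// Function to create the sentence list from the input document. The text extracted from the
// document is the input parameter.
void CreateSentenceList(string gExtractedText)
{
    // A sentence runs from a non-space character to the first '.', '!' or '?'
    // that is followed by whitespace or the end of the text.
    Regex objRegex = new Regex(@"(\S.+?[.!?])(?=\s+|$)");
    foreach (Match match in objRegex.Matches(gExtractedText))
    {
        gSntncList.Add(match.Value);
    }
}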
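// Function to replace special characters with spaces. The list of sentences is the input parameter.
void ReplaceSpecialCharacters(List<string> gSntncList)
{
    for (int i = 0; i < gSntncList.Count; i++)
        foreach (string character in gStripChars)
            // Strings are immutable, so the result of Replace must be stored back.
            gSntncList[i] = gSntncList[i].Replace(character, " ");
}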
// Function to remove stop words from the word list created from the sentence list. The list of
// stop words and the list of words are the input parameters.
void RemoveStopWordsFromWordList(List<string> gStopWords, List<string> gWordList)
{
    foreach (string word in gStopWords)
    {
        // Remove deletes only the first occurrence, so repeat until none remain.
        while (gWordList.Contains(word))
        {
            gWordList.Remove(word);
        }
    }
}
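The repeated Contains and Remove calls above rescan the word list for every stop word. One possible alternative, shown here only as a sketch, is to put the stop words in a hash set and filter the word list in a single pass. The class and method names are hypothetical, and the case-insensitive comparison is an assumption rather than something the report specifies.

```csharp
using System;
using System.Collections.Generic;

static class StopWordExample
{
    // Returns the words that are not stop words, preserving their order.
    public static List<string> RemoveStopWords(IEnumerable<string> words,
                                               IEnumerable<string> stopWords)
    {
        // Assumption: stop words are matched without regard to case.
        var stop = new HashSet<string>(stopWords, StringComparer.OrdinalIgnoreCase);
        var kept = new List<string>();
        foreach (string word in words)
        {
            if (!stop.Contains(word)) kept.Add(word);
        }
        return kept;
    }

    static void Main()
    {
        var words = new List<string> { "the", "summary", "of", "the", "text" };
        var stopWords = new List<string> { "the", "of" };
        Console.WriteLine(string.Join(" ", RemoveStopWords(words, stopWords)));
    }
}
```

Unlike the in-place version, this sketch returns a new list and leaves its inputs unchanged.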
// Function to create the input matrix. The list of unique words and the list of sentences are
// the input parameters. Rows are words, columns are sentences, and each cell holds the number
// of times the word occurs in the sentence.
void CreateInputMatrix(List<string> gUniqueWordList, List<string> gSntncList)
{
    for (int i = 0; i < gUniqueWordList.Count; i++)
    {
        for (int j = 0; j < gSntncList.Count; j++)
        {
            int _cot = 0;
            string _splitSntnc = gSntncList[j];
            foreach (string character in gStripChars)
            {
                _splitSntnc = _splitSntnc.Replace(character, " ");
            }
            _splitSntnc = _splitSntnc.ToLower();
            List<string> wordList = _splitSntnc.Split(' ').ToList();
            for (int k = 0; k < wordList.Count; k++)
            {
                if (gUniqueWordList[i] == wordList[k])
                {
                    _cot++;
                }
            }
            CountMatrix[i, j] = _cot;
        }
    }
}
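To make the shape of the input matrix concrete, the following self-contained sketch builds a small word-by-sentence count matrix with plain arrays. The names and the sample data are illustrative only, and the fixed set of separator characters stands in for the project's own list of characters to strip. It tokenizes each sentence once rather than once per word.

```csharp
using System;
using System.Collections.Generic;

static class CountMatrixExample
{
    // Rows are unique words, columns are sentences; each cell holds how
    // often the word occurs in the sentence.
    public static int[,] Build(IList<string> uniqueWords, IList<string> sentences)
    {
        var counts = new int[uniqueWords.Count, sentences.Count];
        for (int j = 0; j < sentences.Count; j++)
        {
            // Assumed separators; the project uses its own strip-character list.
            string[] tokens = sentences[j].ToLowerInvariant().Split(
                new[] { ' ', '.', ',', '!', '?' }, StringSplitOptions.RemoveEmptyEntries);
            for (int i = 0; i < uniqueWords.Count; i++)
            {
                foreach (string token in tokens)
                {
                    if (token == uniqueWords[i]) counts[i, j]++;
                }
            }
        }
        return counts;
    }

    static void Main()
    {
        var words = new List<string> { "text", "summary" };
        var sentences = new List<string> { "Text in, text out.", "A summary of text." };
        int[,] m = Build(words, sentences);
        Console.WriteLine("text:    {0} {1}", m[0, 0], m[0, 1]);
        Console.WriteLine("summary: {0} {1}", m[1, 0], m[1, 1]);
    }
}
```

Here the word "text" occurs twice in the first sentence and once in the second, while "summary" occurs only in the second.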
// Function to calculate the SVD of the input matrix and the length score of each sentence.
// The input matrix is the input parameter.
void CalcSVD(Extreme.Mathematics.Matrix MATRIX)
{
    SingularValueDecomposition SVD = MATRIX.GetSingularValueDecomposition();
    Extreme.Mathematics.Matrix VT = SVD.RightSingularVectors.Transpose();
    // Average of each row (concept) of V^T.
    Extreme.Mathematics.Vector AVG = (VT.GetRowSums().ToDenseVector() / VT.ColumnCount);
    // Cross method: zero every cell that is below the average of its row.
    for (int r = 0; r < VT.RowCount; r++)
    {
        for (int y = 0; y < VT.ColumnCount; y++)
        {
            if (VT[r, y] < AVG[r])
                VT[r, y] = 0;
        }
    }
    // Weight each concept by its singular value, then sum each column (sentence).
    Extreme.Mathematics.Matrix MUL = SVD.SingularValueMatrix * VT;
    SntncLen = MUL.GetColumnSums();
}
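// Requires: using System.Linq;
// Function to calculate the priority of sentences, on the basis of which the summary is generated.
void CalcSentencePriority()
{
    List<double> lstSntncLen = SntncLen.ToList();
    // Pair each length score with its sentence index, then sort by score, highest first.
    var _sorted = lstSntncLen.Select((x, i) => new KeyValuePair<double, int>(x, i))
                             .OrderByDescending(x => x.Key)
                             .ToList();
    // gIdxList holds the sentence indices in priority order.
    gIdxList = _sorted.Select(x => x.Value).ToList();
}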
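Once the sentences are ranked, a summary can be assembled by taking the top few indices. The sketch below shows one way to do that with LINQ, using the length scores from the Cross method example. The class and method names are hypothetical, and putting the chosen sentences back into document order is an assumption made for readability, not a behavior the report states.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SentenceSelectionExample
{
    // Returns the indices of the `count` highest-scoring sentences,
    // re-sorted into their original document order (an assumption).
    public static List<int> SelectTop(IList<double> scores, int count)
    {
        return scores
            .Select((score, index) => new KeyValuePair<double, int>(score, index))
            .OrderByDescending(pair => pair.Key)
            .Take(count)
            .Select(pair => pair.Value)
            .OrderBy(index => index)
            .ToList();
    }

    static void Main()
    {
        var scores = new List<double> { 1.846, 2.056, 1.960, 1.575 };
        Console.WriteLine(string.Join(", ", SelectTop(scores, 2)));
    }
}
```

For these scores the two selected indices are 1 and 2, the sentences with the highest length scores.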
CHAPTER 5
TESTING
5. TESTING
Software testing is the process of executing a program or system with the intent of finding
errors. The aim of this activity is to evaluate whether the system fulfills its intended goals.
Software resembles other physical processes in that inputs are received and outputs are produced;
where it differs is in the manner in which it fails. Most physical systems fail in a fixed set of
ways. In contrast, software can fail in many bizarre ways. Detecting all of the different failure
modes of software is generally infeasible.
Software testing validates that the completed software package functions according to the
expectations defined by the requirements/specifications. The overall objective is to uncover
situations that could negatively impact the customer, usability and/or maintainability.
To test the software, the requirements/specifications are used together with a combination of
testing methodologies. One of the most commonly ignored areas of testing is failure testing and
fault-tolerance testing.
Is system compatible?
Sara
Functional Area
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Atia
Functional Area
Extract text
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Table 5.2. Test Case for Extract Text from Word file
Sara
Functional Area
Input URL
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Atia
Functional Area
Extract Text
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Sara
Functional Area
User Text
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Atia
Functional Area
Error Message
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Sara
Functional Area
Error Message
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Atia
Functional Area
Generate summary
Test Name
Summary Generation
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
Sara
Functional Area
Word replacement
Test Name
Description(Purpose)
Precondition
Actions to execute
Expected Results
Status
CHAPTER 6
CONCLUSION
6. CONCLUSION
It is hard to imagine life without some form of summarization. Newspaper headlines are
summaries. A preview or trailer of a show is a summary. Abstracts of scientific articles are a
traditional form of summary, written by the authors, or else by a professional abstractor
following certain guideline. A table showing baseball statistics for a player over a season is very
much a summary. Other varieties of summaries include reviews (of books and movies), digests
such as TV guides, minutes of a meeting, a program for a conference, a stock market bulletin, a
resume, an obituary, an abridgment of a book, a library catalog of abstracts of articles in new
journals, a table of content for a book or magazine, a summary that appears on a back cover of a
book. Almost any retrospective account of events could be a summary.
So, a summarizer is a system whose goal is to produce a condensed representation of the content
of its input for human consumption. A text summarizer does same for text input only.
The goal of this auto-text summarizer tool is to take text information from source, extract
content from it, and present the most important content to the user in a condensed form and in a
manner sensitive to the users or applications need. The concept of SVD is used to determine the
priority of sentences in a document. Extreme Optimization Numerical Library is used for
complex mathematical calculations. The system works only for text input.
The system maintains a dictionary. When a summary is generated, the system checks each word in
the summary and, if the word is also present in the dictionary, replaces it with its
synonym.
APPENDIX
USER MANUAL
WORKING OF SYSTEM
OR
Click the Summarize Word File option on the tool strip. A dialog box will appear. Open the
desired Word file. The text of the input document will appear on the Original Text tab and the
extracted summary will appear on the Summary tab.
SUMMARY LENGTH
By default, the summary length is set to one fifth of the length of the original document. After
the default summary is generated, the user can increase its length, up to a maximum equal to
the length of the original document. The summary length is also displayed as a percentage for
the user's convenience. When the user increases the summary length, a sentence is added to the
summary and highlighted.
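The length rules above can be sketched as follows. This is an illustrative Python sketch, not the project's code. It assumes that length is measured in sentences, that one fifth is rounded up, and that sentences are added in priority order. The function names and the sample ranking are hypothetical.

```python
import math


def default_summary_size(total_sentences):
    """Default length: one fifth of the document, rounded up (assumed)."""
    if total_sentences == 0:
        return 0
    return max(1, math.ceil(total_sentences / 5))


def increase_length(ranked, size):
    """Grow the summary by one sentence, never past the full document.

    Returns the new size and the index of the sentence just added, which
    the interface could highlight; the index is None at maximum length.
    """
    if size >= len(ranked):
        return size, None
    return size + 1, ranked[size]


def length_percentage(size, total_sentences):
    """Summary length as a whole-number percentage of the original."""
    return round(100 * size / total_sentences)


# Hypothetical priority order for a 12-sentence document.
ranked = [3, 0, 7, 1, 9, 4, 2, 8, 5, 6, 11, 10]
size = default_summary_size(len(ranked))
print(size, length_percentage(size, len(ranked)))

size, added = increase_length(ranked, size)
print(size, added, length_percentage(size, len(ranked)))
```

Returning the index of the added sentence keeps the highlighting decision in the interface layer, separate from the length arithmetic.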
SUMMARY WORDS
Summary Words shows the number of words in the summary. It updates whenever the user changes
the summary length.
MODE
The system provides two modes of summary extraction:
Sorted
Unsorted
WORDS REPLACED
When the word replacement parser finds a word in the summary that is also present in the
dictionary, it replaces the word with its synonym and displays the original word in round
brackets to increase readability.
Each replacement is also shown in a side panel under the heading "Words Replaced", which lists
which word was replaced and with what. The synonym is shown in a different color in the
summary, again for readability.
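The replacement step can be sketched as follows. This is an illustrative Python sketch, not the project's code. It assumes the dictionary maps a lowercase word to a single synonym. The function name and the sample entries are hypothetical, and the color change is left to the interface.

```python
import re


def replace_words(summary, dictionary):
    """Replace dictionary words with their synonyms.

    Each replaced word is kept in round brackets after its synonym.
    Returns the new text and a list of (original, synonym) pairs that
    could feed a "Words Replaced" panel.
    """
    replaced = []

    def substitute(match):
        word = match.group(0)
        synonym = dictionary.get(word.lower())
        if synonym is None:
            return word  # not in the dictionary: leave unchanged
        replaced.append((word, synonym))
        return f"{synonym} ({word})"

    text = re.sub(r"[A-Za-z]+", substitute, summary)
    return text, replaced


# Hypothetical dictionary entries.
dictionary = {"commence": "begin", "utilize": "use"}
text, replaced = replace_words(
    "The trials commence soon and utilize new methods.", dictionary
)
print(text)
print(replaced)
```

Matching whole words with a regular expression leaves punctuation and spacing in the summary untouched.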
RELATED CONCEPTS
Related concepts are words that give the user an idea of the topics discussed in the content.
They are listed separately for the user's convenience, in a panel on the left side below the
heading "Related Concepts".
PATH
A panel on the interface displays the path of the file browsed. If a summary of a web page is
requested, it displays the URL of that web page, and it shows "User Text" when Summarize User
Text is selected. When no file is selected, it shows "No Document", telling the user that no
document has been given as input.
EXPORT SUMMARY
When the user has finished generating a summary, he or she may want to save it for future
use, and the auto-text summarizer provides a facility for this. In the menu bar, click File,
then go to the Export Summary option. The system offers two options here: the summary can be
saved either as a Word document or as an Adobe document. A dialog box will appear asking the
user to name the file and select the path where the summary document is to be saved. The
option is disabled until a summary is generated.
EDIT SUMMARY
In the menu bar, the Edit option is used to enable or disable editing of the summary. It is
enabled once the summary is generated and the Summary tab is selected.