PROJECT REPORT
ON
SUBMITTED
TO
CHHATRAPATI SHAHU JI MAHARAJ UNIVERSITY, KANPUR
in partial fulfilment of the requirements for the award of the
degree of
BACHELOR OF SCIENCE
IN
COMPUTER SCIENCE
SUBMITTED
BY
DEWANSHI SHUKLA
I express my special thanks to Mr. Vinay Kumar Shukla, my guide, for his
valuable guidance, supervision, and constructive suggestions to complete this
project.
Dewanshi Shukla
DECLARATION
I hereby declare that this project report entitled “Truth Discovery with
Multiple Conflicting Information Providers on the Web” is the work done by
Dewanshi Shukla towards the partial fulfilment of the requirements for the award of
the degree of B.Sc. Final Year in DEPARTMENT and submitted to Shri Gulab
Singh Mahavidyalaya, Auraiya, and is the result of the work carried out under the
guidance of Mr. Vinay Kumar Shukla.
I further declare that this project report has not been previously submitted,
either in part or in full, for the award of any degree or diploma by any
organization or university.
Dewanshi Shukla
B.Sc. 3rd Year
CERTIFICATE
This is to certify that the project entitled “Truth Discovery with Multiple
Conflicting Information Providers on the Web” is submitted by DEWANSHI SHUKLA
in partial fulfillment of the requirements for the award of Bachelor of Science in
COMPUTER SCIENCE, Shri Gulab Singh Mahavidyalaya, Auraiya.
External examiner
CONTENTS
S.NO. TITLE PG.NO.
ABSTRACT i
NOTATIONS ii
ABBREVIATIONS iii
LIST OF DIAGRAMS/LIST OF FIGURES iv
LIST OF TABLES v
1. COMPANY PROFILE 1
2. INTRODUCTION 5
2.1 Problem Definition 7
2.2 Objective of Project 7
2.3 Existing System 8
2.4 Disadvantages Of Existing System 8
2.5 Proposed System 8
2.6 Advantages Of Proposed System 8
3. LITERATURE SURVEY 9
4. SYSTEM REQUIREMENTS SPECIFICATION 11
4.1. Hardware Requirements 11
4.2 Software Requirements 11
5. SYSTEM ANALYSIS 12
5.1 Introduction 12
5.2 Feasibility Study 14
5.2.1 Economic Feasibility Study 14
5.2.2 Technical Feasibility Study 15
5.2.3 Operational Feasibility Study 16
6. SYSTEM DESIGN 17
6.1 Introduction 17
6.2 CLD Diagram 20
6.3 DFD/UML/ER Diagram 20
6.4 Tables/DD 29
7. LANGUAGE SPECIFICATION 32
8. IMPLEMENTATION 43
8.1 Screens Design/Forms Design 43
8.2 Source Code 44
8.3 Output Screens /Reports 53
9. TESTING AND VALIDATION 68
9.1 Introduction 68
9.2 Test Cases 70
10. CONCLUSION 72
10.1 Scope for Future Enhancement 72
11. REFERENCES 73
ABSTRACT
The world-wide web has become the most important information source for
most of us. Unfortunately, there is no guarantee for the correctness of information on
the web. Moreover, different web sites often provide conflicting information on a
subject, such as different specifications for the same product. In this paper we propose
a new problem called Veracity that is conformity to truth, which studies how to find
true facts from a large amount of conflicting information on many subjects that is
provided by various web sites. We design a general framework for the Veracity
problem, and invent an algorithm called Truth Finder, which utilizes the relationships
between web sites and their information, i.e., a web site is trustworthy if it provides
many pieces of true information, and a piece of information is likely to be true if it is
provided by many trustworthy web sites. Our experiments show that Truth Finder
successfully finds true facts among conflicting information, and identifies trustworthy
web sites better than the popular search engines.
NOTATIONS
ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1
COMPANY PROFILE
1.COMPANY PROFILE
OVERVIEW
We try to build a long-term relationship with all our students who get trained
with us by ensuring timely delivery of quality software training and services using
continuously improving processes.
MISSION
VISION
'To become the primary resource centre for providing IT training for students
and corporates'.
CORPORATE TRAINING
The Dawn offers you corporate training solutions and services that not only
help you prepare your workforce for unparalleled growth, but also accelerate your
organization's competency.
The Dawn has expanded the scope of services to meet the growing demand for
new skill sets in a rapidly changing business environment.
TECHNOLOGIES
The growing need for “industry ready” personnel and for continuous learning in
industry has led The Dawn Techno Solutions to establish a training academy in
Hyderabad. The Dawn Techno Solutions is focusing on the following Courses, which
demand the highest levels of teaching and training excellence.
C/C++
Microsoft Office tools
JAVA/J2EE
.Net Technologies
Oracle (SQL, PL/SQL)
Testing Tools
Data Warehousing
Oracle Applications
IBM Mainframes
Spoken English
The Dawn Techno Solutions has a clearly articulated training strategy that includes
PLACEMENTS
The Dawn Techno Solutions also offers placement services and helps its
students get the right jobs at the right time. Our institute actively liaises with
industries of repute, keeps the students informed of various job opportunities, and
provides guidance to the students to prepare for interviews. Our efforts are to
ensure that the brightest candidates are picked up by the top-notch companies.
CHAPTER 2
INTRODUCTION
2.INTRODUCTION
The World Wide Web has become a necessary part of our lives and might
have become the most important information source for most people. Every day,
people retrieve all kinds of information from the Web. For example, when shopping
online, people find product specifications from websites like Amazon.com or
ShopZilla.com. When looking for interesting DVDs, they get information and read
movie reviews on websites such as NetFlix.com or IMDB.com. When they want to
know the answer to a certain question, they go to Ask.com or Google.com. “Is the
World Wide Web always trustable?” Unfortunately, the answer is “no.” There is no
guarantee for the correctness of information on the Web. Even worse, different
websites often provide conflicting information, as shown in the following examples.
Example (Height of Mount Everest). Suppose a user is interested in how high
Mount Everest is and queries Ask.com with “What is the height of Mount Everest?”
Among the top 20 results, he or she will find the following facts: four websites
(including Ask.com itself) say 29,035 feet, five websites say 29,028 feet, one says
29,002 feet, and another one says 29,017 feet. Which answer should the user trust?
Example (Authors of books). We tried to find out who wrote the book Rapid
Contextual Design (ISBN: 0123540518). We found many different sets of authors
from different online bookstores, and we show several of them in Table 1. From the
image of the book cover, we found that A1 Books provides the most accurate
information. In comparison, the information from Powell’s books is incomplete, and
that from Lakeside books is incorrect.
The trustworthiness problem of the Web has been realized by today’s Internet
users. According to a survey on the credibility of websites conducted by Princeton
Survey Research in 2005, 54 percent of Internet users trust news websites at least
most of the time, while this ratio is only 26 percent for websites that offer products for
sale and is merely 12 percent for blogs. There have been many studies on ranking web
pages according to authority (or popularity) based on hyperlinks. The most influential
studies are Authority-Hub analysis, and Page Rank, which lead to Google.com.
However, does authority lead to accuracy of information? The answer is unfortunately
no. Top-ranked websites are usually the most popular ones. However, popularity does
not mean accuracy. For example, according to our experiments, the bookstores ranked
on top by Google (Barnes & Noble and Powell’s books) contain many errors on book
author information. In comparison, some small bookstores (e.g., A1 Books) provide
more accurate information.
In this project, we propose a new problem called the Veracity problem, which
is formulated as follows: Given a large amount of conflicting information about many
objects, which is provided by multiple websites (or other types of information
providers), how can we discover the true fact about each object? We use the word
“fact” to represent something that is claimed as a fact by some website, and such a
fact can be either true or false. In this paper, we only study the facts that are either
properties of objects (e.g., weights of laptop computers) or relationships between two
objects (e.g., authors of books). We also require that the facts can be parsed from the
web pages. There are often conflicting facts on the Web, such as different sets of
authors for a book. There are also many websites, some of which are more
trustworthy than others. A fact is likely to be true if it is provided by trustworthy
websites (especially if by many of them).
A website is trustworthy if most facts it provides are true. Because of this
interdependency between facts and websites, we choose an iterative computational
method. At each iteration, the probabilities of facts being true and the trustworthiness
of websites are inferred from each other. This iterative procedure is rather different
from Authority-Hub analysis in two aspects. First, we cannot compute the
trustworthiness of a website by simply adding up the weights of its facts, nor can we
compute the probability of a fact being true by adding up the trustworthiness of the
websites providing it; instead, we have to resort to probabilistic computation.
Second and more importantly,
different facts influence each other. For example, if a website says that a book is
written by “Jessamyn Wendell” and another says “Jessamyn Burns Wendell,” then
these two websites actually support each other although they provide slightly different
facts. We incorporate such influences between facts into our computational model.
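The iterative computation described here can be sketched in Java. This is a simplified illustration, not the actual TRUTHFINDER implementation: it ignores fact similarity and dampening, the site names and fact values are invented, and a fact's confidence is taken as the probability that at least one of its providers is correct.

```java
import java.util.*;

// Simplified sketch of the TruthFinder iteration: fact confidence and site
// trustworthiness are recomputed from each other until they stabilize.
class TruthFinderSketch {
    public static void main(String[] args) {
        // each site provides one fact about the same object (height of Everest)
        Map<String, String> siteFact = new LinkedHashMap<>();
        siteFact.put("siteA", "29035");
        siteFact.put("siteB", "29035");
        siteFact.put("siteC", "29028");

        Map<String, Double> trust = new HashMap<>();   // site trustworthiness
        for (String s : siteFact.keySet()) trust.put(s, 0.9);  // uniform initial trust
        Map<String, Double> conf = new HashMap<>();    // fact confidence

        for (int iter = 0; iter < 10; iter++) {
            // probability that EVERY provider of a fact is wrong
            Map<String, Double> allWrong = new HashMap<>();
            for (Map.Entry<String, String> e : siteFact.entrySet()) {
                double prev = allWrong.getOrDefault(e.getValue(), 1.0);
                allWrong.put(e.getValue(), prev * (1.0 - trust.get(e.getKey())));
            }
            // a fact is likely true if at least one provider is right
            conf.clear();
            for (Map.Entry<String, Double> e : allWrong.entrySet())
                conf.put(e.getKey(), 1.0 - e.getValue());
            // a site is trustworthy if the facts it provides are confident
            for (Map.Entry<String, String> e : siteFact.entrySet())
                trust.put(e.getKey(), conf.get(e.getValue()));
        }
        String best = Collections.max(conf.entrySet(), Map.Entry.comparingByValue()).getKey();
        System.out.println("best fact: " + best);
    }
}
```

The fact supported by two sites reinforces its providers' trust and wins over the fact supported by one.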
Our experiments show that TRUTHFINDER achieves very high
accuracy in discovering true facts, and it can select trustworthy websites better than
authority-based search engines such as Google.
2.1 PROBLEM DEFINITION:
The World Wide Web has become the most important information source for most of
us. Unfortunately, there is no guarantee for the correctness of information on the web.
Moreover, different web sites often provide conflicting information on a subject,
such as different specifications for the same product. In this paper we propose a new
problem called Veracity that is conformity to truth, which studies how to find true
facts from a large amount of conflicting information on many subjects that is provided
by various web sites. We design a general framework for the Veracity problem, and
invent an algorithm called Truth Finder, which utilizes the relationships between web
sites and their information, i.e., a web site is trustworthy if it provides many pieces of
true information, and a piece of information is likely to be true if it is provided by
many trustworthy web sites. Our experiments show that Truth Finder successfully
finds true facts among conflicting information, and identifies trustworthy web sites
better than the popular search engines.
2.2 OBJECTIVE OF PROJECT:
The World Wide Web has become the most important information source for
most of us. Unfortunately, there is no guarantee for the correctness of information on
the web. Moreover, different web sites often provide conflicting information on a
subject, such as different specifications for the same product. In this project we
propose a new problem called Veracity. We design a general framework for the
Veracity problem, and invent an algorithm called Truth Finder.
2.3 EXISTING SYSTEM:
2.4 DISADVANTAGES OF EXISTING SYSTEM:
The popularity of web pages does not necessarily lead to accuracy of
information. Even the most popular website may contain many errors, whereas some
comparatively less popular websites may provide more accurate information.
2.5 PROPOSED SYSTEM:
First, we formulate the Veracity problem about how to discover true facts from
conflicting information. Second, we propose a framework to solve this problem by
defining the trustworthiness of websites, the confidence of facts, and the influences
between facts. Finally, we propose an algorithm called TRUTHFINDER for
identifying true facts using iterative methods.
CHAPTER 3
LITERATURE SURVEY
3.LITERATURE SURVEY
DATA QUALITY
Data quality is the quality of data. Data are of high quality "if they are fit for
their intended uses in operations, decision making and planning" (J. M.
Juran). Alternatively, the data are deemed of high quality if they correctly
represent the real-world construct to which they refer. These two views can
often be in disagreement, even about the same set of data used for the same
purpose.
Before the rise of the inexpensive server, massive mainframe computers were
used to maintain name and address data so that the mail could be properly
routed to its destination. The mainframes used business rules to correct
common misspellings and typographical errors in name and address data, as
well as to track customers who had moved, died, gone to prison, married,
divorced, or experienced other life-changing events. Government agencies
began to make postal data available to a few service companies to cross-
reference customer data with the National Change of Address registry
(NCOA). This technology saved large companies millions of dollars
compared to manually correcting customer data. Large companies saved on
postage, as bills and direct marketing materials made their way to the
intended customer more accurately. Initially sold as a service, data quality
moved inside the walls of corporations, as low-cost and powerful server
technology became available.
certain standard has value to an organization by: 1) avoiding overstocking of
similar but slightly different stock; 2) improving the understanding of vendor
purchases to negotiate volume discounts; and 3) avoiding logistics costs in
stocking and shipping parts across a large organization.
While name and address data has a clear standard as defined by local postal
authorities, other types of data have few recognized standards. There is a
movement in the industry today to standardize certain non-address data. The
non-profit group GS1 is among the groups spearheading this movement.
For companies with significant research efforts, data quality can include
developing protocols for research methods, reducing measurement error,
bounds checking of the data, cross tabulation, modeling and outlier detection,
verifying data integrity, etc.
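As a small illustration of the bounds checking mentioned above, the sketch below flags reported values that are missing or fall outside a plausible range; the data set and the bounds are invented for the example.

```java
import java.util.*;

// Sketch of simple data-quality checks: bounds checking plus a
// missing-value check over a small invented set of reported heights (feet).
class QualityCheckSketch {
    public static void main(String[] args) {
        Map<String, Integer> reported = new LinkedHashMap<>();
        reported.put("siteA", 29035);
        reported.put("siteB", 29028);
        reported.put("siteC", 15000);   // clearly outside the plausible range
        reported.put("siteD", null);    // missing value

        int min = 20000, max = 30000;   // invented plausibility bounds
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, Integer> e : reported.entrySet()) {
            Integer v = e.getValue();
            if (v == null || v < min || v > max)
                flagged.add(e.getKey());   // fails the quality checks
        }
        System.out.println("flagged: " + flagged);
    }
}
```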
CHAPTER 4
SYSTEM REQUIREMENTS SPECIFICATION
4. SYSTEM REQUIREMENTS SPECIFICATION
CHAPTER 5
SYSTEM ANALYSIS
5.SYSTEM ANALYSIS
5.1 INTRODUCTION
REQUIREMENTS ANALYSIS
Requirement Analysis
Requirement Specification
Requirement Validation
REQUIREMENT ANALYSIS:
The basic aim of this stage is to obtain a clear picture of the needs and
requirements of the end-user and also the organization. Analysis involves interaction
between the clients and the analyst. Usually analysts research a problem by asking
questions and reading existing documents. The analysts have to uncover the real
needs of the users even if the users do not know them clearly. During analysis it is
essential that a complete and consistent set of specifications emerges for the system.
It is necessary to resolve the contradictions that could emerge from information
obtained from various parties, to ensure that the final specifications are consistent.
Problem recognition
Evaluation and synthesis
Modeling
Specification
Review
Each requirement analysis method has a unique point of view. However, all
analysis methods are related by a set of operational principles.
The main aim in this stage is to assess what kind of a system would be suitable
for the problem and how to build it. The requirements of the system can be defined by
going through the existing system and its problems, and by discussing the new system
to be built and the expectations from it. The steps involved are:
PROBLEM RECOGNITION:
The main problem here arises while taking appointments for the doctors. If we
want to verify the old or historical data, it is very difficult to find out. Maintaining the
data related to all departments is also very difficult.
EVALUATION AND SYNTHESIS:
In the proposed system, this application saves a lot of time. Using this
application we can easily manage daily treatments and easily maintain the historical
data. No specific training is required for the employees to use this application. They
can easily use the tool, which decreases the manual hours spent on routine work and
hence increases performance.
5.2 FEASIBILITY STUDY:
All projects are feasible given unlimited resources and infinite time. But the
development of software is plagued by the scarcity of resources and difficult delivery
dates. It is both necessary and prudent to evaluate the feasibility of a project at the
earliest possible time.
This procedure is to determine the benefits and savings that are expected from
a candidate system and compare them with costs. If benefits outweigh costs, then the
decision is made to design and implement the system. Otherwise, further justification
or alterations in proposed system will have to be made if it is to have a chance of
being approved. This is an ongoing effort that improves in accuracy at each phase of
the system life cycle.
5.2.1 ECONOMIC FEASIBILITY:
I) TIME BASED: In contrast to the manual system, management can generate any
report with a single click. In the manual system it is too difficult to maintain historical
data, which becomes easier in this system. The time consumed to add new records or
to view reports is much less compared to the manual system. So the project is feasible
from this point of view.
II) COST BASED: No special investment is needed to manage the tool. No specific
training is required for employees to use the tool. Investment is required only once, at
the time of installation. The software used in this project is freeware, so the cost of
developing the tool is minimal, and hence so is the overall cost.
5.2.2 TECHNICAL FEASIBILITY:
The system has been provided with menu-driven and button interaction
methods, which make the user the master as he starts working through the
environment. The only time the customer needs to concentrate on is the installation
time.
5.2.3 OPERATIONAL FEASIBILITY:
People are inherently resistant to change, and computers have been known to
facilitate change. It is understandable that the introduction of a candidate system
requires special effort to educate, sell, and train the staff on new ways of conducting
business.
CHAPTER 6
SYSTEM DESIGN
6.SYSTEM DESIGN
6.1 INTRODUCTION:
The most creative and challenging phase of the life cycle is system design.
The term design describes a final system and the process by which it is developed. It
refers to the technical specifications that will be applied in implementing the
candidate system. Design may be defined as “the process of applying various
techniques and principles for the purpose of defining a device, a process or a system
in sufficient detail to permit its physical realization”.
The designer’s goal is to determine how the output is to be produced and in
what format; samples of the output and input are also presented. Second, the input
data and database files have to be designed to meet the requirements of the proposed
output. The processing phases are handled through program construction and testing.
Finally, details related to justification of the system and an estimate of the impact of
the candidate system on the user and the organization are documented and evaluated
by management as a step toward implementation.
MODULE DESCRIPTION
MODULES :
Collection of unrelated data
Data search
Truth Finder algorithm
Result calculation
MODULE DESCRIPTION
COLLECTION OF DATA
First we have to collect the specific data about an object and store it in the
related database. A table is created for each specific object, and the facts about that
object are stored in it.
DATA SEARCH
This module searches for the related data links according to the user input, and
the user retrieves the specific data about an object.
TRUTH FINDER ALGORITHM
We design a general framework for the Veracity problem, and invent an algorithm
called Truth Finder, which utilizes the relationships between web sites and their
information, i.e., a web site is trustworthy if it provides many pieces of true
information, and a piece of information is likely to be true if it is provided by many
trustworthy web sites.
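The mutual support between slightly different facts (for example, “Jessamyn Wendell” versus “Jessamyn Burns Wendell” in the introduction) can be approximated with a simple word-overlap measure. This Jaccard similarity is an illustrative stand-in, not the implication function actually used by Truth Finder.

```java
import java.util.*;

// Illustrative word-overlap (Jaccard) similarity between two conflicting
// facts; high similarity lets the facts (and their sites) support each other.
class FactSimilarity {
    static double similarity(String a, String b) {
        Set<String> sa = new HashSet<>(Arrays.asList(a.toLowerCase().split("\\s+")));
        Set<String> sb = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\s+")));
        Set<String> inter = new HashSet<>(sa);
        inter.retainAll(sb);                      // words common to both facts
        Set<String> union = new HashSet<>(sa);
        union.addAll(sb);                         // all distinct words
        return (double) inter.size() / union.size();
    }
    public static void main(String[] args) {
        double s = similarity("Jessamyn Wendell", "Jessamyn Burns Wendell");
        System.out.printf("similarity = %.2f%n", s);
    }
}
```

Two of three distinct words match, so the facts are treated as largely supporting each other rather than purely conflicting.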
RESULT CALCULATION
For each response to the query we calculate the performance. Using the
calculated count, we find the best link and show it as the output.
Module diagram:
SYSTEM ARCHITECTURE:
(Home → Query Process → Output)
6.2 CONTEXT LEVEL DIAGRAM:
(Context level diagram 0.0: User → TruthFinder System → User)
(Web sites w1–w4 provide facts f1–f4 about objects o1 and o2)
Fig 6.3a DFD
(DFD: the user enters a username and password, which are validated and the
information stored in the database; a valid user then queries the Truth Finder System
for a site, and the system searches the information, retrieves many sites from the
database, selects the best site, and displays it.)
UML DIAGRAMS:
INTRODUCTION:
AN OVERVIEW OF UML WITH FIVE DIAGRAMS
1. USE CASE DIAGRAMS
(Use case diagram — truth discovery (from Logical View): the user interacts with
home, search, truthfinder, and display the details.)
2. CLASS DIAGRAMS
Class diagrams describe the structure of the system. Classes are abstractions that
specify the common structure and behavior of a set of objects. Class diagrams
describe the system in terms of objects, classes, attributes, operations and their
associations.
3. SEQUENCE DIAGRAMS
Sequence diagrams are used to formalize the behavior of the system and to visualize
the communication among objects. They are useful for identifying additional objects that
participate in the use cases. A sequence diagram represents the interactions that take
place among these objects.
(Sequence diagram: the user enters login details; invalid entries are rejected until a
valid login, after which the user submits a query.)
4. COLLABORATION DIAGRAMS
5. STATECHART DIAGRAMS
(State chart: Home → Login → Query Process → Database → Conflicting
Information → Truthfinder → Result)
6. ACTIVITY DIAGRAMS
6.4 TABLES IN MS SQL SERVER:
Table 6.4.1 dbo.factdb
Table 6.4.4 dbo.support
CHAPTER 7
LANGUAGE
SPECIFICATION
7.LANGUAGE SPECIFICATION
TECHNOLOGY FEATURES:
JAVA
Simple
Architecture-neutral
Object-oriented
Portable
Secure
Distributed
Interpreted
Robust
Java is also unusual in that each Java program is both compiled and
interpreted. With a compiler, you translate a Java program into an
intermediate language called Java bytecodes; this platform-independent
code is then passed to, and run by, a Java interpreter on the computer.
Fig 7.1 Compiler and interpreter
You can think of Java byte codes as the machine code instructions for
the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a
Java development tool or a Web browser that can run Java applets, is an
implementation of the Java VM. The Java VM can also be implemented in
hardware.
Java bytecodes help make “write once, run anywhere” possible. You
can compile your Java program into bytecodes on any platform that has a
Java compiler. The bytecodes can then be run on any implementation of the
Java VM. For example, the same Java program can run on Windows NT,
Solaris, and Macintosh.
JAVA PLATFORM
You’ve already been introduced to the Java VM. It’s the base for the
Java platform and is ported onto various hardware-based platforms. The Java
API is a large collection of ready-made software components that provide
many useful capabilities, such as graphical user interface (GUI) widgets.
How does the Java API support all of these kinds of programs? With
packages of software components that provide a wide range of functionality.
The core API is included in every full implementation of the Java
platform, and it gives you the following features. The Essentials:
objects, strings, threads, numbers, input and output, data structures, system
properties, date and time, and so on.
(Layers of the Java platform: Java program, Java API, Java VM, hardware)
The API and the virtual machine insulate the Java program from hardware
dependencies. As a platform-independent environment, Java can be a bit
slower than native code. However, smart compilers, well-tuned interpreters,
and just-in-time bytecode compilers can bring Java’s performance close to
that of native code without threatening portability.
However, Java is not just for writing cute, entertaining applets for the
World Wide Web (WWW). Java is a general-purpose, high-level
programming language and a powerful software platform. Using the
generous Java API, you can write many types of programs.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an
address scheme for machines so that they can be located. The address is a 32-bit
integer which gives the IP address. This encodes a network ID and further addressing.
The network ID falls into various classes according to the size of the network address.
Network address
Class A uses 8 bits for the network address with 24 bits left over for other
addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network
addressing and class D uses all 32.
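The class of an address can be read off its first octet. The sketch below follows the standard classful ranges (0–127 class A, 128–191 class B, 192–223 class C, 224–239 class D); the sample addresses are arbitrary.

```java
// Classify an IPv4 address by its first octet, per the classful scheme:
// leading bits 0 -> A, 10 -> B, 110 -> C, 1110 -> D.
class AddressClass {
    static char classOf(String ip) {
        int first = Integer.parseInt(ip.split("\\.")[0]); // first octet
        if (first < 128) return 'A';
        if (first < 192) return 'B';
        if (first < 224) return 'C';
        if (first < 240) return 'D';
        return 'E';                                       // remaining range
    }
    public static void main(String[] args) {
        String[] samples = {"10.0.0.1", "172.16.0.1", "192.168.1.1", "224.0.0.1"};
        for (String ip : samples)
            System.out.println(ip + " -> class " + classOf(ip));
    }
}
```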
Subnet address
Host address
8 bits are finally used for host addresses within our subnet. This places a limit
of 256 machines that can be on the subnet.
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number.
To send a message to a server, you send it to the port for that service of the host that it
is running on. This is not location transparency! Certain of these ports are "well
known".
Sockets
A socket is a data structure maintained by the system to handle network
connections. A socket is created using the call socket. It returns an integer that is like
a file descriptor.
Server Socket
A Server Socket listens for the Socket request and performs message handling
functions, file sharing, database sharing functions etc.
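A minimal sketch of the socket and server socket roles described above: a server socket on an arbitrary free loopback port accepts one connection and echoes a line back; the message text is arbitrary.

```java
import java.io.*;
import java.net.*;

// Minimal echo: a ServerSocket listens, accepts one Socket, and handles
// one message; the client connects to the server's port and reads the reply.
class EchoSketch {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0);  // 0 = pick any free port
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                out.println("echo: " + in.readLine()); // message handling
            } catch (IOException e) { e.printStackTrace(); }
        });
        t.start();

        try (Socket client = new Socket("127.0.0.1", server.getLocalPort());
             PrintWriter out = new PrintWriter(client.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine());
        }
        t.join();
        server.close();
    }
}
```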
JDBC
In an effort to set an independent database standard API for Java, Sun
Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a
generic SQL database access mechanism that provides a consistent interface to a
variety of RDBMS. This consistent interface is achieved through the use of “plug-in”
database connectivity modules, or drivers. If a database vendor wishes to have JDBC
support, he or she must provide the driver for each platform that the database and Java
run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on
ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a
variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC
drivers to market much faster than developing a completely new connectivity
solution.
JDBC Goals
Few software packages are designed without goals in mind. JDBC is one that,
because of its many goals, drove the development of the API. These goals, in
conjunction with early reviewer feedback, have finalized the JDBC class library into a
solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight
as to why certain classes and functionalities behave the way they do. The eight design
goals for JDBC are as follows:
SQL syntax varies as you move from database vendor to database vendor. In
an effort to support a wide variety of vendors, JDBC will allow any query
statement to be passed through it to the underlying database driver. This allows
the connectivity module to handle non-standard functionality in a manner that is
suitable for its users.
3. JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This
goal allows JDBC to use existing ODBC level drivers by the use of a software
interface. This interface would translate JDBC calls to ODBC and vice versa.
4. Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers
feel that they should not stray from the current design of the core Java system.
5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is no
exception. Sun felt that the design of JDBC should be very simple, allowing for
only one method of completing a task per mechanism. Allowing duplicate
functionality only serves to confuse the users of the API.
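To make the pass-through design concrete, here is a sketch of how a fact lookup might go through JDBC. The table and column names (factdb, object, fact) are assumptions loosely based on the dbo.factdb table in Chapter 6, and no live database connection is made here; the main method only prints the query that would be sent.

```java
import java.sql.*;
import java.util.*;

// Sketch: fetch the candidate facts for an object through JDBC. The SQL
// text is passed through to whichever driver (SQL Server, JDBC-ODBC
// bridge, ...) backs the Connection.
class FactDao {
    static final String QUERY = "SELECT fact FROM factdb WHERE object = ?";

    static List<String> factsFor(Connection con, String object) throws SQLException {
        List<String> facts = new ArrayList<>();
        try (PreparedStatement ps = con.prepareStatement(QUERY)) {
            ps.setString(1, object);              // bind parameter, no string concatenation
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) facts.add(rs.getString("fact"));
            }
        }
        return facts;
    }

    public static void main(String[] args) {
        // No database is assumed here; just show the query that would be sent.
        System.out.println("would execute: " + QUERY);
    }
}
```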
About JSP:
JavaServer Pages (JSP) technology enables you to mix regular, static HTML
with dynamically generated content from servlets. You simply write the regular
HTML in the normal manner, using familiar Web-page-building tools. You then
enclose the code for the dynamic parts in special tags, most of which start with <%
and end with %>. For example, here is a section of a JSP page that results in “Thanks
for ordering Core Web Programming” for a URL of
http://host/OrderConfirmation.jsp?title=Core+Web+Programming:
Thanks for ordering <I><%= request.getParameter("title") %></I>
Separating the static HTML from the dynamic content provides a number of benefits
over servlets
alone, and the approach used in JavaServer Pages offers several advantages over
competing technologies such as ASP, PHP, or ColdFusion. Section 1.4 (The
Advantages of JSP) gives some details on these advantages, but they basically boil
down to two facts: that JSP is widely supported and thus doesn’t lock you into a
particular operating system or Web server and that JSP gives you full access to servlet
and Java technology for the dynamic part, rather than requiring you to use an
unfamiliar and weaker special- purpose language. The process of making JavaServer
Pages accessible on the Web is much simpler than that for servlets. Assuming you
have a Web server that supports JSP, you give your file a .jsp extension and simply
install it in any place you could put a normal Web page: no compiling, no packages,
and no user CLASSPATH settings. However, although your personal environment
doesn’t need any special settings, the server still has to be set up with access to the
servlet and JSP class files and the Java compiler. For details, see your server’s
documentation or Section 1.5 (Installation and Setup). Although what you write often
looks more like a regular HTML file than a servlet, behind the scenes, the JSP page is
automatically converted to a normal servlet, with the static HTML simply being
printed to the output stream associated with the servlet’s service method. This
translation is normally done the first time the page is requested. To ensure that the
first real user doesn’t get a momentary delay when the JSP page is translated into a
servlet and compiled, developers can simply request the page themselves after first
installing it. Many Web servers also let you define aliases so that a URL that appears
to reference an HTML file really points to a servlet or JSP page. Depending on how
your server is set up, you can even look at the source code for servlets generated from
your JSP pages. With Tomcat 3.0, you need to change the isWorkDirPersistent
attribute from false to true in install_dir/server.xml. After that, the code can be found
in install_dir/work/port-number. With the JSWDK 1.0.1, you need to change the
workDirIsPersistent attribute from false to true in install_dir/webserver.xml. After
that, the code can be found in install_dir/work/%3Aport-number%2F. With the Java
Web Server 2.0, the default setting is to save source code for automatically generated
servlets. They can be found in install_dir/tmpdir/default/pagecompile/jsp/_JSP.
One warning about the automatic translation process is in order. If you make an error
in the dynamic portion of your JSP page, the system may not be able to properly
translate it into a servlet. If your page has such a fatal translation-time error, the server
will present an HTML error page describing the problem to the client. Internet
Explorer 5, however, typically replaces server-generated error messages with a
canned page that it considers friendlier. You will need to turn off this “feature” when
debugging JSP pages. To do so with Internet Explorer 5, go to the Tools menu, select
Internet Options, choose the Advanced tab, and make sure the “Show friendly HTTP
error messages” box is not checked.
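To make the translation concrete, here is a minimal JSP page (a sketch; the file name and text are made up). The container turns it into a servlet whose service method prints the static HTML to the output stream and evaluates the expression in place:

```jsp
<%-- hello.jsp: a minimal page used only to illustrate the translation --%>
<HTML>
<BODY>
<H1>Page Compiled On First Request</H1>
<%-- the static HTML above is simply printed by the generated servlet;
     the expression below becomes Java code in its service method --%>
Current time: <%= new java.util.Date() %>
</BODY>
</HTML>
```

Requesting this page once after installing it triggers the translation, so later visitors see no compilation delay.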
Aside from the regular HTML, there are three main types of JSP constructs
that you embed in a page: scripting elements, directives, and actions. Scripting
elements let you specify Java code that will become part of the resultant servlet,
directives let you control the overall structure of the servlet, and actions let you
specify existing components that should be used and otherwise control the behavior of
the JSP engine. To simplify the scripting elements, you have access to a number of
predefined variables, such as request in the code snippet just shown (see Section 10.5
for more details). Scripting elements are covered in this chapter, and directives and
actions are explained in the following chapters. You can also refer to the Appendix
(Servlet and JSP Quick Reference) for a thumbnail guide summarizing JSP syntax.
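The three kinds of constructs can be seen together in one short page (a sketch; the access counter is a made-up example):

```jsp
<%@ page import="java.util.Date" %>        <%-- directive: controls servlet structure --%>
<%! private int accessCount = 0; %>        <%-- declaration: a scripting element --%>
<% accessCount = accessCount + 1; %>       <%-- scriptlet: a scripting element --%>
Accesses so far: <%= accessCount %>        <%-- expression: a scripting element --%>
<jsp:include page="footer.html" flush="true" />  <%-- action: reuses an existing component --%>
```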
JSP changed dramatically from version 0.92 to version 1.0, and although these
changes are very much for the better, you should note that newer JSP pages are almost
totally incompatible with the early 0.92 JSP engines, and older JSP pages are equally
incompatible with 1.0 JSP engines. The changes from version 1.0 to 1.1 are much less
dramatic: the main additions in version 1.1 are the ability to portably define new tags
and the use of the servlet 2.2 specification for the underlying servlets. JSP 1.1 pages
that do not use custom tags or explicitly call 2.2-specific statements are compatible
with JSP 1.0 engines, and JSP 1.0 pages are totally upward compatible with JSP 1.1
engines.
The next release of SQL Server is designed to help enterprises address these
challenges. SQL Server 2005 is Microsoft’s next generation data management and
analysis solution that will deliver increased security, scalability, and availability to
enterprise data and analytical applications while making them easier to create, deploy,
and manage. Building on the strengths of SQL Server 2000, SQL Server 2005
will provide an integrated data management and analysis solution that will help
organizations of any size to:
* Build and deploy enterprise applications that are more secure, scalable, and reliable.
* Maximize the productivity of IT by reducing the complexity of creating, deploying, and managing database applications.
* Empower developers through a rich, flexible, modern development environment for creating more secure database applications.
* Share data across multiple platforms, applications, and devices to make it easier to connect internal and external systems.
* Deliver robust, integrated business intelligence solutions that help drive informed business decisions and increase productivity across your entire organization.
* Control costs without sacrificing performance, availability, or scalability.
Read on to learn more about the advancements SQL Server 2005 will deliver in three
key areas: enterprise data management, developer productivity, and business intelligence.
Developer Productivity
One of the key barriers to developer productivity has been the lack of
integrated tools for database development and debugging. SQL Server 2005 will
provide advancements that fundamentally change the way that database applications
are developed and deployed.
Enhancements for developer productivity will include:
* Improved tools. Developers will be able to utilize one development tool for Transact-SQL, XML, Multidimensional Expressions (MDX), and XML for Analysis (XML/A).
* Expanded language support. With the common language runtime (CLR) hosted in the database engine, developers will be able to choose from a variety of familiar languages to develop database applications, including Transact-SQL, Microsoft Visual Basic .NET, and Microsoft Visual C# .NET.
* XML and Web services. SQL Server 2005 will support both relational and XML data natively, so enterprises can store, manage, and analyze data in the format that best suits their needs. Support for existing and emerging open standards such as Hypertext Transfer Protocol (HTTP), XML, Simple Object Access Protocol (SOAP), XQuery, and XML Schema Definition (XSD) will also facilitate communication across extended enterprise systems.
Business Intelligence
The challenge and promise of business intelligence revolves around providing
employees with the right information, at the right time. Accomplishing this vision
demands a business intelligence solution that is comprehensive, secure, integrated
with operational systems, and available all day, every day. Microsoft will help
companies to achieve this goal with SQL Server 2005.
Business intelligence advancements will include:
* Integrated platform. SQL Server 2005 will deliver an end-to-end business intelligence platform with integrated analytics including online analytical processing (OLAP); data mining; extract, transformation, and load (ETL) tools; data warehousing; and reporting functionality.
* Improved decision making. Advancements to existing business intelligence features, such as OLAP and data mining, and the introduction of a new reporting server will provide enterprises with the ability to transform information into better business decisions at all organizational levels.
* Security and availability. Scalability, availability, and security enhancements will help to provide users with uninterrupted access to business intelligence applications and reports.
* Enterprise-wide analytical capabilities. An improved ETL tool will enable organizations to more easily integrate and analyze data from multiple heterogeneous information sources. By analyzing data across a wide array of operational systems, organizations may gain a competitive edge through a holistic understanding of their business.
Additional Information
SQL Server 2005 is part of the Windows Server System, a comprehensive and
integrated server infrastructure that simplifies the development, deployment, and
operation of a flexible business solution.
CHAPTER 8
IMPLEMENTATION
8. IMPLEMENTATION
The development stage takes as its primary input the design elements described in the
approved design document. For each design element, a set of one or more software
artifacts will be produced. Appropriate test cases will be developed for each set of
functionally related software artifacts, and an online help system will be developed to
guide users in their interactions with the software.
8.2 SOURCE CODE
<script language="javascript">
<!--
// Returns false and alerts if the field contains any non-numeric character.
function number(field)
{
var input=field.value;
var len=input.length;
var status=true;
for(var i=0;i<len;i++)
{
var chars=input.substring(i,i+1);
if(chars < "0" || chars > "9")
{
status=false;
break;
}
}
if(status==false)
{
alert("Enter the Numeric Input...");
field.focus();
return false;
}
return true;
}
// Checks the login form: username, password, and user type must be supplied.
function validate()
{
var user=document.form1.T1.value;
var pass=document.form1.T2.value;
var cate=document.form1.T3.value;
if(user.length<1)
{
alert("Enter The Username....");
document.form1.T1.focus();
return false;
}
else if(pass.length<1)
{
alert("Enter The Password....");
document.form1.T2.focus();
return false;
}
else if(cate=="Select Type")
{
alert("Select The User Type....");
document.form1.T3.focus();
return false;
}
return true;
}
// -->
</script>
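These handlers are wired to the login form roughly as follows (a sketch; the form and field names form1, T1, T2, and T3 match the code above, but the action URL is hypothetical):

```jsp
<form name="form1" action="login.jsp" method="post"
      onsubmit="return validate();">
  Username: <input type="text" name="T1">
  Password: <input type="password" name="T2">
  <select name="T3">
    <option>Select Type</option>
    <option>Admin</option>
    <option>User</option>
  </select>
  <!-- number(this) could guard a numeric field via its onblur handler -->
  <input type="submit" value="Login">
</form>
```

Because validate() returns false on any empty field, the onsubmit handler blocks the form submission until all three inputs are filled in.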
Source code for index
</style>
<![endif]-->
<script type="text/javascript">
<!--
function newImage(arg) {
if (document.images) {
rslt = new Image();
rslt.src = arg;
return rslt;
}
}
function changeImages() {
if (document.images && (preloadFlag == true)) {
for (var i=0; i<changeImages.arguments.length; i+=2) {
document[changeImages.arguments[i]].src =
changeImages.arguments[i+1];
}
}
}
function preloadImages() {
if (document.images) {
btn_services_over = newImage("images/btn_services-over.gif");
preloadFlag = true;
}
}
// -->
</script>
function forward( )
{
// Note: the scriptlet below runs on the server when the page is rendered,
// not when forward() is invoked in the browser.
<% String text=request.getParameter("textfield");
String search1=request.getParameter("search");
request.setAttribute("text",text);
request.setAttribute("search",search1);%>
document.location="response.jsp";
}
function forward1( )
{
document.location="Bestone.jsp";
}
<%@ page import="java.sql.*,java.io.*" %>
<%
// Look up the directory registered for the search word and list the files in it.
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
Connection con=DriverManager.getConnection("jdbc:odbc:truth","sa","");
Statement st=con.createStatement();
String searchword=(String)application.getAttribute("text");
String method=(String)application.getAttribute("method");
String user=(String)session.getAttribute("uname");
ResultSet rs=st.executeQuery("select location from MainDb where filename='"+searchword+"'");
String location="";
while(rs.next())
{
location=rs.getString(1);
out.println(location);
}
String filename="";
String filenames[]=null;
File path=new File(location);
File files[]=path.listFiles();
if(files!=null)
{
for(int i=0;i<files.length;i++)
{
try
{
filename=files[i].toString();
// Convert the path separators so the bare file name can be split out.
String names=filename.replace('\\','&');
filenames=names.split("&");
out.println(filenames[filenames.length-1]+"<br>");
out.println(filename+"<br>");
}
catch (Exception ee)
{
ee.printStackTrace();
}
}
}
%>
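The query above splices the search word directly into the SQL string, which breaks on embedded quotes and invites SQL injection. A parameterized variant (a sketch, assuming the same truth DSN and MainDb table) would be:

```jsp
<%
// PreparedStatement binds the search word as a parameter instead of
// concatenating it into the SQL text.
PreparedStatement ps = con.prepareStatement(
    "select location from MainDb where filename = ?");
ps.setString(1, searchword);
ResultSet prs = ps.executeQuery();
while (prs.next())
{
    out.println(prs.getString(1));
}
ps.close();
%>
```

Binding the value also lets the driver reuse the compiled statement across requests.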
<%!
String resultfiles[];
int retrive;
String factlength;
%>
<%
String result="";
String valu=request.getParameter("value");
String searchword="";
factlength=(String)application.getAttribute("fact");
int fac=Integer.parseInt(factlength);
retrive=fac-2; // assign the field declared above instead of shadowing it
if(valu.equalsIgnoreCase("redirect"))
{
searchword=(String)application.getAttribute("text");
//out.println(searchword);
}
else
{
searchword=(String)application.getAttribute("text");
}
//String resultfile=request.getParameter("result");
//resultfiles=resultfile.split("$");
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
Connection con=DriverManager.getConnection("jdbc:odbc:truth","sa","");
Statement st=con.createStatement();
String method=(String)request.getAttribute("method");
String user=(String)session.getAttribute("uname");
ResultSet rs=st.executeQuery("select location from support where
about='"+searchword+"' order by support desc");
String location="";
while(rs.next())
{
result+=rs.getString(1)+"#";
}
//out.println(result);
/* ResultSet rs=st.executeQuery
String filename="";
String filenames[]=null;
File path=new File(location);
File files[]=path.listFiles();
if(files!=null)
{
for(int i=0;i<files.length;i++)
{
if(files.length==0)
{
out.println("");
}
else
{
// System.out.println("compiling");
try
{
filename=files[i].toString();
//String names=filename.replace('\\','&');
//filenames=names.split("&");
//out.println(filenames.length+"<br>");
//Thread.sleep(1000);
// out.println(filenames[filenames.length-1]+"<br>");
result+=filename+"#";
}
catch (Exception ee)
{
ee.printStackTrace();
}
}
}
}
else
{
//proxy1.jLabel3.setText("completed.....");
}*/
%>
8.3 OUTPUT SCREENS
CHAPTER 9
TESTING & VALIDATION
9. TESTING AND VALIDATION
9.1 INTRODUCTION
The next step after coding is testing. Test case design focuses on a set of techniques
for the creation of test cases that meet the overall testing objectives. The main testing
objective is to execute the program with the intent of finding errors.
The testing can be performed in two ways:
* White box testing
* Black box testing
WHITE BOX TESTING:
White Box Testing examines the internal logic of the program; test cases are derived
from the code itself so that every independent path, branch, and loop is exercised at
least once.
BLACK BOX TESTING:
Black Box Testing focuses on the functional requirements of the software. Black Box
Testing attempts to find errors in the following categories:
* Incorrect or missing functions
* Interface errors
* Errors in data structures or external database access
* Performance errors
* Initialization and termination errors
UNIT TESTING:
Unit testing verifies each module of the software in isolation, confirming that every
function behaves as specified before the modules are combined.
INTEGRATION TESTING:
Integration testing combines the unit-tested modules and exercises their interfaces
and interactions as a complete subsystem.
TESTING PRINCIPLES:
All tests should be traceable to customer requirements, should be planned well before
testing begins, and should proceed from the small scale (individual units) to the large
scale (the integrated system).
TESTING STRATEGIES
A strategy for software testing integrates software test cases into a series of
well-planned steps that result in the successful construction of software. Software
testing is part of a broader topic referred to as Verification and Validation.
Verification refers to the set of activities that ensure that the software correctly
implements a specific function; Validation refers to the set of activities that ensure
that the software that has been built is traceable to customer requirements.
SNO  Test Type                    Test Case                         Status     Expected Result
     Operational/Functional       New entry for a user              success    Record in the database with new user details
4    Unit/Operational/Functional  Search for web pages with         success    It displays all the web pages that contain
                                  normal search                                the given text
5    Unit/Operational/Functional  Search for web pages with         success    It displays all the web pages based on
                                  Page Ranking search                          pages which are visited many times
6    Unit/Operational/Functional  Search for web pages with         success    It displays all the web pages which are
                                  Truthfinder                                  particularly related to the given text
7    Unit/Operational/Functional  Search for web pages when no      unsuccess  It displays ErrorPage.jsp
                                  page contains the given text
CHAPTER 10
CONCLUSION
10. CONCLUSION
In real time, this project shows the best results for every search object.
The admin user can view the requirements of a user through a notification.
A user can download and upload data for his or her requirements after registration.
Larger data sets can be loaded into the database.
CHAPTER 11
REFERENCES
11. REFERENCES
[1] B. Amento, L.G. Terveen, and W.C. Hill, “Does ‘Authority’ Mean Quality?
Predicting Expert Quality Ratings of Web Documents,” Proc. ACM SIGIR ’00, July
2000.
[2] A. Borodin, G.O. Roberts, J.S. Rosenthal, and P. Tsaparas, “Link Analysis
Ranking: Algorithms, Theory, and Experiments,” ACM Trans. Internet Technology,
vol. 5, no. 1, pp. 231-297, 2005.
[3] R. Guha, R. Kumar, P. Raghavan, and A. Tomkins, “Propagation of Trust and
Distrust,” Proc. 13th Int’l Conf. World Wide Web (WWW), 2004.
[4] G. Jeh and J. Widom, “SimRank: A Measure of Structural-Context Similarity,”
Proc. ACM SIGKDD ’02, July 2002.
[5] Logistic Equation from Wolfram MathWorld,
http://mathworld.wolfram.com/LogisticEquation.html, 2008.
[6] T. Mandl, “Implementation and Evaluation of a Quality-Based Search Engine,”
Proc. 17th ACM Conf. Hypertext and Hypermedia, Aug. 2006.
[7] Princeton Survey Research Associates International, “Leap of faith: Using the
Internet Despite the Dangers,” Results of a Nat’l Survey of Internet Users for
Consumer Reports WebWatch, Oct. 2005.
[8] Sigmoid Function from Wolfram MathWorld,
http://mathworld.wolfram.com/SigmoidFunction.html, 2008.
REFERRED
http://java.sun.com
http://www.java2s.com
http://www.w3schools.com
http://www.microsoft.com/sql/2005/