PROJECT REPORT
ON
SUBMITTED
TO
CHHATRAPATI SHAHU JI MAHARAJ UNIVERSITY, KANPUR
in partial fulfilment of the requirements for the award of the
degree of
BACHELOR OF SCIENCE
IN
COMPUTER SCIENCE
SUBMITTED
BY
DEWANSHI SHUKLA
I express my special thanks to Mr. Vinay Kumar Shukla, my guide, for his
valuable guidance, supervision, and constructive suggestions to complete this
project.
Dewanshi Shukla
DECLARATION
I hereby declare that this project report entitled “Truth Discovery with
Multiple Conflicting Information Providers on the Web” is the work done by
Dewanshi Shukla towards the partial fulfilment of the requirements for the award of
the degree of B.Sc. Final Year in DEPARTMENT and submitted to Shri Gulab
Singh Mahavidyalaya, Auraiya, and is the result of the work carried out under the
guidance of Mr. Vinay Kumar Shukla.
I further declare that this project report has not been previously submitted,
either in part or in full, for the award of any degree or diploma by any
organization or university.
Dewanshi Shukla
B.Sc. 3rd Year
CERTIFICATE
This is to certify that the project entitled “Truth Discovery with Multiple
Conflicting Information Providers on the Web” is submitted by DEWANSHI SHUKLA
in partial fulfillment of the requirements for the award of Bachelor of Science in
COMPUTER SCIENCE, Shri Gulab Singh Mahavidyalaya, Auraiya.
External examiner
CONTENTS
S.NO. TITLE PG.NO.
ABSTRACT i
NOTATIONS ii
ABBREVIATIONS iii
LIST OF DIAGRAMS/LIST OF FIGURES iv
LIST OF TABLES v
1. COMPANY PROFILE 1
2. INTRODUCTION 5
2.1 Problem Definition 7
2.2 Objective of Project 7
2.3 Existing System 8
2.4 Disadvantages Of Existing System 8
2.5 Proposed System 8
2.6 Advantages Of Proposed System 8
3. LITERATURE SURVEY 9
4. SYSTEM REQUIREMENTS SPECIFICATION 11
4.1. Hardware Requirements 11
4.2 Software Requirements 11
5. SYSTEM ANALYSIS 12
5.1 Introduction 12
5.2 Feasibility Study 14
5.2.1 Economic Feasibility Study 14
5.2.2 Technical Feasibility Study 15
5.2.3 Operational Feasibility Study 16
6. SYSTEM DESIGN 17
6.1 Introduction 17
6.2 CLD Diagram 20
6.3 DFD/UML/ER Diagram 20
6.4 Tables/DD 29
7. LANGUAGE SPECIFICATION 32
8. IMPLEMENTATION 43
8.1 Screens Design/Forms Design 43
8.2 Source Code 44
8.3 Output Screens /Reports 53
9. TESTING AND VALIDATION 68
9.1 Introduction 68
9.2 Test Cases 70
10. CONCLUSION 72
10.1 Scope for Future Enhancement 72
11. REFERENCES 73
ABSTRACT
The world-wide web has become the most important information source for
most of us. Unfortunately, there is no guarantee for the correctness of information on
the web. Moreover, different web sites often provide conflicting information on a
subject, such as different specifications for the same product. In this paper we propose
a new problem called Veracity that is conformity to truth, which studies how to find
true facts from a large amount of conflicting information on many subjects that is
provided by various web sites. We design a general framework for the Veracity
problem, and invent an algorithm called Truth Finder, which utilizes the relationships
between web sites and their information, i.e., a web site is trustworthy if it provides
many pieces of true information, and a piece of information is likely to be true if it is
provided by many trustworthy web sites. Our experiments show that Truth Finder
successfully finds true facts among conflicting information, and identifies trustworthy
web sites better than the popular search engines.
NOTATIONS
ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1
COMPANY PROFILE
1.COMPANY PROFILE
OVERVIEW
We try to build a long-term relationship with all our students who get trained
with us by ensuring timely delivery of quality software training and services using
continuously improving processes.
MISSION
VISION
'To become the primary resource centre for providing IT training for students
and corporates'.
CORPORATE TRAINING
The Dawn offers you corporate training solutions and services that not only
help you prepare your workforce for unparalleled growth, but also accelerate your
organization's competency.
The Dawn has expanded the scope of services to meet the growing demand for
new skill sets in a rapidly changing business environment.
TECHNOLOGIES
The growing need for “industry ready” personnel and for continuous learning in
industry has led The Dawn Techno Solutions to establish a training academy in
Hyderabad. The Dawn Techno Solutions is focusing on the following Courses, which
demand the highest levels of teaching and training excellence.
C/C++
Microsoft Office tools
JAVA/J2EE
.Net Technologies
Oracle (SQL, PL/SQL)
Testing Tools
Data Warehousing
Oracle Applications
IBM Mainframes
Spoken English
The Dawn Techno Solutions has a clearly articulated training strategy that includes
PLACEMENTS
The Dawn Techno Solutions also offers placement services and helps its
students get the right jobs at the right time. Our institute actively liaises with
industries of repute, keeps the students informed of various job opportunities, and
provides guidance to the students to prepare for interviews. Our efforts are to
ensure that the brightest candidates are picked up by the top-notch companies.
CHAPTER 2
INTRODUCTION
2.INTRODUCTION
The World Wide Web has become a necessary part of our lives and might
have become the most important information source for most people. Every day,
people retrieve all kinds of information from the Web. For example, when shopping
online, people find product specifications from websites like Amazon.com or
ShopZilla.com. When looking for interesting DVDs, they get information and read
movie reviews on websites such as NetFlix.com or IMDB.com. When they want to
know the answer to a certain question, they go to Ask.com or Google.com. “Is the
World Wide Web always trustable?” Unfortunately, the answer is “no.” There is no
guarantee for the correctness of information on the Web. Even worse, different
websites often provide conflicting information, as shown in the following examples.
Example (Height of Mount Everest). Suppose a user is interested in how high
Mount Everest is and queries Ask.com with “What is the height of Mount Everest?”
Among the top 20 results, he or she will find the following facts: four websites
(including Ask.com itself) say 29,035 feet, five websites say 29,028 feet, one says
29,002 feet, and another one says 29,017 feet. Which answer should the user trust?
Example (Authors of books). We tried to find out who wrote the book Rapid
Contextual Design (ISBN: 0123540518). We found many different sets of authors
from different online bookstores, and we show several of them in Table 1. From the
image of the book cover, we found that A1 Books provides the most accurate
information. In comparison, the information from Powell’s books is incomplete, and
that from Lakeside books is incorrect.
The trustworthiness problem of the Web has been realized by today’s Internet
users. According to a survey on the credibility of websites conducted by Princeton
Survey Research in 2005, 54 percent of Internet users trust news websites at least
most of the time, while this ratio is only 26 percent for websites that offer products for
sale and is merely 12 percent for blogs. There have been many studies on ranking web
pages according to authority (or popularity) based on hyperlinks. The most influential
studies are Authority-Hub analysis, and Page Rank, which lead to Google.com.
However, does authority lead to accuracy of information? The answer is unfortunately
no. Top-ranked websites are usually the most popular ones. However, popularity does
not mean accuracy. For example, according to our experiments, the bookstores ranked
on top by Google (Barnes & Noble and Powell’s books) contain many errors on book
author information. In comparison, some small bookstores (e.g., A1 Books) provide
more accurate information.
In this project, we propose a new problem called the Veracity problem, which
is formulated as follows: Given a large amount of conflicting information about many
objects, which is provided by multiple websites (or other types of information
providers), how can we discover the true fact about each object? We use the word
“fact” to represent something that is claimed as a fact by some website, and such a
fact can be either true or false. In this paper, we only study the facts that are either
properties of objects (e.g., weights of laptop computers) or relationships between two
objects (e.g., authors of books). We also require that the facts can be parsed from the
web pages. There are often conflicting facts on the Web, such as different sets of
authors for a book. There are also many websites, some of which are more
trustworthy than others. A fact is likely to be true if it is provided by trustworthy
websites (especially if by many of them).
A website is trustworthy if most facts it provides are true. Because of this
interdependency between facts and websites, we choose an iterative computational
method. At each iteration, the probabilities of facts being true and the trustworthiness
of websites are inferred from each other. This iterative procedure is rather different
from Authority-Hub analysis in two aspects. First, we cannot compute the
trustworthiness of a website by simply adding up the weights of its facts, nor can we
compute the probability of a fact being true by adding up the trustworthiness of the
websites providing it; instead, we have to resort to probabilistic computation.
Second and more importantly,
different facts influence each other. For example, if a website says that a book is
written by “Jessamyn Wendell” and another says “Jessamyn Burns Wendell,” then
these two websites actually support each other although they provide slightly different
facts. We incorporate such influences between facts into our computational model.
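The iterative computation described here can be sketched in Java. This is a simplified illustration, not the actual TRUTHFINDER implementation: it ignores fact similarity and dampening, the site names and fact values are invented, and a fact's confidence is taken as the probability that at least one of its providers is correct.

```java
import java.util.*;

// Simplified sketch of the TruthFinder iteration: fact confidence and site
// trustworthiness are recomputed from each other until they stabilize.
class TruthFinderSketch {
    public static void main(String[] args) {
        // each site provides one fact about the same object (height of Everest)
        Map<String, String> siteFact = new LinkedHashMap<>();
        siteFact.put("siteA", "29035");
        siteFact.put("siteB", "29035");
        siteFact.put("siteC", "29028");

        Map<String, Double> trust = new HashMap<>();   // site trustworthiness
        for (String s : siteFact.keySet()) trust.put(s, 0.9);  // uniform initial trust
        Map<String, Double> conf = new HashMap<>();    // fact confidence

        for (int iter = 0; iter < 10; iter++) {
            // probability that EVERY provider of a fact is wrong
            Map<String, Double> allWrong = new HashMap<>();
            for (Map.Entry<String, String> e : siteFact.entrySet()) {
                double prev = allWrong.getOrDefault(e.getValue(), 1.0);
                allWrong.put(e.getValue(), prev * (1.0 - trust.get(e.getKey())));
            }
            // a fact is likely true if at least one provider is right
            conf.clear();
            for (Map.Entry<String, Double> e : allWrong.entrySet())
                conf.put(e.getKey(), 1.0 - e.getValue());
            // a site is trustworthy if the facts it provides are confident
            for (Map.Entry<String, String> e : siteFact.entrySet())
                trust.put(e.getKey(), conf.get(e.getValue()));
        }
        String best = Collections.max(conf.entrySet(), Map.Entry.comparingByValue()).getKey();
        System.out.println("best fact: " + best);
    }
}
```

The fact supported by two sites reinforces its providers' trust and wins over the fact supported by one.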
Our experiments show that TRUTHFINDER achieves very high
accuracy in discovering true facts, and it can select trustworthy websites better than
authority-based search engines such as Google.
2.1 PROBLEM DEFINITION:
The World Wide Web has become the most important information source for most of
us. Unfortunately, there is no guarantee for the correctness of information on the web.
Moreover, different web sites often provide conflicting information on a subject,
such as different specifications for the same product. In this paper we propose a new
problem called Veracity that is conformity to truth, which studies how to find true
facts from a large amount of conflicting information on many subjects that is provided
by various web sites. We design a general framework for the Veracity problem, and
invent an algorithm called Truth Finder, which utilizes the relationships between web
sites and their information, i.e., a web site is trustworthy if it provides many pieces of
true information, and a piece of information is likely to be true if it is provided by
many trustworthy web sites. Our experiments show that Truth Finder successfully
finds true facts among conflicting information, and identifies trustworthy web sites
better than the popular search engines.
2.2 OBJECTIVE OF PROJECT:
The World Wide Web has become the most important information source for
most of us. Unfortunately, there is no guarantee for the correctness of information on
the web. Moreover, different web sites often provide conflicting information on a
subject, such as different specifications for the same product. In this project we
propose a new problem called Veracity. We design a general framework for the
Veracity problem, and invent an algorithm called Truth Finder.
2.3 EXISTING SYSTEM:
2.4 DISADVANTAGES OF EXISTING SYSTEM:
The popularity of web pages does not necessarily lead to accuracy of
information. Even the most popular website may contain many errors, whereas some
comparatively less popular websites may provide more accurate information.
2.5 PROPOSED SYSTEM:
First, we formulate the Veracity problem about how to discover true facts from
conflicting information. Second, we propose a framework to solve this problem by
defining the trustworthiness of websites, the confidence of facts, and the influences
between facts. Finally, we propose an algorithm called TRUTHFINDER for
identifying true facts using iterative methods.
CHAPTER 3
LITERATURE SURVEY
3.LITERATURE SURVEY
DATA QUALITY
Data quality is the quality of data. Data are of high quality "if they are fit for
their intended uses in operations, decision making and planning" (J. M.
Juran). Alternatively, the data are deemed of high quality if they correctly
represent the real-world construct to which they refer. These two views can
often be in disagreement, even about the same set of data used for the same
purpose.
Before the rise of the inexpensive server, massive mainframe computers were
used to maintain name and address data so that the mail could be properly
routed to its destination. The mainframes used business rules to correct
common misspellings and typographical errors in name and address data, as
well as to track customers who had moved, died, gone to prison, married,
divorced, or experienced other life-changing events. Government agencies
began to make postal data available to a few service companies to cross-
reference customer data with the National Change of Address registry
(NCOA). This technology saved large companies millions of dollars
compared to manually correcting customer data. Large companies saved on
postage, as bills and direct marketing materials made their way to the
intended customer more accurately. Initially sold as a service, data quality
moved inside the walls of corporations, as low-cost and powerful server
technology became available.
certain standard has value to an organization by: 1) avoiding overstocking of
similar but slightly different stock; 2) improving the understanding of vendor
purchases to negotiate volume discounts; and 3) avoiding logistics costs in
stocking and shipping parts across a large organization.
While name and address data has a clear standard as defined by local postal
authorities, other types of data have few recognized standards. There is a
movement in the industry today to standardize certain non-address data. The
non-profit group GS1 is among the groups spearheading this movement.
For companies with significant research efforts, data quality can include
developing protocols for research methods, reducing measurement error,
bounds checking of the data, cross tabulation, modeling and outlier detection,
verifying data integrity, etc.
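As a small illustration of the bounds checking mentioned above, the sketch below flags reported values that are missing or fall outside a plausible range; the data set and the bounds are invented for the example.

```java
import java.util.*;

// Sketch of simple data-quality checks: bounds checking plus a
// missing-value check over a small invented set of reported heights (feet).
class QualityCheckSketch {
    public static void main(String[] args) {
        Map<String, Integer> reported = new LinkedHashMap<>();
        reported.put("siteA", 29035);
        reported.put("siteB", 29028);
        reported.put("siteC", 15000);   // clearly outside the plausible range
        reported.put("siteD", null);    // missing value

        int min = 20000, max = 30000;   // invented plausibility bounds
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, Integer> e : reported.entrySet()) {
            Integer v = e.getValue();
            if (v == null || v < min || v > max)
                flagged.add(e.getKey());   // fails the quality checks
        }
        System.out.println("flagged: " + flagged);
    }
}
```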
CHAPTER 4
SYSTEM REQUIREMENTS SPECIFICATION
4. SYSTEM REQUIREMENTS SPECIFICATION
CHAPTER 5
SYSTEM ANALYSIS
5.SYSTEM ANALYSIS
5.1 INTRODUCTION
REQUIREMENTS ANALYSIS
Requirement Analysis
Requirement Specification
Requirement Validation
REQUIREMENT ANALYSIS:
The basic aim of this stage is to obtain a clear picture of the needs and
requirements of the end-user and also the organization. Analysis involves interaction
between the clients and the analyst. Usually analysts research a problem by asking
questions and reading existing documents. The analysts have to uncover the real
needs of the users even if the users do not know them clearly. During analysis it is
essential that a complete and consistent set of specifications emerges for the system.
It is necessary to resolve the contradictions that could emerge from information
obtained from various parties, to ensure that the final specifications are consistent.
Problem recognition
Evaluation and synthesis
Modeling
Specification
Review
Each requirement analysis method has a unique point of view. However, all
analysis methods are related by a set of operational principles.
The main aim in this stage is to assess what kind of a system would be suitable
for the problem and how to build it. The requirements of the system can be defined by
going through the existing system and its problems, and by discussing the new system
to be built and the expectations from it. The steps involved are:
PROBLEM RECOGNITION:
The main problem here arises while taking appointments for the doctors. If we
want to verify the old or historical data, it is very difficult to find out. Maintaining the
data related to all departments is also very difficult.
EVALUATION AND SYNTHESIS:
In the proposed system, this application saves a lot of time. Using this
application we can easily manage daily treatments and easily maintain the historical
data. No specific training is required for the employees to use this application. They
can easily use the tool, which decreases the manual hours spent on routine work and
hence increases performance.
5.2 FEASIBILITY STUDY:
All projects are feasible given unlimited resources and infinite time. But the
development of software is plagued by the scarcity of resources and difficult delivery
dates. It is both necessary and prudent to evaluate the feasibility of a project at the
earliest possible time.
This procedure is to determine the benefits and savings that are expected from
a candidate system and compare them with costs. If benefits outweigh costs, then the
decision is made to design and implement the system. Otherwise, further justification
or alterations in proposed system will have to be made if it is to have a chance of
being approved. This is an ongoing effort that improves in accuracy at each phase of
the system life cycle.
5.2.1 ECONOMIC FEASIBILITY:
I) TIME BASED: In contrast to the manual system, management can generate any
report with a single click. In the manual system it is too difficult to maintain historical
data, which becomes easier in this system. The time consumed to add new records or
to view reports is much less compared to the manual system. So the project is feasible
from this point of view.
II) COST BASED: No special investment is needed to manage the tool. No specific
training is required for employees to use the tool. Investment is required only once, at
the time of installation. The software used in this project is freeware, so the cost of
developing the tool is minimal, and hence so is the overall cost.
5.2.2 TECHNICAL FEASIBILITY:
The system has been provided with menu-driven and button interaction
methods, which make the user the master as he starts working through the
environment. The only time the customer needs to concentrate on is the installation
time.
5.2.3 OPERATIONAL FEASIBILITY:
People are inherently resistant to change, and computers have been known to
facilitate change. It is understandable that the introduction of a candidate system
requires special effort to educate, sell, and train the staff on new ways of conducting
business.
CHAPTER 6
SYSTEM DESIGN
6.SYSTEM DESIGN
6.1 INTRODUCTION:
The most creative and challenging phase of the life cycle is system design.
The term design describes a final system and the process by which it is developed. It
refers to the technical specifications that will be applied in implementing the
candidate system. Design may be defined as “the process of applying various
techniques and principles for the purpose of defining a device, a process or a system
in sufficient detail to permit its physical realization”.
The designer’s goal is to determine how the output is to be produced and in
what format; samples of the output and input are also presented. Second, the input
data and database files have to be designed to meet the requirements of the proposed
output. The processing phases are handled through program construction and testing.
Finally, details related to justification of the system and an estimate of the impact of
the candidate system on the user and the organization are documented and evaluated
by management as a step toward implementation.
MODULE DESCRIPTION
MODULES :
Collection of unrelated data
Data search
Truth Finder algorithm
Result calculation
MODULE DESCRIPTION
COLLECTION OF DATA
First we have to collect the specific data about an object and store it in the
related database. A table is created for each specific object, and the facts about that
object are stored in it.
DATA SEARCH
This module searches for the related data links according to the user input, and
the user retrieves the specific data about an object.
TRUTH FINDER ALGORITHM
We design a general framework for the Veracity problem, and invent an algorithm
called Truth Finder, which utilizes the relationships between web sites and their
information, i.e., a web site is trustworthy if it provides many pieces of true
information, and a piece of information is likely to be true if it is provided by many
trustworthy web sites.
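The mutual support between slightly different facts (for example, “Jessamyn Wendell” versus “Jessamyn Burns Wendell” in the introduction) can be approximated with a simple word-overlap measure. This Jaccard similarity is an illustrative stand-in, not the implication function actually used by Truth Finder.

```java
import java.util.*;

// Illustrative word-overlap (Jaccard) similarity between two conflicting
// facts; high similarity lets the facts (and their sites) support each other.
class FactSimilarity {
    static double similarity(String a, String b) {
        Set<String> sa = new HashSet<>(Arrays.asList(a.toLowerCase().split("\\s+")));
        Set<String> sb = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\s+")));
        Set<String> inter = new HashSet<>(sa);
        inter.retainAll(sb);                      // words common to both facts
        Set<String> union = new HashSet<>(sa);
        union.addAll(sb);                         // all distinct words
        return (double) inter.size() / union.size();
    }
    public static void main(String[] args) {
        double s = similarity("Jessamyn Wendell", "Jessamyn Burns Wendell");
        System.out.printf("similarity = %.2f%n", s);
    }
}
```

Two of three distinct words match, so the facts are treated as largely supporting each other rather than purely conflicting.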
RESULT CALCULATION
For each response to the query we calculate the performance. Using the
calculated count, we find the best link and show it as the output.
Module diagram:
SYSTEM ARCHITECTURE:
(Home → Query Process → Output)
6.2 CONTEXT LEVEL DIAGRAM:
(Context level diagram 0.0: User → TruthFinder System → User)
(Web sites w1–w4 provide facts f1–f4 about objects o1 and o2)
Fig 6.3a DFD
(DFD: the user enters a username and password, which are validated and the
information stored in the database; a valid user then queries the Truth Finder System
for a site, and the system searches the information, retrieves many sites from the
database, selects the best site, and displays it.)
UML DIAGRAMS:
INTRODUCTION:
AN OVERVIEW OF UML WITH FIVE DIAGRAMS
1. USE CASE DIAGRAMS
(Use case diagram — truth discovery (from Logical View): the user interacts with
home, search, truthfinder, and display the details.)
2. CLASS DIAGRAMS
Class diagrams describe the structure of the system. Classes are abstractions that
specify the common structure and behavior of a set of objects. Class diagrams
describe the system in terms of objects, classes, attributes, operations and their
associations.
3. SEQUENCE DIAGRAMS
Sequence diagrams are used to formalize the behavior of the system and to visualize
the communication among objects. They are useful for identifying additional objects that
participate in the use cases. A sequence diagram represents the interactions that take
place among these objects.
(Sequence diagram: the user enters login details; invalid entries are rejected until a
valid login, after which the user submits a query.)
4. COLLABORATION DIAGRAMS
5. STATECHART DIAGRAMS
(State chart: Home → Login → Query Process → Database → Conflicting
Information → Truthfinder → Result)
6. ACTIVITY DIAGRAMS
6.4 TABLES IN MS SQL SERVER:
Table 6.4.1 dbo.factdb
Table 6.4.4 dbo.support
CHAPTER 7
LANGUAGE
SPECIFICATION
7.LANGUAGE SPECIFICATION
TECHNOLOGY FEATURES:
JAVA
Simple
Architecture-neutral
Object-oriented
Portable
Secure
Distributed
Interpreted
Robust
Java is also unusual in that each Java program is both compiled and
interpreted. With a compiler, you translate a Java program into an
intermediate language called Java bytecodes; this platform-independent
code is then passed to, and run by, a Java interpreter on the computer.
Fig 7.1 Compiler and interpreter
You can think of Java byte codes as the machine code instructions for
the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a
Java development tool or a Web browser that can run Java applets, is an
implementation of the Java VM. The Java VM can also be implemented in
hardware.
Java bytecodes help make “write once, run anywhere” possible. You
can compile your Java program into bytecodes on any platform that has a
Java compiler. The bytecodes can then be run on any implementation of the
Java VM. For example, the same Java program can run on Windows NT,
Solaris, and Macintosh.
JAVA PLATFORM
You’ve already been introduced to the Java VM. It’s the base for the
Java platform and is ported onto various hardware-based platforms. The Java
API is a large collection of ready-made software components that provide
many useful capabilities, such as graphical user interface (GUI) widgets.
How does the Java API support all of these kinds of programs? With
packages of software components that provide a wide range of functionality.
The core API is included in every full implementation of the Java
platform, and it gives you the following features. The Essentials:
objects, strings, threads, numbers, input and output, data structures, system
properties, date and time, and so on.
(Layers of the Java platform: Java program, Java API, Java VM, hardware)
The API and the virtual machine insulate the Java program from hardware
dependencies. As a platform-independent environment, Java can be a bit
slower than native code. However, smart compilers, well-tuned interpreters,
and just-in-time bytecode compilers can bring Java’s performance close to
that of native code without threatening portability.
However, Java is not just for writing cute, entertaining applets for the
World Wide Web (WWW). Java is a general-purpose, high-level
programming language and a powerful software platform. Using the
generous Java API, you can write many types of programs.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an
address scheme for machines so that they can be located. The address is a 32-bit
integer which gives the IP address. This encodes a network ID and further addressing.
The network ID falls into various classes according to the size of the network address.
Network address
Class A uses 8 bits for the network address with 24 bits left over for other
addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network
addressing and class D uses all 32.
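The class of an address can be read off its first octet. The sketch below follows the standard classful ranges (0–127 class A, 128–191 class B, 192–223 class C, 224–239 class D); the sample addresses are arbitrary.

```java
// Classify an IPv4 address by its first octet, per the classful scheme:
// leading bits 0 -> A, 10 -> B, 110 -> C, 1110 -> D.
class AddressClass {
    static char classOf(String ip) {
        int first = Integer.parseInt(ip.split("\\.")[0]); // first octet
        if (first < 128) return 'A';
        if (first < 192) return 'B';
        if (first < 224) return 'C';
        if (first < 240) return 'D';
        return 'E';                                       // remaining range
    }
    public static void main(String[] args) {
        String[] samples = {"10.0.0.1", "172.16.0.1", "192.168.1.1", "224.0.0.1"};
        for (String ip : samples)
            System.out.println(ip + " -> class " + classOf(ip));
    }
}
```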
Subnet address
Host address
8 bits are finally used for host addresses within our subnet. This places a limit
of 256 machines that can be on the subnet.
Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number.
To send a message to a server, you send it to the port for that service of the host that it
is running on. This is not location transparency! Certain of these ports are "well
known".
Sockets
A socket is a data structure maintained by the system to handle network
connections. A socket is created using the call socket. It returns an integer that is like
a file descriptor.
Server Socket
A Server Socket listens for the Socket request and performs message handling
functions, file sharing, database sharing functions etc.
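A minimal sketch of the socket and server socket roles described above: a server socket on an arbitrary free loopback port accepts one connection and echoes a line back; the message text is arbitrary.

```java
import java.io.*;
import java.net.*;

// Minimal echo: a ServerSocket listens, accepts one Socket, and handles
// one message; the client connects to the server's port and reads the reply.
class EchoSketch {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0);  // 0 = pick any free port
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                out.println("echo: " + in.readLine()); // message handling
            } catch (IOException e) { e.printStackTrace(); }
        });
        t.start();

        try (Socket client = new Socket("127.0.0.1", server.getLocalPort());
             PrintWriter out = new PrintWriter(client.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine());
        }
        t.join();
        server.close();
    }
}
```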
JDBC
In an effort to set an independent database standard API for Java, Sun
Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a
generic SQL database access mechanism that provides a consistent interface to a
variety of RDBMS. This consistent interface is achieved through the use of “plug-in”
database connectivity modules, or drivers. If a database vendor wishes to have JDBC
support, he or she must provide the driver for each platform that the database and Java
run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on
ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a
variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC
drivers to market much faster than developing a completely new connectivity
solution.
JDBC Goals
Few software packages are designed without goals in mind. JDBC is one that,
because of its many goals, drove the development of the API. These goals, in
conjunction with early reviewer feedback, have finalized the JDBC class library into a
solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight
as to why certain classes and functionalities behave the way they do. The eight design
goals for JDBC are as follows:
SQL syntax varies as you move from database vendor to database vendor. In
an effort to support a wide variety of vendors, JDBC will allow any query
statement to be passed through it to the underlying database driver. This allows
the connectivity module to handle non-standard functionality in a manner that is
suitable for its users.
3. JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This
goal allows JDBC to use existing ODBC level drivers by the use of a software
interface. This interface would translate JDBC calls to ODBC and vice versa.
4. Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers
feel that they should not stray from the current design of the core Java system.
5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is no
exception. Sun felt that the design of JDBC should be very simple, allowing for
only one method of completing a task per mechanism. Allowing duplicate
functionality only serves to confuse the users of the API.
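To make the pass-through design concrete, here is a sketch of how a fact lookup might go through JDBC. The table and column names (factdb, object, fact) are assumptions loosely based on the dbo.factdb table in Chapter 6, and no live database connection is made here; the main method only prints the query that would be sent.

```java
import java.sql.*;
import java.util.*;

// Sketch: fetch the candidate facts for an object through JDBC. The SQL
// text is passed through to whichever driver (SQL Server, JDBC-ODBC
// bridge, ...) backs the Connection.
class FactDao {
    static final String QUERY = "SELECT fact FROM factdb WHERE object = ?";

    static List<String> factsFor(Connection con, String object) throws SQLException {
        List<String> facts = new ArrayList<>();
        try (PreparedStatement ps = con.prepareStatement(QUERY)) {
            ps.setString(1, object);              // bind parameter, no string concatenation
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) facts.add(rs.getString("fact"));
            }
        }
        return facts;
    }

    public static void main(String[] args) {
        // No database is assumed here; just show the query that would be sent.
        System.out.println("would execute: " + QUERY);
    }
}
```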
About JSP:
JavaServer Pages (JSP) technology enables you to mix regular, static HTML
with dynamically generated content from servlets. You simply write the regular
HTML in the normal manner, using familiar Web-page-building tools. You then
enclose the code for the dynamic parts in special tags, most of which start with <%
and end with %>. For example, here is a section of a JSP page that results in “Thanks
for ordering Core Web Programming” for a URL of
http://host/OrderConfirmation.jsp?title=Core+Web+Programming:
Thanks for ordering <I><%= request.getParameter("title") %></I>
Separating the static HTML from the dynamic content provides a number of benefits
over servlets
alone, and the approach used in JavaServer Pages offers several advantages over
competing technologies such as ASP, PHP, or ColdFusion. Section 1.4 (The
Advantages of JSP) gives some details on these advantages, but they basically boil
down to two facts: that JSP is widely supported and thus doesn’t lock you into a
particular operating system or Web server and that JSP gives you full access to servlet
and Java technology for the dynamic part, rather than requiring you to use an
unfamiliar and weaker special- purpose language. The process of making JavaServer
Pages accessible on the Web is much simpler than that for servlets. Assuming you
have a Web server that supports JSP, you give your file a .jsp extension and simply
install it in any place you could put a normal Web page: no compiling, no packages,
and no user CLASSPATH settings. However, although your personal environment
doesn’t need any special settings, the server still has to be set up with access to the
servlet and JSP class files and the Java compiler. For details, see your server’s
documentation or Section 1.5 (Installation and Setup). Although what you write often
looks more like a regular HTML file than a servlet, behind the scenes, the JSP page is
automatically converted to a normal servlet, with the static HTML simply being
printed to the output stream associated with the servlet’s service method. This
translation is normally done the first time the page is requested. To ensure that the
first real user doesn’t get a momentary delay when the JSP page is translated into a
servlet and compiled, developers can simply request the page themselves after first
installing it. Many Web servers also let you define aliases so that a URL that appears
to reference an HTML file really points to a servlet or JSP page. Depending on how
your server is set up, you can even look at the source code for servlets generated from
your JSP pages. With Tomcat 3.0, you need to change the isWorkDirPersistent
attribute from false to true in install_dir/server.xml. After that, the code can be found
in install_dir/work/port-number. With the JSWDK 1.0.1, you need to change the
workDirIsPersistent attribute from false to true in install_dir/webserver.xml. After
that, the code can be found in install_dir/work/%3Aport-number%2F. With the Java
Web Server 2.0, the default setting is to save source code for automatically generated
servlets. They can be found in install_dir/tmpdir/default/pagecompile/jsp/_JSP.
One warning about the automatic translation process is in order. If you make an error
in the dynamic portion of your JSP page, the system may not be able to properly
translate it into a servlet. If your page has such a fatal translation-time error, the server
will present an HTML error page describing the problem to the client. Internet
Explorer 5, however, typically replaces server-generated error messages with a
canned page that it considers friendlier. You will need to turn off this “feature” when
debugging JSP pages. To do so with Internet Explorer 5, go to the Tools menu, select
Internet Options, choose the Advanced tab, and make sure the “Show friendly HTTP
error messages” box is not checked.
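To make the translation concrete, here is a minimal JSP page (a sketch; the file name and text are made up). The container turns it into a servlet whose service method prints the static HTML to the output stream and evaluates the expression in place:

```jsp
<%-- hello.jsp: a minimal page used only to illustrate the translation --%>
<HTML>
<BODY>
<H1>Page Compiled On First Request</H1>
<%-- the static HTML above is simply printed by the generated servlet;
     the expression below becomes Java code in its service method --%>
Current time: <%= new java.util.Date() %>
</BODY>
</HTML>
```

Requesting this page once after installing it triggers the translation, so later visitors see no compilation delay.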
Aside from the regular HTML, there are three main types of JSP constructs
that you embed in a page: scripting elements, directives, and actions. Scripting
elements let you specify Java code that will become part of the resultant servlet,
directives let you control the overall structure of the servlet, and actions let you
specify existing components that should be used and otherwise control the behavior of
the JSP engine. To simplify the scripting elements, you have access to a number of
predefined variables, such as request in the code snippet just shown (see Section 10.5
for more details). Scripting elements are covered in this chapter, and directives and
actions are explained in the following chapters. You can also refer to the Appendix
(Servlet and JSP Quick Reference) for a thumbnail guide summarizing JSP syntax.
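The three kinds of constructs can be seen together in one short page (a sketch; the access counter is a made-up example):

```jsp
<%@ page import="java.util.Date" %>        <%-- directive: controls servlet structure --%>
<%! private int accessCount = 0; %>        <%-- declaration: a scripting element --%>
<% accessCount = accessCount + 1; %>       <%-- scriptlet: a scripting element --%>
Accesses so far: <%= accessCount %>        <%-- expression: a scripting element --%>
<jsp:include page="footer.html" flush="true" />  <%-- action: reuses an existing component --%>
```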
JSP changed dramatically from version 0.92 to version 1.0, and although these
changes are very much for the better, you should note that newer JSP pages are almost
totally incompatible with the early 0.92 JSP engines, and older JSP pages are equally
incompatible with 1.0 JSP engines. The changes from version 1.0 to 1.1 are much less
dramatic: the main additions in version 1.1 are the ability to portably define new tags
and the use of the servlet 2.2 specification for the underlying servlets. JSP 1.1 pages
that do not use custom tags or explicitly call 2.2-specific statements are compatible
with JSP 1.0 engines, and JSP 1.0 pages are totally upward compatible with JSP 1.1
engines.
The next release of SQL Server is designed to help enterprises address these
challenges. SQL Server 2005 is Microsoft’s next generation data management and
analysis solution that will deliver increased security, scalability, and availability to
enterprise data and analytical applications while making them easier to create, deploy,
and manage. Building on the strengths of SQL Server 2000, SQL Server 2005
will provide an integrated data management and analysis solution that will help
organizations of any size to:
* Build and deploy enterprise applications that are more secure, scalable, and reliable.
* Maximize the productivity of IT by reducing the complexity of creating, deploying, and managing database applications.
* Empower developers through a rich, flexible, modern development environment for creating more secure database applications.
* Share data across multiple platforms, applications, and devices to make it easier to connect internal and external systems.
* Deliver robust, integrated business intelligence solutions that help drive informed business decisions and increase productivity across your entire organization.
* Control costs without sacrificing performance, availability, or scalability.
Read on to learn more about the advancements SQL Server 2005 will deliver in three
key areas: enterprise data management, developer productivity, and business intelligence.
Developer Productivity
One of the key barriers to developer productivity has been the lack of
integrated tools for database development and debugging. SQL Server 2005 will
provide advancements that fundamentally change the way that database applications
are developed and deployed.
Enhancements for developer productivity will include:
* Improved tools. Developers will be able to utilize one development tool for Transact-SQL, XML, Multidimensional Expressions (MDX), and XML for Analysis (XML/A).
* Expanded language support. With the common language runtime (CLR) hosted in the database engine, developers will be able to choose from a variety of familiar languages to develop database applications, including Transact-SQL, Microsoft Visual Basic .NET, and Microsoft Visual C# .NET.
* XML and Web services. SQL Server 2005 will support both relational and XML data natively, so enterprises can store, manage, and analyze data in the format that best suits their needs. Support for existing and emerging open standards such as Hypertext Transfer Protocol (HTTP), XML, Simple Object Access Protocol (SOAP), XQuery, and XML Schema Definition (XSD) will also facilitate communication across extended enterprise systems.
Business Intelligence
The challenge and promise of business intelligence revolves around providing
employees with the right information, at the right time. Accomplishing this vision
demands a business intelligence solution that is comprehensive, secure, integrated
with operational systems, and available all day, every day. Microsoft will help
companies to achieve this goal with SQL Server 2005.
Business intelligence advancements will include:
* Integrated platform. SQL Server 2005 will deliver an end-to-end business intelligence platform with integrated analytics including online analytical processing (OLAP); data mining; extract, transformation, and load (ETL) tools; data warehousing; and reporting functionality.
* Improved decision making. Advancements to existing business intelligence features, such as OLAP and data mining, and the introduction of a new reporting server will provide enterprises with the ability to transform information into better business decisions at all organizational levels.
* Security and availability. Scalability, availability, and security enhancements will help to provide users with uninterrupted access to business intelligence applications and reports.
* Enterprise-wide analytical capabilities. An improved ETL tool will enable organizations to more easily integrate and analyze data from multiple heterogeneous information sources. By analyzing data across a wide array of operational systems, organizations may gain a competitive edge through a holistic understanding of their business.
Additional Information
SQL Server 2005 is part of the Windows Server System, a comprehensive and
integrated server infrastructure that simplifies the development, deployment, and
operation of a flexible business solution.
CHAPTER 8
IMPLEMENTATION
8. IMPLEMENTATION
The development stage takes as its primary input the design elements described in the
approved design document. For each design element, a set of one or more software
artifacts will be produced. Appropriate test cases will be developed for each set of
functionally related software artifacts, and an online help system will be developed to
guide users in their interactions with the software.
8.2 SOURCE CODE
<script language="javascript">
<!--
// Returns false and alerts if the field contains any non-numeric character.
function number(field)
{
var input=field.value;
var len=input.length;
var status=true;
for(var i=0;i<len;i++)
{
var chars=input.substring(i,i+1);
if(chars < "0" || chars > "9")
{
status=false;
break;
}
}
if(status==false)
{
alert("Enter the Numeric Input...");
field.focus();
return false;
}
return true;
}
// Checks the login form: username, password, and user type must be supplied.
function validate()
{
var user=document.form1.T1.value;
var pass=document.form1.T2.value;
var cate=document.form1.T3.value;
if(user.length<1)
{
alert("Enter The Username....");
document.form1.T1.focus();
return false;
}
else if(pass.length<1)
{
alert("Enter The Password....");
document.form1.T2.focus();
return false;
}
else if(cate=="Select Type")
{
alert("Select The User Type....");
document.form1.T3.focus();
return false;
}
return true;
}
// -->
</script>
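These handlers are wired to the login form roughly as follows (a sketch; the form and field names form1, T1, T2, and T3 match the code above, but the action URL is hypothetical):

```jsp
<form name="form1" action="login.jsp" method="post"
      onsubmit="return validate();">
  Username: <input type="text" name="T1">
  Password: <input type="password" name="T2">
  <select name="T3">
    <option>Select Type</option>
    <option>Admin</option>
    <option>User</option>
  </select>
  <!-- number(this) could guard a numeric field via its onblur handler -->
  <input type="submit" value="Login">
</form>
```

Because validate() returns false on any empty field, the onsubmit handler blocks the form submission until all three inputs are filled in.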
Source code for index
</style>
<![endif]-->
<script type="text/javascript">
<!--
function newImage(arg) {
if (document.images) {
rslt = new Image();
rslt.src = arg;
return rslt;
}
}
function changeImages() {
if (document.images && (preloadFlag == true)) {
for (var i=0; i<changeImages.arguments.length; i+=2) {
document[changeImages.arguments[i]].src =
changeImages.arguments[i+1];
}
}
}
function preloadImages() {
if (document.images) {
btn_services_over = newImage("images/btn_services-over.gif");
preloadFlag = true;
}
}
// -->
</script>
function forward( )
{
// Note: the scriptlet below runs on the server when the page is rendered,
// not when forward() is invoked in the browser.
<% String text=request.getParameter("textfield");
String search1=request.getParameter("search");
request.setAttribute("text",text);
request.setAttribute("search",search1);%>
document.location="response.jsp";
}
function forward1( )
{
document.location="Bestone.jsp";
}
<%@ page import="java.sql.*,java.io.*" %>
<%
// Look up the directory registered for the search word and list the files in it.
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
Connection con=DriverManager.getConnection("jdbc:odbc:truth","sa","");
Statement st=con.createStatement();
String searchword=(String)application.getAttribute("text");
String method=(String)application.getAttribute("method");
String user=(String)session.getAttribute("uname");
ResultSet rs=st.executeQuery("select location from MainDb where filename='"+searchword+"'");
String location="";
while(rs.next())
{
location=rs.getString(1);
out.println(location);
}
String filename="";
String filenames[]=null;
File path=new File(location);
File files[]=path.listFiles();
if(files!=null)
{
for(int i=0;i<files.length;i++)
{
try
{
filename=files[i].toString();
// Convert the path separators so the bare file name can be split out.
String names=filename.replace('\\','&');
filenames=names.split("&");
out.println(filenames[filenames.length-1]+"<br>");
out.println(filename+"<br>");
}
catch (Exception ee)
{
ee.printStackTrace();
}
}
}
%>
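The query above splices the search word directly into the SQL string, which breaks on embedded quotes and invites SQL injection. A parameterized variant (a sketch, assuming the same truth DSN and MainDb table) would be:

```jsp
<%
// PreparedStatement binds the search word as a parameter instead of
// concatenating it into the SQL text.
PreparedStatement ps = con.prepareStatement(
    "select location from MainDb where filename = ?");
ps.setString(1, searchword);
ResultSet prs = ps.executeQuery();
while (prs.next())
{
    out.println(prs.getString(1));
}
ps.close();
%>
```

Binding the value also lets the driver reuse the compiled statement across requests.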
<%!
String resultfiles[];
int retrive;
String factlength;
%>
<%
String result="";
String valu=request.getParameter("value");
String searchword="";
factlength=(String)application.getAttribute("fact");
int fac=Integer.parseInt(factlength);
retrive=fac-2; // assign the field declared above instead of shadowing it
if(valu.equalsIgnoreCase("redirect"))
{
searchword=(String)application.getAttribute("text");
//out.println(searchword);
}
else
{
searchword=(String)application.getAttribute("text");
}
//String resultfile=request.getParameter("result");
//resultfiles=resultfile.split("$");
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
Connection con=DriverManager.getConnection("jdbc:odbc:truth","sa","");
Statement st=con.createStatement();
String method=(String)request.getAttribute("method");
String user=(String)session.getAttribute("uname");
ResultSet rs=st.executeQuery("select location from support where
about='"+searchword+"' order by support desc");
String location="";
while(rs.next())
{
result+=rs.getString(1)+"#";
}
//out.println(result);
/* ResultSet rs=st.executeQuery
String filename="";
String filenames[]=null;
File path=new File(location);
File files[]=path.listFiles();
if(files!=null)
{
for(int i=0;i<files.length;i++)
{
if(files.length==0)
{
out.println("");
}
else
{
// System.out.println("compiling");
try
{
filename=files[i].toString();
//String names=filename.replace('\\','&');
//filenames=names.split("&");
//out.println(filenames.length+"<br>");
//Thread.sleep(1000);
// out.println(filenames[filenames.length-1]+"<br>");
result+=filename+"#";
}
catch (Exception ee)
{
ee.printStackTrace();
}
}
}
}
else
{
//proxy1.jLabel3.setText("completed.....");
}*/
%>
8.3 OUTPUT SCREENS
CHAPTER 9
TESTING & VALIDATION
9. TESTING AND VALIDATION
9.1 INTRODUCTION
The next step after coding is testing. Test case design focuses on a set of techniques
for the creation of test cases that meet the overall testing objectives. The main testing
objective is to execute the program with the intent of finding errors.
The testing can be performed in two ways:
* White box testing
* Black box testing
WHITE BOX TESTING:
White Box Testing examines the internal logic of the program; test cases are derived
from the code itself so that every independent path, branch, and loop is exercised at
least once.
BLACK BOX TESTING:
Black Box Testing focuses on the functional requirements of the software. Black Box
Testing attempts to find errors in the following categories:
* Incorrect or missing functions
* Interface errors
* Errors in data structures or external database access
* Performance errors
* Initialization and termination errors
UNIT TESTING:
Unit testing verifies each module of the software in isolation, confirming that every
function behaves as specified before the modules are combined.
INTEGRATION TESTING:
Integration testing combines the unit-tested modules and exercises their interfaces
and interactions as a complete subsystem.
TESTING PRINCIPLES:
All tests should be traceable to customer requirements, should be planned well before
testing begins, and should proceed from the small scale (individual units) to the large
scale (the integrated system).
TESTING STRATEGIES
A strategy for software testing integrates software test cases into a series of
well-planned steps that result in the successful construction of software. Software
testing is part of a broader topic referred to as Verification and Validation.
Verification refers to the set of activities that ensure that the software correctly
implements a specific function; Validation refers to the set of activities that ensure
that the software that has been built is traceable to customer requirements.
SNO  Test Type                    Test Case                         Status     Expected Result
     Operational/Functional       New entry for a user              success    Record in the database with new user details
4    Unit/Operational/Functional  Search for web pages with         success    It displays all the web pages that contain
                                  normal search                                the given text
5    Unit/Operational/Functional  Search for web pages with         success    It displays all the web pages based on
                                  Page Ranking search                          pages which are visited many times
6    Unit/Operational/Functional  Search for web pages with         success    It displays all the web pages which are
                                  Truthfinder                                  particularly related to the given text
7    Unit/Operational/Functional  Search for web pages when no      unsuccess  It displays ErrorPage.jsp
                                  page contains the given text
CHAPTER 10
CONCLUSION
10. CONCLUSION
In real time, this project shows the best results for every search object.
The admin user can view the requirements of a user through a notification.
A user can download and upload data for his or her requirements after registration.
Larger data sets can be loaded into the database.
CHAPTER 11
REFERENCES
11. REFERENCES
[1] B. Amento, L.G. Terveen, and W.C. Hill, “Does ‘Authority’ Mean Quality?
Predicting Expert Quality Ratings of Web Documents,” Proc. ACM SIGIR ’00, July
2000.
[2] A. Borodin, G.O. Roberts, J.S. Rosenthal, and P. Tsaparas, “Link Analysis
Ranking: Algorithms, Theory, and Experiments,” ACM Trans. Internet Technology,
vol. 5, no. 1, pp. 231-297, 2005.
[3] R. Guha, R. Kumar, P. Raghavan, and A. Tomkins, “Propagation of Trust and
Distrust,” Proc. 13th Int’l Conf. World Wide Web (WWW), 2004.
[4] G. Jeh and J. Widom, “SimRank: A Measure of Structural-Context Similarity,”
Proc. ACM SIGKDD ’02, July 2002.
[5] Logistic Equation from Wolfram MathWorld,
http://mathworld.wolfram.com/LogisticEquation.html, 2008.
[6] T. Mandl, “Implementation and Evaluation of a Quality-Based Search Engine,”
Proc. 17th ACM Conf. Hypertext and Hypermedia, Aug. 2006.
[7] Princeton Survey Research Associates International, “Leap of faith: Using the
Internet Despite the Dangers,” Results of a Nat’l Survey of Internet Users for
Consumer Reports WebWatch, Oct. 2005.
[8] Sigmoid Function from Wolfram MathWorld,
http://mathworld.wolfram.com/SigmoidFunction.html, 2008.
REFERRED
http://java.sun.com
http://www.java2s.com
http://www.w3schools.com
http://www.microsoft.com/sql/2005/