
Terminology

Search Engine
Web crawler
Search Engine
A Web search engine is a tool or program designed to
search for information on the WWW on the basis of
specified keywords. It returns a list of the documents
in which the keywords were found.
The search results are usually presented as a list and
are commonly called hits. The results may consist of
web pages, images and other types of files.
Web Crawler
A Web crawler is a computer program that browses the
WWW in a methodical, automated manner.

Other terms for Web crawlers are ants, automatic indexers,
bots, worms, Web spiders, Web robots and Web scutters.

This process is called Web crawling or spidering.
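
The "methodical, automated" browsing can be sketched as a breadth-first walk over links. This is only a minimal illustration: the `TOY_WEB` dictionary and its `*.example` addresses are made up for the example and stand in for real pages, so no network access is involved.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the pages it links to.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}

def crawl(seed, web):
    """Visit every page reachable from seed, breadth first, each only once."""
    visited = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in web.get(url, []):
            if link not in seen:      # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited

print(crawl("a.example", TOY_WEB))
# → ['a.example', 'b.example', 'c.example', 'd.example']
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other.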


Why a search engine?
“The Internet holds a huge amount of textual
information, yet it was almost impossible to find anything”

Thus there is a need for a program or tool that
keeps track of all this information, and this is what the
SEARCH ENGINE provides.
Search Engine Working
A search engine operates in the following order:

1. Web crawling: document gathering step
2. Indexing: document arrangement step
3. Information extraction and storing in the DB
4. User request: request for a specific “keyword”
5. Searching: query building and execution
6. Response to the user
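
The steps above can be sketched in a few lines. This is a toy illustration, not how any particular engine is built: the `DOCS` dictionary is a made-up stand-in for crawled and stored pages, and the function names are chosen for the example.

```python
from collections import defaultdict

# Hypothetical documents standing in for crawled, stored pages (steps 1 and 3).
DOCS = {
    1: "web crawler gathers documents",
    2: "search engine builds an index",
    3: "the index maps each keyword to documents",
}

def build_index(docs):
    """Step 2: map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, keyword):
    """Steps 4 to 6: look up the user's keyword and return the matching ids."""
    return sorted(index.get(keyword.lower(), set()))

index = build_index(DOCS)
print(search(index, "index"))      # → [2, 3]
print(search(index, "documents"))  # → [1, 3]
```

Building the index once means a query is a dictionary lookup instead of a scan over every document.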
History Of Search Engine

Archie
Veronica and Jughead
Excite
Yahoo
Lycos
Alta Vista
Keyword searching
Most common form of text search on the Web
The “keyword” specified by the user is searched for
Keywords tell the user something about the subject and
content of a page.
It's up to the search engine to determine which words count as keywords
They may be the words in the title of a document or in
its first lines, which are used for “MATCHING”
purposes
Keyword searching
Problems with keyword searching
Same spelled KEYWORD
Stemming Problem
Synonym Problem
Stemming Problem
Search Engine XYZ: [ BIG_ ] GO!

Should I also check for “BIGGER”, “BIGGEST” ...?
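
One common answer is to reduce every word to a stem before matching. The sketch below is a deliberately crude suffix stripper written for this slide; the `SUFFIXES` list is an assumption, and real stemmers (the Porter stemmer, for example) use far more careful rules.

```python
SUFFIXES = ("est", "er", "s")  # assumed, deliberately tiny suffix list

def naive_stem(word):
    """Strip one known suffix, then undo a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # "bigg" -> "big": drop a repeated trailing consonant.
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word

def same_stem(query, word):
    """True when query and word reduce to the same stem."""
    return naive_stem(query) == naive_stem(word)

print(naive_stem("BIGGER"), naive_stem("Biggest"), naive_stem("big"))
# → big big big
```

The rules are crude on purpose: a word such as "tall" would be cut to "tal", which shows why production stemmers need exception handling.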
Synonym Problem
Search Engine XYZ: [ BIG_ ] GO!

I am not going to return the documents that contain only a
synonym of “heart”, such as “CARDIAC”
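
A possible workaround is to expand the keyword with its synonyms before matching. The following is a small sketch under stated assumptions: the `SYNONYMS` table and the three documents are invented for the example, and a real engine would need a much larger thesaurus.

```python
# Hypothetical synonym table and documents, for illustration only.
SYNONYMS = {"heart": {"cardiac"}, "cardiac": {"heart"}}

DOCS = {
    1: "heart surgery recovery",
    2: "cardiac arrest first aid",
    3: "hard drive repair",
}

def expand(keyword):
    """Return the keyword together with any known synonyms."""
    return {keyword} | SYNONYMS.get(keyword, set())

def search(keyword, docs, use_synonyms):
    """Return ids of documents containing the keyword (or a synonym)."""
    terms = expand(keyword.lower()) if use_synonyms else {keyword.lower()}
    return sorted(
        doc_id for doc_id, text in docs.items()
        if terms & set(text.lower().split())
    )

print(search("heart", DOCS, use_synonyms=False))  # → [1]
print(search("heart", DOCS, use_synonyms=True))   # → [1, 2]
```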
Same spelled KEYWORD
Search Engine XYZ: [ BIG_ ] GO!

It will return the following:
Hard drive   100 KB   www.Hddve
Hard Exam    115 KB   www.Hdex.com
Hard stone   105 KB   www.Hrdstom.cm.com
Most of these are IRRELEVANT to the user.
CASE SENSITIVITY is a further problem.
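
The case-sensitivity part of the problem is usually handled by folding both the keyword and the text to one case before comparing. A minimal sketch, with function names chosen for this example:

```python
def matches_exact(keyword, text):
    """Case-sensitive match: 'hard' does not match 'Hard'."""
    return keyword in text.split()

def matches_folded(keyword, text):
    """Case-insensitive match: fold both sides before comparing."""
    return keyword.casefold() in (word.casefold() for word in text.split())

print(matches_exact("hard", "Hard drive"))   # → False
print(matches_folded("hard", "Hard drive"))  # → True
```

Folding fixes the case mismatch only; it does nothing about the same word having unrelated meanings.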
Refined Searching
ADVANCED SEARCH
The “criteria of searching” are given by the user
Uses BOOLEAN operators
Allows the user to
search for an entire phrase,
search within fields,
specify the form in which the results should appear,
restrict the search to certain parts of the Internet (e.g.,
Usenet or the Web)
BOOLEAN operators
Boolean AND
FCC AND WIRELESS AND COMMUNICATION

Boolean OR
FCC OR WIRELESS OR COMMUNICATION
BOOLEAN operators
Boolean AND NOT

Boolean +/-: + means AND, - means AND NOT
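
The three operators map directly onto set operations over the documents that contain each term. The sketch below assumes a tiny made-up document collection; `postings` is a helper name chosen for the example.

```python
# Hypothetical documents, for illustration only.
DOCS = {
    1: "fcc wireless communication rules",
    2: "wireless communication basics",
    3: "fcc spectrum auction",
    4: "cooking basics",
}

def postings(term):
    """Set of ids of the documents that contain the term."""
    return {doc_id for doc_id, text in DOCS.items() if term in text.split()}

fcc = postings("fcc")
wireless = postings("wireless")
communication = postings("communication")

# AND is intersection, OR is union, AND NOT is set difference.
print(sorted(fcc & wireless & communication))  # → [1]
print(sorted(fcc | wireless | communication))  # → [1, 2, 3]
print(sorted(fcc - wireless))                  # → [3]
```

AND narrows the result, OR widens it, and AND NOT removes documents that mention the unwanted term.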
Phrase Search

It will search for the entire “Phrase”, for example
“San Francisco Art”
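
A phrase match requires the words to appear next to each other and in order, which is stricter than finding each word somewhere in the document. A minimal sketch, with the sample sentences invented for the example:

```python
def contains_phrase(text, phrase):
    """True when the words of phrase occur consecutively, in order, in text."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n words across the text and compare.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

print(contains_phrase("New San Francisco Art museum opens", "San Francisco Art"))
# → True
print(contains_phrase("Art galleries in San Francisco", "San Francisco Art"))
# → False
```

The second text contains all three words, but not as one phrase, so it is rejected.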
Field Search
Most effective technique for narrowing results
A web page has a number of fields, such as title,
domain, host, URL, and link. Search effectiveness
increases as you combine field searches with phrase
searches and Boolean logic.

+title:“Thailand” -image will return the indicated result
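
A field search can be sketched by storing each page as a record and matching only the chosen field. Everything here is an assumption made for illustration: the `PAGES` records, their `*.example` addresses, and the reading of the query as "Thailand in the title, and the word image nowhere on the page".

```python
# Hypothetical page records; every field is plain text.
PAGES = [
    {"title": "Thailand travel guide", "url": "travel.example/thailand",
     "body": "beaches and temples"},
    {"title": "Thailand image gallery", "url": "photos.example/thailand",
     "body": "image collection"},
    {"title": "Asia overview", "url": "asia.example",
     "body": "Thailand is mentioned here"},
]

def field_search(pages, field, required, excluded=None):
    """Return URLs whose given field has `required` and no field has `excluded`."""
    hits = []
    for page in pages:
        if required.lower() not in page[field].lower().split():
            continue
        if excluded is not None:
            all_words = " ".join(page.values()).lower().split()
            if excluded.lower() in all_words:
                continue
        hits.append(page["url"])
    return hits

print(field_search(PAGES, "title", "Thailand"))
# → ['travel.example/thailand', 'photos.example/thailand']
print(field_search(PAGES, "title", "Thailand", excluded="image"))
# → ['travel.example/thailand']
```

The third page mentions Thailand only in its body, so a title search skips it.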
Concept-based searching
Semantic Search
Concept-based search systems try to determine what the user
means
Returns hits on documents that are "about" the
subject/theme the user is exploring, even if the words in the
document don't precisely match the words entered
into the query.
Concept-based searching
Builds clustering systems and records how frequently
terms occur in each document
The higher the frequency, the higher the ranking of
the document
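
The frequency rule can be sketched as counting how often the term occurs in each document and sorting by that count. This covers only the ranking step stated above, not the clustering; the `DOCS` collection is invented for the example.

```python
from collections import Counter

# Hypothetical documents, for illustration only.
DOCS = {
    "doc_a": "heart disease and heart surgery",
    "doc_b": "a song about a broken heart",
    "doc_c": "hard drive prices",
}

def rank_by_frequency(term, docs):
    """Return (document, count) pairs, highest count first."""
    term = term.lower()
    counts = {
        name: Counter(text.lower().split())[term]
        for name, text in docs.items()
    }
    hits = [(name, n) for name, n in counts.items() if n > 0]
    # Sort by descending count, then by name to keep ties stable.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))

print(rank_by_frequency("heart", DOCS))
# → [('doc_a', 2), ('doc_b', 1)]
```

Raw counts favour long documents, so this should be read as the simplest possible form of the idea.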

Example
Concept-based searching
Search Engine XYZ: GO!

It will return the documents related to medical/health science


Concept-based searching
Search Engine XYZ: GO!

A concept-oriented search engine returns hits on the
subject of romance.
Popularity
The chart shows the percentage of online searches done by
US home and work web surfers in July 2010 that were
performed at a particular search engine
Thank You!!!!!!!!!!!
Happy Searching……..
