
Vinod Gupta School of Management, IIT Kharagpur

MIS Term Paper

The Future of Web Search

Submitted To: Dr. Prithwis Mukerjee

Submitted By: Amod Kumar Gupta (10BM60007)

Abstract

The internet was made available for public use in the mid-1990s. Since then it has changed our lives
in a way few other things have. The internet holds nearly 487 billion gigabytes (GB) of data. A search
engine helps us find what we want in this endless sea of data and keeps us from getting lost in it, so
search engines are becoming increasingly important in the internet world. This paper covers current
search engine technologies, the problems with them, and the improvements that could lead to better
web search engines.

Introduction

There were around 1.97 billion internet users as of 30 June 2010. The internet is now part of
virtually every aspect of modern life. It consists of a vast range of information resources and
services, and somewhere within them lies the information we are interested in. The trick is to find
it. This is where search engines play a critical role. A web search engine is designed to search for
information and return a list of results, which may consist of web pages, images, videos and other
types of files.

Archie was one of the first search engines. It was created in 1990 by Alan Emtage, Bill
Heelan and J. Peter Deutsch at McGill University in Montreal. Archie worked very simply
compared to current search engines: it downloaded the directory listings of all the files located on
public FTP sites and created a searchable database of file names. WebCrawler, which came out in
1994, allowed users to search for any word in any web page, which has now become the standard.
Lycos was launched in 1994 and became a major success. Yahoo! allowed searches on its web
directory only, rather than on all web pages like other search engines; users could also browse the
directory instead of doing a keyword-based search. Search engines attracted a lot of investment in the
internet investing frenzy of the late 1990s, and several companies received record gains during
their initial public offerings. Some search engines, such as Northern Light, have enterprise-only
editions. Many search engine companies were also caught up in the dot-com bubble, which ended in
their demise.

Around 2000, the Google search engine rose to prominence. The main difference between
Google and other search engines was that Google focused on search and did not sacrifice the quality
of web search to make quick money through advertising. Its PageRank algorithm ranks web pages
based on the number and rank of the pages that link to them, on the premise that good or desirable
pages are linked to more than others.
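
The link-counting idea behind PageRank can be illustrated with a small sketch. This is not Google's production algorithm; it is a minimal power-iteration version, and the damping factor of 0.85, the iteration count and the four-page toy web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not values from the paper.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: pages "a", "b" and "d" all link to "c"; "c" links to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

In this toy web, page "c" ends up with the highest rank because three pages link to it, and "a" outranks "b" and "d" because its single incoming link comes from the highly ranked "c".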

Working off comScore figures from December 2009 for worldwide search queries, we have:

Google: 88 billion per month

Twitter: 19 billion per month

Yahoo: 9.4 billion per month

Bing: 4.1 billion per month

[Figure: bar chart of search queries (billion per month) for Google, Twitter, Yahoo and Bing]

How web search engines work

A search engine operates in the following order:

• Web crawling

• Indexing

• Searching

First, the search engine retrieves web pages with the help of software called a web
crawler, which follows every link on a site. The contents of each page are then processed and
indexed. The index consists of data about web pages, stored in an index database, and allows the
information to be retrieved as quickly as possible. A query can be a single word or a group of words.

When a user enters a query into a search engine, the engine first looks in its index and
provides a listing of the best-matching web pages according to its criteria. Most search engines
support the Boolean operators AND, OR and NOT to further specify the search query. The engine
looks for the words or phrases exactly as entered. The usefulness of a search engine depends on
the relevance of the results it returns. While there may be millions of web pages that include a
particular word or phrase, some pages may be more relevant than others. Most search engines
employ methods to rank the results so that the "best" results come first. The decision of which pages
are the best matches, and in what order the results should be shown, varies widely from one engine to
another.
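
The indexing and searching steps described above can be sketched with a tiny inverted index. This is a simplified illustration rather than how any particular engine works: the crawling step is replaced by a hand-written dictionary of pages, the URLs are made up, and the function names (`tokenize`, `build_index`, `search`) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: each word maps to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, all_urls, must=(), any_of=(), must_not=()):
    """Boolean search: AND over must, OR over any_of, NOT over must_not."""
    results = set(all_urls)
    for word in must:
        results &= index.get(word.lower(), set())
    if any_of:
        either = set()
        for word in any_of:
            either |= index.get(word.lower(), set())
        results &= either
    for word in must_not:
        results -= index.get(word.lower(), set())
    return sorted(results)


# Stand-in for crawled pages (hypothetical URLs and text).
pages = {
    "example.org/a": "Web crawlers follow every link on a site",
    "example.org/b": "An index stores data about web pages",
    "example.org/c": "Crawlers and the index work together",
}
index = build_index(pages)

web_pages = search(index, pages, must=["web"])
index_not_web = search(index, pages, must=["index"], must_not=["web"])
crawlers_and_index = search(index, pages, must=["crawlers", "index"])
```

Because the index maps words directly to pages, answering a query takes a few set operations instead of a scan over every page. Ranking the matches, which the sketch leaves out, is where engines differ most.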

Need of the hour

The main problem with current search engines is the quality of the search results. When
we enter a search query, what we get is a million guesses rather than one correct answer. As search
engine technology evolves, it should become possible to make the search results more relevant and
useful for the user. Search will need to evolve in many ways to meet user needs easily, including the
challenges of mobility, modes, media, personalization, location, socialization and language. What
search can achieve in the future is very exciting.

Let's look at the various dimensions along which current search engines need to evolve:

Personalization
Search engines of the future will be able to understand more about the user. The amount of personal
information disclosed will be at the sole discretion of the user, and it will pose little threat as
long as the privacy of that information is maintained. The user information will allow search
engines to give better search results. Knowledge of the user's location, and of what the user already
knows or has learned earlier, can help the engine fully understand the user's preferences. Access to
the user's emails and chat data can also be used to understand the user and his context.

Location
User location is a very useful piece of information. Location is relevant to a lot of
searches; user location will help understand the context of the search query in a
better manner, increasing the relevance and ease of search.

Social
The social circle of the user, consisting of his online friends and contacts, will help the search
engine discover relevant content. Content from the user's friends and social contacts is likely to be
more relevant to him than content from strangers. Analysis of the user's social graph can be used to
further refine a query or disambiguate it.

Language
The information on the internet is in many languages. There are cases where an answer exists, but
not in a language we know. Translation can solve this problem: the web search engine will
search for the information, translate it and bring it back in the language that we want.

Media
Instead of offering text search only, search results must include images, videos, news, books, and
maps/local information. Pictures, video and audio can be searched based on their actual content, as
analysed by the search engine, rather than on their labels, which is the case now. The best media can
then be chosen according to the query and the corresponding results displayed to the user.

The Technologies being Developed

What are the technologies that have the potential to revolutionize the way our search engines
work? At the rate at which progress is being made, the search results we receive now will look
archaic within a few years.

Artificial Intelligence

Artificial intelligence (AI) is one of the most active areas in the computer world right now,
particularly because of the increased processing speeds and large storage that are possible today.
Artificial intelligence can be used to attain semantic search by extracting specific facts, drawing
inferences and organizing those facts based on a few key words. Another application of AI could be
natural-language processing by computer, allowing the search engine to read text and understand the
meaning of that text. Other techniques in AI such as natural language synthesis, object recognition
and statistical machine learning will change the way we search.

Watson:

Watson is a supercomputer currently being developed by IBM. It is expected to be the world's
most advanced "question answering" machine, able to understand a question posed in natural
language and respond with a precise, factual answer. It would allow machines to converse more
naturally with people, letting them ask questions instead of typing keywords. IBM plans to sell
versions of Watson to companies in the next year or two. It is expected to help take decisions
quickly, with fewer errors, based on analysis of all the data available, and to answer questions
faster and more accurately than most human beings.

Image search

Image search includes object recognition in images, such as face detection and product detection.
Image search engines today rely on keywords, or text linked to an image, in order to perform a
web search. This can be unreliable if the images lack sufficient descriptions. A new search engine,
Riya, looks inside the image to extract information about it using artificial intelligence. Each image
is represented by 6,000 numbers, and the search engine uses AI to match one visual signature to
another.
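
Matching one visual signature to another can be pictured as comparing vectors of numbers. The sketch below is only an illustration of that idea: the paper does not say how Riya compares signatures, so cosine similarity is an assumption, and the four-number signatures and file names are made up (a real signature would have thousands of values).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, signatures):
    """Return the name of the stored image whose signature is closest to the query."""
    return max(signatures, key=lambda name: cosine_similarity(query, signatures[name]))


# Hypothetical signatures, shortened to four numbers each for readability.
signatures = {
    "beach.jpg": [0.9, 0.8, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.2, 0.1],
    "city.jpg": [0.2, 0.1, 0.9, 0.8],
}
query_signature = [0.8, 0.7, 0.2, 0.1]
match = most_similar(query_signature, signatures)
```

Here the query signature is closest to "beach.jpg", so that image would be returned first even if none of the images carried a text label.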

Voice search

Voice search will allow us to talk directly to a search engine and ask our queries aloud.
The search engine will then process the query and give back the result. One example of a voice
search engine is TalkTalk. TalkTalk gives a more accurate search result by interacting with the user to
understand the context and remove any ambiguity. It also evaluates and stores all the user-given
replies and discussions, to give even more precise answers.

Metasearch engine

The concept of metasearch engines is completely different from that of conventional search engines.
A metasearch engine is a search tool that sends the user's query to several other search engines and
then aggregates the results into a single list. The thinking behind this concept is that the internet
is too large for any one search engine to index it all, and that better search results can be obtained
by combining the results from several search engines.
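
The aggregation step can be sketched as follows. The paper does not say how a metasearch engine merges its lists, so this example assumes reciprocal rank fusion, one simple and commonly used merging method; the constant `k=60`, the engine names and the result URLs are all illustrative.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Merge ranked result lists with reciprocal rank fusion.

    Each URL scores 1 / (k + position) for every list it appears in,
    so pages ranked highly by several engines rise to the top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for position, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + position)
    # Sort by descending score; break ties alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical ranked results returned by three underlying engines.
engine_a = ["x.example", "y.example", "z.example"]
engine_b = ["y.example", "x.example", "w.example"]
engine_c = ["y.example", "z.example"]

merged = merge_results([engine_a, engine_b, engine_c])
```

In this example "y.example" comes first because all three engines return it near the top, while "w.example", found by only one engine, comes last.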

Some of the current innovative search engines

Grokker is a search engine that offers a better interface by grouping search results
graphically, improving the way search results are displayed. Eurekster is a search engine that uses
social networking elements to provide results that can be filtered based upon what members of
your social network are searching for. Some of the other prominent ones are:

Viewzi

Viewzi provides users with various visual options for viewing their search results. It allows the
search results to be seen by category, which can help the user find the information faster.

SearchMe

SearchMe offers an advanced and intuitive interface. The results are displayed as a gallery of images
that allows the user to see the result pages without having to click-through. It also gives the users
the option to create stacks, or bundles of web pages saved for later.

Custom Search Engines

An example of a custom search engine is Rollyo. Rollyo allows the user to create his own custom
search engine. Users can specify the sites they want the search engine to search for their
query. One particular use could be to search within one's bookmark list. A custom search engine can
be private or public, i.e. it can also be shared with others. Such engines are unique and valuable
search engines of the future. They can be used to filter websites according to our needs and interests.
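
The idea of restricting a search to a user-chosen list of sites can be sketched as a filter wrapped around an ordinary search function. This is not Rollyo's API: `make_custom_search`, `fake_search` and the sample URLs are hypothetical, and a real engine would more likely restrict the query itself than filter the results afterwards.

```python
from urllib.parse import urlparse


def make_custom_search(allowed_sites, search_fn):
    """Return a search function that keeps only results from allowed_sites.

    search_fn stands in for any general search engine: it takes a query
    string and returns a list of result URLs.
    """
    allowed = {site.lower() for site in allowed_sites}

    def custom_search(query):
        return [
            url
            for url in search_fn(query)
            if (urlparse(url).hostname or "").lower() in allowed
        ]

    return custom_search


def fake_search(query):
    """Hypothetical stand-in for a general-purpose search engine."""
    return [
        "http://en.wikipedia.org/wiki/Web_search_engine",
        "http://spam.example/search",
        "http://googleblog.blogspot.com/2008/09/future-of-search.html",
    ]


# A custom engine limited to two "bookmarked" sites.
bookmark_search = make_custom_search(
    ["en.wikipedia.org", "googleblog.blogspot.com"], fake_search
)
results = bookmark_search("future of search")
```

Sharing such an engine publicly would amount to sharing the list of allowed sites, which is the only state the sketch keeps.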

All of these are very interesting and innovative. But the future of search lies not in the hands
of these small companies but with large companies such as Google and Microsoft, simply because
they have better resources and can easily surpass other search engines. It will be very difficult for
the small companies to make a big impact.

Economic aspect

The future of search looks very exciting. Search engine technology is still nowhere
close to where it can be. Many major changes are still possible that would completely change the
face of web search. Its growth depends upon how much information and privacy the average user is
willing to give up. There are more than 8 million distinct websites and billions of individual web
pages, so finding the required information is becoming increasingly challenging. Providers of
information and services know that their website is a key component of their business and that, in a
crowded information marketplace, it is important that searchers are able to find it using search
engines.

Search engine advertising has become a very strong business and prospects for continued
growth are strong. “Web search now represents a significant portion of Web activity. Google
searches average 250 million searches per day, and the total daily number of Web searches is
estimated at well over 600 million. At least a portion of searching is for products or services that
the searcher will eventually purchase. Research has shown that higher-income users spend
more time on the Internet and buy more online. This marketplace of high-income earners is
intensely attractive to marketers and much harder to isolate in traditional media such as TV or
magazines." - Rita Vine (http://findarticles.com/p/articles/mi_m0FWE/is_2_8/ai_114010257/)

Brand advertising works on the Web. Initially it was used as an alternative advertising medium
by only a few early adopters who placed ads on search engine pages. But now many advertisers,
including small businesses, are using search engine pages for advertising. Thus the commercial
search engines are in the advertising business. They earn their revenue mainly by delivering relevant
advertising using a variety of means, but principally by selling search keywords to purchasers.
However, Google remains the only search engine which keeps paid results out of its main listings.

Commercial search engines require traffic and relevance to ensure ad placement success.
Traffic represents the number of Web users visiting a search site. There must be high traffic in order
to maximize the probability that some of that traffic will turn into a revenue-generating
activity. Relevance represents the capacity of the search engine to deliver meaningful results that
satisfy the user's keyword query, and is determined by relevance algorithms. It is important that the
ad is relevant to the search query; otherwise even heavy traffic may not lead to a single
revenue-generating activity. Relevance algorithms vary across search engines and are regularly
tweaked in order to improve the user experience. When the search engines deliver ads to search
results pages, advertisers pay fees to the search engine for every ad impression that is delivered.

The search engine advertising process starts with keyword buying. The advertiser purchases
or leases keywords that he believes searchers will use when searching for specific products or
services. This enables the ad buyer to display a URL link when the searcher enters one or more of the
leased keywords into the search engine. Contracts may be based on a time period, or they may
stipulate the number of impressions that will be delivered. After the keyword has been purchased,
there are two options:

In paid inclusion programs, search engines and their ad-feed partners guarantee that their
search engine will list pages from the advertiser's website in its index. They do not, however,
guarantee a high rank.

In paid placement programs, a link to the advertiser's URL will be delivered in the search
results on a matched keyword or keywords, and the rank of the link can also be bought. The
better the rank, the higher the price. The location of the delivered link generally governs the fees,
so advertisers pay more to be placed higher up the page in the search results.

How Paid Listings Affect Search Results

All the major search engines have, to a greater or lesser extent, embedded paid listings in
their main search results page, with the exception of Google, which separates the ad links completely
from its main search results. Embedding paid listings generally degrades the search results returned
by the search engine. Ad link results are generally separated from algorithmically generated results
and are accompanied by headers such as "Partner Sites" or "Sponsored Links". The more commercial
the search keywords, the more likely the search is to produce paid listings.

Just like in case of traditional advertising, persistent viewing of paid listings inevitably creates
greater awareness of those paid listings and their brands. With greater awareness comes the
likelihood that those who create Web pages will link to those paid listings simply because they have
seen them many times and can remember them. All the search engines have the cumulative effect of
preferring what is popular. Thus the reach of paid placement extends even to pure search tools like
Google that rely on a link analysis algorithm for ranking. Moreover, as a larger number of popular sites
climb higher in search results, many excellent informational resources crawl even further down the list
of search results and entirely off the searcher's radar.

On the ethical front of paid listings in search results, Google has brilliantly established itself as
a trusted search tool. It plays both the relevance and monetization sides of Web search in an inspired
way. It draws users to its search tool through finely tuned relevance and the promise of pure search
results, yet it is one of the largest ad agencies on the Web.

Future Trends:

The web search engines pay a large percentage of their revenues to other sites that use
However, this makes sense only if multiple search engines provide equivalent search quality, so that
productivity remains the same no matter what search engine they use. But this seems to be a
realistic assumption, since search engines can no longer afford to ignore search quality.

The Current Players

Google Search

Anyone and everyone who knows about the internet knows about Google. Today Google is the
most popular search engine on earth. Google search was originally developed by Larry
Page and Sergey Brin in 1997. For a search engine, the Web consists of a body of words on billions
of pages and the hyperlinks that connect those pages. Google was able to link those words
efficiently, measuring relevance by the appearance of words on a page and the number of hyperlinks
pointing to that page. Google Web Search is a web search engine owned by Google Inc. Google
receives several hundred million queries each day through its various services.

Google's success was largely due to the PageRank algorithm, which helps rank the web pages that
match a given search string. Previous keyword-based methods, used by other search engines, would
rank pages by how often the search terms occurred in the page, or how strongly associated the
search terms were within each resulting page.

Google search provides at least 22 special features beyond the original word-search
capability. These include synonyms, weather forecasts, time zones, stock quotes, maps, earthquake
data, movie showtimes, airports, home listings, and sports scores. There are special features for
numbers, including ranges, prices, temperatures, money/unit conversions, calculations, package
tracking, patents, area codes, and language translation of displayed pages.

Bing search engine

Bing is a web search engine from Microsoft. Bing was unveiled by Microsoft CEO Steve
Ballmer on May 28, 2009 in San Diego. As of October 2010, Bing is the 4th largest search engine by
query volume, at 3.25%, after its competitors Google at 83.34%, Yahoo at 6.32% and Baidu at 4.96%,
according to Net Applications.

Bing has innovative features such as:

• Image search with infinite scrolling

• A myriad of filtering options

• Video search preview

• ClearFlow, a mapping feature that offers alternative routes when there is heavy traffic

• Very comprehensive local search

• Instant answers

Ask.com

Ask is a search engine which was founded in 1996 by Garrett Gruener and David
Warthen in Berkeley, California. Three venture capital firms, Highland Capital Partners, Institutional
Venture Partners, and The RODA Group, were early investors. Ask.com is currently owned
by InterActiveCorp, which trades under the NASDAQ symbol IACI.

Ask.com offers many innovative tools to help the user get the information he needs quickly and
easily. The features include:

Advanced Web Search, Basic Site Preferences, Local Search, Conversions, Dictionary Search,
Famous People Search, Maps & Directions, News Search, Image Search, Popular Searches, Shopping
Search, Smart Answer, Stock Search, Weather Search and White Pages Search. Zoom Related Search
lets the user narrow or broaden a search with possible alternative search terms, which appear on the
right-hand side of the Ask results page. Related Names presents a list of names that are conceptually
tied to the topic options within the "Narrow Your Search" and "Expand Your Search" lists.

Challenges

A very big challenge to building AI into a search engine is that it can be impractical on a large
scale. The computational power needed to calculate the required results efficiently can be enormously
expensive.

People are not ready to give their personal information to search engines. If search
engine users gave up a little of their privacy and allowed their search habits to be monitored, search
engines would be able to provide better, customized results.

With the amount of information already on the internet and the rate at which it is increasing it
is a challenge for the search engines to scale up to that level and provide relevant and meaningful
search results to the users.

Conclusion

A user who enters search words into any of today's search engines often ends up hoping that
they display the type of results he is looking for. It is more like an "enter your query
and hope for the best" experience. An ideal search engine should be like a friend with instant access
to all the world's facts and a photographic memory of everything the user has seen and knows. That
search engine could then give answers based on the user's preferences, the user's existing knowledge
and the best available information. The search engine could ask for clarification and present the
answers in the media that work best. If there is a search engine where the user can just ask
questions and get the answers in a much richer way, then it will quickly become the dominant
search engine.

Within a few years, there will be next-generation search engines that can extract
specific facts, draw inferences and organize those facts based on a few key words. The big change
that will happen in society is that instead of changing human expressions and interactions into
what is easy for the computer, we will improve computers' abilities to handle the expressions that are
natural for human beings.

References

• "The Google Story" by David A. Vise

• http://www.internetworldstats.com/stats.htm

• http://en.wikipedia.org/wiki/Web_search_engine

• http://searchengineland.com/comscore-us-most-searches-china-slowest-34217

• http://googleblog.blogspot.com/2008/09/future-of-search.html

• http://www.nytimes.com/2010/06/20/magazine/20Computer-t.html?_r=1

• http://en.wikipedia.org/wiki/Google

• http://findarticles.com/p/articles/mi_m0FWE/is_2_8/ai_114010257/

• "The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our
Culture" by John Battelle
