The Internet is a network of networks that connects computers all over the world. The
Internet has its roots in the U.S. military, which in 1969 funded a network called the ARPANET
(after the Advanced Research Projects Agency) to connect the computers at some of the colleges
and universities where military research took place. As more computers connected, the ARPANET
was replaced by the NSFNET, which was run by the National Science Foundation. By
the late 1990s, the Internet had shed its military and research heritage and was available for use
by the general public. Internet service providers (ISPs) began offering dial-up Internet accounts
for a monthly fee, giving users access to e-mail, discussion groups, and file transfer. In 1989, the
World Wide Web (an Internet-based system of interlinked pages of information) was born, and in
the early 1990s, the combination of e-mail, the Web, and online chat propelled the Internet into
national and international prominence.
Computers connected to the Internet communicate by using the Internet Protocol (IP), which
slices information into packets (chunks of data to be transmitted separately) and routes them
to their destination. One definition of the Internet is all the computers that pass packets to each
other by using IP. Along with IP, most computers on the Internet communicate with the
Transmission Control Protocol (TCP), and the combination is called TCP/IP.
1960s Telecommunications
Essential to the early Internet concept was packet switching, in which data to be transmitted
is divided into small packets of information and labeled to identify the sender and recipient. The
packets are sent over a network and then reassembled at their destination. If any packet does not
arrive or is not intact, the original sender is requested to resend it. Prior to packet
switching, the less efficient circuit-switching method of data transmission was used. In the early
1960s, several papers on packet-switching theory were written, laying the groundwork for
computer networking as it exists today.
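The idea above can be sketched in a few lines of Python (a toy illustration only, not a real network protocol; the function names are invented for this example):

```python
import random

def make_packets(message: str, size: int, sender: str, recipient: str):
    """Split a message into small, labeled, fixed-size packets."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"from": sender, "to": recipient, "seq": n, "data": chunk}
        for n, chunk in enumerate(chunks)
    ]

def reassemble(packets):
    """Reorder packets by sequence number and rebuild the message."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return "".join(p["data"] for p in ordered)

packets = make_packets("packet switching demo", 5, "host-A", "host-B")
random.shuffle(packets)          # packets may arrive out of order...
print(reassemble(packets))       # ...but reassembly restores the message
```

Each packet carries the sender, recipient, and a sequence number, which is exactly what lets the destination put arbitrary arrival orders back together.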
ARPANET, 1969
In 1969, Bolt, Beranek, and Newman, Inc. (BBN) designed a network called the Advanced
Research Projects Agency Network (ARPANET) for the United States Department of Defense.
The military created the ARPANET to enable researchers to share supercomputing power. Initially,
only four nodes (or hosts) comprised the ARPANET. They were located at the University of
California at Los Angeles, the University of California at Santa Barbara, the University of Utah,
and the Stanford Research Institute. The ARPANET later became known as the Internet.
1970s Telecommunications
In this decade, the ARPANET was used primarily by the military, by some of the larger companies,
such as IBM, and by universities for e-mail. The general population was not yet connected to the
system, and very few people were online at work.
The use of local area networks (LANs) became more prevalent during the 1970s. Also, the idea
of an open architecture was promoted; that is, the networks making up the ARPANET could have
any design. In later years, this concept had a tremendous impact on the growth of the
ARPANET.
1972
By 1972, the ARPANET was international, with nodes in Europe at University College in
London, England, and the Royal Radar Establishment in Norway. The number of nodes on the
network was up to 23, and the trend would be for that number to double every year from then on.
Ray Tomlinson, who worked at BBN, invented e-mail.
UUCP, 1976
AT&T Bell Labs developed UNIX-to-UNIX Copy (UUCP). In 1977, UUCP was distributed with
UNIX.
USENET, 1979
User Network (USENET) was started by using UUCP to connect Duke University and the
University of North Carolina at Chapel Hill. Newsgroups emerged from this early development.
1980s Telecommunications
In this decade, the Transmission Control Protocol/Internet Protocol (TCP/IP), a set of rules
governing how the networks making up the ARPANET communicate, was established. For the first
time, the term "Internet" was used to describe the ARPANET. Security became a concern,
as viruses appeared and electronic break-ins occurred.
The 1980s saw the Internet grow beyond being predominantly research oriented to including
business applications and supporting a wide range of users. As the Internet became larger, the
Domain Name System (DNS) was developed to allow the network to expand more easily by
assigning names to host computers in a distributed fashion.
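A modern resolver call shows DNS in action. The minimal Python sketch below resolves the local host name, so it runs without any network access; for a real host you would pass a name such as "www.example.com" instead:

```python
import socket

# DNS maps human-readable host names to IP addresses; the resolver
# library consults the (distributed) Domain Name System on our behalf.
ip = socket.gethostbyname("localhost")   # local name, no network needed
print(ip)                                # typically 127.0.0.1
```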
CSNET, 1980
The Computer Science Network (CSNET) connected all university computer science
departments in the United States. Computer science departments were relatively new, and only a
limited number existed in 1980. CSNET joined the ARPANET in 1981.
BITNET, 1981
The Because It’s Time Network (BITNET) formed at the City University of New York and
connected to Yale University. Many mailing lists originated with BITNET.
TCP/IP, 1983
The United States Defense Communications Agency required that TCP/IP be used for all
ARPANET hosts. Since TCP/IP was distributed at no charge, the Internet became what is called
an open system. This allowed the Internet to grow quickly, as all connected computers were now
"speaking the same language." Central administration was no longer necessary to run the
network.
NSFNET, 1985
The National Science Foundation Network (NSFNET) was formed to connect the National
Science Foundation's five supercomputing centers. This allowed researchers to access the most
powerful computers in the world, at a time when large, powerful, and expensive computers were
a rarity and generally inaccessible.
The Internet Worm, 1988
The Internet Worm (created by Robert Morris while he was a computer science graduate student
at Cornell University) was released. It infected 10 percent of all Internet hosts. Also in this year,
Internet Relay Chat (IRC) was written by Jarkko Oikarinen.
NSF took over control of the ARPANET in 1989. This changeover went unnoticed by nearly all
users. Also, the number of hosts on the Internet exceeded the 100,000 mark.
1990s Telecommunications
During the 1990s, many commercial organizations started getting online, stimulating the
growth of the Internet like never before. Graphical browsing tools were developed, and the
markup language HTML allowed users all over the world to publish on what was called
the World Wide Web. Millions of people went online to work, shop, bank, and be entertained. The
Internet played a much more significant role in society, as many nontechnical users from all
walks of life got involved with computers.
GOPHER, 1991
Gopher was developed at the University of Minnesota. Gopher allows users to fetch files on the
Internet using a menu-based system.
The World Wide Web (WWW) was created by Tim Berners-Lee at CERN (the European
Laboratory for Particle Physics) as a simple way to publish information and
make it available on the Internet.
WWW, 1992
The interesting nature of the Web caused it to spread, and it became available to the public in
1992.
Mosaic, 1992
Mosaic, a graphical browser for the Web, was released by Marc Andreessen and several other
graduate students at the University of Illinois. Mosaic was first released for the X Window
System under UNIX.
Netscape Navigator, 1994
The company Netscape Communications, formed by Jim Clark, released Netscape
Navigator, a Web browser that captured the imagination of everyone who used it.
Yahoo!, 1994
Stanford graduate students David Filo and Jerry Yang developed their Internet search engine
and directory called Yahoo!
Java, 1995
The Internet programming environment Java was released by Sun Microsystems, Inc. This
language, originally called Oak, allowed programmers to develop Web pages that were more
interactive.
Microsoft Internet Explorer, 1995
The software giant Microsoft committed many of its resources to developing its browser,
Microsoft Internet Explorer, and Internet applications.
Hypertext Transfer Protocol
HTTP Protocol
The Hypertext Transfer Protocol (HTTP) is an application-level protocol, built on top of TCP/IP,
with the lightness and speed necessary for distributed, collaborative, hypermedia information
systems such as the Web.
HTTP Overview
A browser works as an HTTP client: it sends requests to an HTTP server, which is
called a Web server, and the Web server sends responses back to the client. The standard and
default port for HTTP servers to listen on is 80, but it can be changed to any other port, such
as 8080.
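The exchange can be sketched in Python. The snippet below starts a throwaway local Web server (the handler class and greeting text are invented for this example) and then acts as an HTTP client over a raw TCP socket, sending a request message and reading the response message:

```python
import socket
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

# A tiny local Web server, so the client below has something to talk to.
class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello, HTTP"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):       # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Hello)   # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# The HTTP client side: open a TCP connection, send a request message,
# then read the response message until the server closes the connection.
with socket.create_connection(("127.0.0.1", port)) as s:
    s.sendall(b"GET / HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n")
    response = b""
    while chunk := s.recv(4096):
        response += chunk

server.shutdown()
print(response.decode().splitlines()[0])   # status line of the response
```

In real code you would of course use a ready-made client such as Python's http.client or urllib; the raw socket version is only meant to make the request/response messages visible.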
There are three important things about HTTP of which you should be aware:
HTTP is connectionless: After a request is made, the client disconnects from the server
and waits for a response. The server must re-establish the connection after it processes the
request.
HTTP is media independent: Any type of data can be sent by HTTP as long as both the
client and the server know how to handle the data content. How content is handled is
determined by the MIME specification.
HTTP is stateless: This is a direct result of HTTP's being connectionless. The server and
client are aware of each other only during a request; afterwards, each forgets the other.
For this reason, neither the client nor the server can retain information between different
requests across Web pages.
[Diagram omitted: where the HTTP protocol fits in the TCP/IP communication stack.]
Like most network protocols, HTTP uses the client-server model: An HTTP client opens a
connection and sends a request message to an HTTP server; the server then returns a response
message, usually containing the resource that was requested. After delivering the response, the
server closes the connection.
The format of the request and response messages is similar, with the following structure:
- an initial line,
- zero or more header lines,
- a blank line (i.e., a CRLF by itself), and
- an optional message body (e.g., a file, query data, or query output).
Initial lines and headers should end in CRLF. More exactly, CR and LF here mean ASCII values
13 and 10.
Initial Line: Request
The initial line is different for the request than for the response. A request line has three parts,
separated by spaces: a method name, the local path of the requested resource, and the version of
HTTP being used. For example:
GET /path/to/file/index.html HTTP/1.0
GET is the most common HTTP method. Other methods include POST and HEAD.
The path is the part of the URL after the host name. This path is also called the request
Uniform Resource Identifier (URI). A URI is like a URL, but more general.
The HTTP version always takes the form "HTTP/x.x", in uppercase.
Initial Line: Response
The initial response line, called the status line, also has three parts separated by spaces: the
HTTP version, a numeric response status code, and a human-readable reason phrase describing
the status code. For example:
HTTP/1.0 200 OK
or
HTTP/1.0 404 Not Found
Header Lines
Header lines provide information about the request or response, or about the object sent in the
message body.
The header lines are in the usual text header format: one line per header, of the form
"Header-Name: value", ending with CRLF. It is the same format used for e-mail and news
postings, defined in RFC 822. Note the following rules:
- A header line should end in CRLF, but a robust parser should also handle a bare LF correctly.
- The header name is not case-sensitive.
- Any number of spaces or tabs may appear between the ":" and the value.
- Header lines beginning with a space or tab are actually parts of the previous header line,
folded into multiple lines for easy reading.
For example, a request might include the header line
User-agent: Mozilla/3.0Gold
and a response might include the header line
Server: Apache/0.8.4
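Those rules can be sketched as a small parser (a simplified illustration; the function name is invented, and real-world parsing involves more edge cases):

```python
# A sketch of parsing HTTP header lines: one "Name: value" per line,
# names case-insensitive, any whitespace after the colon, and
# continuation lines (starting with space or tab) folded into the
# previous header's value.

def parse_headers(raw: str) -> dict:
    headers = {}
    last = None
    for line in raw.split("\r\n"):
        if not line:
            continue
        if line[0] in " \t" and last:          # folded continuation line
            headers[last] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            last = name.strip().lower()        # names are case-insensitive
            headers[last] = value.strip()
    return headers

raw = "User-Agent: Mozilla/3.0Gold\r\nX-Long:   part one\r\n\tpart two\r\n"
h = parse_headers(raw)
print(h["user-agent"])   # -> Mozilla/3.0Gold
print(h["x-long"])       # -> part one part two
```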
The Message Body
An HTTP message may have a body of data sent after the header lines. In a response, this is
where the requested resource is returned to the client (the most common use of the message
body), or perhaps explanatory text if there's an error. In a request, this is where user-entered data
or uploaded files are sent to the server.
If an HTTP message includes a body, there are usually header lines in the message that describe
the body. In particular:
- The Content-Type: header gives the MIME type of the data in the body, such as text/html.
- The Content-Length: header gives the number of bytes in the body.
HTTP Methods
The GET method means retrieve whatever information (in the form of an entity) is identified by
the Request-URI. If the Request-URI refers to a data-producing process, it is the produced data
that shall be returned as the entity in the response, not the source text of the process, unless
that text happens to be the output of the process.
A conditional GET method requests that the identified resource be transferred only if it has been
modified since the date given by the If-Modified-Since header. The conditional GET method is
intended to reduce network usage by allowing cached entities to be refreshed without requiring
multiple requests or transferring unnecessary data.
The GET method can also be used to submit forms. The form data is URL-encoded and
appended to the request URI.
A HEAD request is just like a GET request, except it asks the server to return the response
headers only, and not the actual resource (i.e. no message body). This is useful to check
characteristics of a resource without actually downloading it, thus saving bandwidth. Use HEAD
when you don't actually need a file's contents.
The response to a HEAD request must never contain a message body, just the status line and
headers.
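Using Python's standard http.client module, a HEAD request might look like this (the local test server, its advertised Content-Length, and the resource name are invented for the example):

```python
import threading
from http.client import HTTPConnection
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", "1024")
        self.end_headers()           # headers only: no body for HEAD
    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("HEAD", "/big-page.html")
resp = conn.getresponse()
print(resp.status, resp.getheader("Content-Length"))  # size without download
body = resp.read()     # empty: a HEAD response carries no message body
print(len(body))       # -> 0
conn.close()
server.shutdown()
```

Note that the client learns the resource's size and type from the headers alone, which is exactly the bandwidth saving the text describes.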
A POST request is used to send data to the server to be processed in some way, like by a CGI
script. A POST request is different from a GET request in the following ways:
- There's a block of data sent with the request, in the message body. There are usually extra
headers to describe this message body, like Content-Type: and Content-Length:.
- The request URI is not a resource to retrieve; it's usually a program to handle the data
you're sending.
- The HTTP response is normally program output, not a static file.
The most common use of POST, by far, is to submit HTML form data to CGI scripts. In this
case, the Content-Type: header is usually application/x-www-form-urlencoded, and
the Content-Length: header gives the length of the URL-encoded form data. The CGI script
receives the message body through STDIN and decodes it. Here's a typical form submission,
using POST:
POST /path/script.cgi HTTP/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 32

home=Mosby&favorite+flavor=flies
If you were writing a CGI script directly (i.e., not using PHP but Perl, shell, C, or another
language), you would have to pay attention to where you get the user's variable/value
combinations. In the case of GET, you would use the QUERY_STRING environment variable,
and in the case of POST, you would use the CONTENT_LENGTH environment variable to
control your iteration as you parsed for special characters to extract a variable and its value.
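A hypothetical CGI-style reader in Python might branch on the request method as just described (the function and variable names are invented for this sketch; parse_qs does the URL-decoding that a hand-written script would do character by character):

```python
import io
from urllib.parse import parse_qs

# How a CGI script locates its form data:
# - GET:  the query string arrives in the QUERY_STRING environment variable
# - POST: CONTENT_LENGTH says how many bytes of body to read from STDIN

def read_form_data(environ, stdin) -> dict:
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH", 0))
        raw = stdin.read(length)
    else:                                       # GET
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)

# Simulate a POST request as the Web server would present it to CGI:
env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "32"}
body = io.StringIO("home=Mosby&favorite+flavor=flies")
print(read_form_data(env, body))
# -> {'home': ['Mosby'], 'favorite flavor': ['flies']}
```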
POST Method:
- Your form data is attached to the end of the POST request (as opposed to the URL).
- Not as quick and easy as using GET, but more versatile (provided that you are writing the
CGI directly).
GET Method:
- Your entire form submission can be encapsulated in one URL, like a hyperlink, so you can
store a query as just a URL.
- You can access the CGI program with a query without using a form.
- The query is fully included in the URL:
http://myhost.com/mypath/myscript.cgi?name1=value1&name2=value2
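For illustration, Python's urllib can build such a query URL (the host and script names echo the hypothetical ones in the text):

```python
from urllib.parse import urlencode

# Encode form fields into a single GET URL; urlencode handles the
# name=value joining and any characters that need escaping.
params = {"name1": "value1", "name2": "value2"}
url = "http://myhost.com/mypath/myscript.cgi?" + urlencode(params)
print(url)
# -> http://myhost.com/mypath/myscript.cgi?name1=value1&name2=value2
```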
HTTP Headers
The following header fields provide information about the request or response, or about the
object sent in the message body.
Allow
The Allow entity-header field lists the set of methods supported by the resource identified by the
Request-URI. The purpose of this field is strictly to inform the recipient of valid methods
associated with the resource.
Example
Allow: GET, HEAD
Authorization
The Authorization field value consists of credentials containing the authentication information of
the user agent for the realm of the resource being requested.
Example
Authorization: credentials
Content-Encoding
The Content-Encoding entity-header field is used as a modifier to the media-type. When present,
its value indicates what additional content coding has been applied to the resource, and thus what
decoding mechanism must be applied in order to obtain the media-type referenced by the
Content-Type header field. The Content-Encoding is primarily used to allow a document to be
compressed without losing the identity of its underlying media type.
Example
Content-Encoding: x-gzip
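As a sketch of what a client does with a gzip-encoded body (using Python's standard gzip module; the sample HTML is invented):

```python
import gzip

# With "Content-Encoding: x-gzip" (or "gzip"), the client must gunzip
# the body before interpreting it according to its Content-Type.
original = b"<html><body>Hello</body></html>"
compressed_body = gzip.compress(original)    # what the server would send
decoded = gzip.decompress(compressed_body)   # what the client does
print(decoded == original)                   # -> True
```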
Content-Length
The Content-Length entity-header field indicates the size of the Entity-Body, in decimal number
of octets, sent to the recipient or, in the case of the HEAD method, the size of the Entity-Body
that would have been sent had the request been a GET.
Example
Content-Length: 3495
Content-Type
The Content-Type entity-header field indicates the media type of the Entity-Body sent to the
recipient or, in the case of the HEAD method, the media type that would have been sent had the
request been a GET.
Example
Content-Type: text/html
Date
The Date general-header field represents the date and time at which the message was originated,
having the same semantics as orig-date in RFC 822.
Example
Date: Tue, 15 Nov 1994 08:12:31 GMT
Expires
The Expires entity-header field gives the date/time after which the entity should be considered
stale. This allows information providers to suggest the volatility of the resource, or a date after
which the information may no longer be valid.
Example
Expires: Thu, 01 Dec 1994 16:00:00 GMT
From
The From request-header field, if given, should contain an Internet e-mail address for the human
user who controls the requesting user agent. The address should be machine-usable, as defined
by mailbox in RFC 822.
Example
From: webmaster@w3.org
If-Modified-Since
The If-Modified-Since request-header field is used with the GET method to make it conditional:
if the requested resource has not been modified since the time specified in this field, a copy of
the resource will not be returned from the server; instead, a 304 (not modified) response will be
returned without any Entity-Body.
Example
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
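A conditional GET can be sketched with Python's http.client against a toy server that always answers 304 when If-Modified-Since is present (a real server would compare the supplied date with the resource's modification time; all names here are invented):

```python
import threading
from http.client import HTTPConnection
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if "If-Modified-Since" in self.headers:
            self.send_response(304)        # Not Modified: no Entity-Body
            self.end_headers()
        else:
            body = b"fresh copy"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/page.html",
             headers={"If-Modified-Since": "Sat, 29 Oct 1994 19:43:31 GMT"})
resp = conn.getresponse()
print(resp.status)         # 304: the cached copy is still valid
print(len(resp.read()))    # no message body was transferred
conn.close()
server.shutdown()
```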
Last-Modified
The Last-Modified entity-header field indicates the date and time at which the sender believes
the resource was last modified.
Example
Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
Location
The Location response-header field defines the exact location of the resource that was identified
by the Request-URI. For 3xx responses, the location must indicate the server's preferred URL for
automatic redirection to the resource. Only one absolute URL is allowed.
Example
Location: http://www.w3.org/hypertext/WWW/NewLocation.html
Pragma
The Pragma general-header field is used to include implementation-specific directives that may
apply to any recipient along the request/response chain. All pragma directives specify optional
behavior from the viewpoint of the protocol; however, some systems may require that behavior
be consistent with the directives.
Example
Pragma: no-cache
Referer
The Referer request-header field allows the client to specify, for the server's benefit, the address
(URI) of the resource from which the Request-URI was obtained.
Example
Referer: http://www.w3.org/hypertext/DataSources/Overview.html
Server
The Server response-header field contains information about the software used by the origin
server to handle the request. The field can contain multiple product tokens and comments
identifying the server and any significant subproducts.
Example
Server: CERN/3.0 libwww/2.17
User-Agent
The User-Agent request-header field contains information about the user agent originating the
request. This is for statistical purposes, the tracing of protocol violations, and automated
recognition of user agents for the sake of tailoring responses to avoid particular user agent
limitations.
Example
User-Agent: CERN-LineMode/2.15 libwww/2.17b3
WWW-Authenticate
The WWW-Authenticate response-header field must be included in 401 (unauthorized) response
messages. The field value consists of at least one challenge that indicates the authentication
scheme(s) and parameters applicable to the Request-URI.
Example
WWW-Authenticate: Basic realm="WallyWorld"
2. Define Protocol.
In computing, a protocol is a formal description of digital message formats and the rules for
exchanging those messages in or between computing systems and in telecommunications.
Protocols may include signaling, authentication, and error detection and correction capabilities. A
protocol describes the syntax, semantics, and synchronization of communication and may be
implemented in hardware, software, or both.
Computer Networks: Computers connected, generally in the same physical location, using
different topologies, e.g., token ring, star, or serial connection. Also called a LAN (Local Area
Network).
Distributed Systems: Can also be considered a type of computer network, but on a much larger
scale. The key distinction is that in a distributed system, a collection of independent computers
appears to its users as a single coherent system. Usually, it has a single model or paradigm that it
presents to the users. In a computer network, this coherence, model, or software is absent. Users
are exposed to the actual machines, without any attempt by the system to make the machines look
and act in a coherent way.
The World Wide Web, abbreviated as WWW and commonly known as the Web, is a system of
interlinked hypertext documents accessed via the Internet. With a web browser, one can
view web pages that may contain text, images, videos, and other multimedia and navigate
between them via hyperlinks.
W3C stands for the World Wide Web Consortium. W3C was created in October 1994 by Tim
Berners-Lee, the inventor of the Web.
W3C is working to make the Web accessible to all users (despite differences in culture,
education, ability, resources, and physical limitations). W3C also coordinates its work with many
other standards organizations, such as the Internet Engineering Task Force, the Wireless
Application Protocol (WAP) Forum, and the Unicode Consortium.