
159.334 Computer Networks

HTTP Protocol

Professor Richard Harris


School of Engineering and Advanced
Technology (SEAT)
Objectives

To gain an understanding of the HTTP protocol, the various
modes of operation of the HTTP protocol, and the
different HTTP messages and elements used during
HTTP requests and HTTP responses.

References

Computer Networks by Andrew S. Tanenbaum, Chapter 6 of 4th Edition
Data Communications and Networking by Behrouz A. Forouzan, Chapter 23 of 4th Edition
Stallings, William 2000 ‘Data and Computer Communications’, Prentice Hall, Sixth Edition
Russell, Travis 1997 ‘Telecommunications Protocols’, McGraw Hill
1997 ‘Internetworking with TCP/IP on Windows NT 4.0’, Microsoft Press
Slides and slide extracts from Forouzan’s book

Presentation Outline

Introduction to HTTP

Hypertext Transfer Protocol: HTTP

Hypertext: a set of documents in which a given document
can contain text, graphics, video and audio clips, as well
as embedded references to other documents
HTTP is the underlying protocol of the World Wide Web
Not a protocol for transferring hypertext as such; rather, a protocol
for transmitting information with the efficiency necessary for
hypertext jumps
Can transfer plain text, hypertext, audio, images, and
any Internet-accessible information

Major components of a Web browser

Components of a Web browser

Controller
Clients
Interpreter

Controller

Central piece of the browser
Interprets both mouse clicks and keyboard input and
calls other components to perform operations specified
by the user
E.g. when a user enters a URL or clicks on a hypertext reference,
the controller calls a client to fetch the requested document
from the remote server on which it resides, and an interpreter to
display the document for the user

Interpreter

The HTML interpreter handles layout details by translating
HTML (HyperText Markup Language) specifications into
commands that are appropriate for the user’s display
hardware
Input to the HTML interpreter consists of a document that
conforms to the HTML syntax
Output consists of a formatted version of the document for the
user
Other interpreters can include an XML (eXtensible Markup
Language) interpreter, etc.
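
The input/output description above can be illustrated with a small sketch. It is only a toy, assuming a two-tag subset of HTML (`h1` and `p`) and plain-text output in place of real display commands; the names `TinyInterpreter` and `render` are hypothetical and not part of any browser.

```python
from html.parser import HTMLParser


class TinyInterpreter(HTMLParser):
    """Toy interpreter: turns a small subset of HTML into plain-text lines."""

    def __init__(self):
        super().__init__()
        self.lines = []    # formatted output, one entry per block
        self._tag = None   # block-level tag currently open, if any
        self._text = []    # text collected inside the open block

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace, as a browser would when laying out text.
            text = " ".join("".join(self._text).split())
            # Headings are shown in upper case; paragraphs are shown as-is.
            self.lines.append(text.upper() if tag == "h1" else text)
            self._tag = None


def render(document: str) -> str:
    """Input: an HTML document. Output: a formatted plain-text version."""
    interpreter = TinyInterpreter()
    interpreter.feed(document)
    interpreter.close()
    return "\n\n".join(interpreter.lines)
```

Calling `render("<h1>Hello</h1><p>Some <b>bold</b> text.</p>")` yields a heading line followed by a paragraph line; inline tags such as `<b>` are simply ignored in this sketch.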

HTML – HyperText Markup Language

HTTP and its Port

The Hypertext Transfer Protocol (HTTP) is a protocol used
mainly to access data on the World Wide Web. HTTP
functions as a combination of FTP and SMTP: like FTP it
transfers files over TCP, and its messages resemble SMTP messages.

HTTP uses the services of TCP on
well-known port 80.
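
As a small illustration of the port rule, the sketch below works out which host and TCP port a client would connect to for a given URL. The function name `http_endpoint` is hypothetical; `http.client.HTTP_PORT` is the standard library constant for port 80.

```python
import http.client
from urllib.parse import urlsplit


def http_endpoint(url: str) -> tuple[str, int]:
    """Return the (host, TCP port) an HTTP client would connect to."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("only plain http:// URLs are handled in this sketch")
    # An explicit port in the URL wins; otherwise use well-known port 80.
    port = parts.port if parts.port is not None else http.client.HTTP_PORT
    return parts.hostname, port
```

A URL without a port, such as `http://www.example.com/index.html`, maps to port 80, while `http://localhost:8080/` keeps its explicit port.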

A selection of common HTML tags.
Some can have additional parameters.

Handling of Tags

Beginning and ending tags

The Formatted Page

Clients

An application program that establishes connections for
the purpose of sending requests
In addition to the HTTP client, other clients, such as an FTP
client, can be included in the browser

Types of Web Documents

The documents in the WWW can be grouped into three broad
categories: static, dynamic, and active. The category is
based on the time at which the contents of the document
are determined.

Example of Static Web Document

Example of Dynamic Web Document

Dynamic Document using CGI

Dynamic document using server-side script

Active document using Java applet

Active document using client-side script

Overview Of Browser Documents

Static documents use HTML, XHTML, etc.
Dynamic documents need a program running on the
server side, e.g. a request for the current date and time from the
server. Common Gateway Interface (CGI) technology is
used to handle dynamic documents.
Active documents need a program to be run on the client
side. The server keeps a copy of the program (for example, a
Java applet in binary form) and sends it to the client on request;
the client then runs it.
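
The difference between a static and a dynamic document can be sketched with Python's standard `http.server` module. This is not CGI itself, only a stand-in that shows the same idea: the `/time` document is created when the request arrives, while every other path returns the same stored bytes. The names `DocumentHandler`, `STATIC_PAGE` and `fetch` are hypothetical.

```python
import http.client
import threading
from datetime import datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

STATIC_PAGE = b"<html><body><p>Welcome</p></body></html>"  # fixed content


class DocumentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/time":
            # Dynamic document: contents are created when the request arrives.
            now = datetime.now().isoformat(timespec="seconds")
            body = f"<html><body><p>{now}</p></body></html>".encode()
        else:
            # Static document: the same stored bytes every time.
            body = STATIC_PAGE
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(path: str) -> bytes:
    """Start a throwaway local server, fetch one path, and shut it down."""
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", path)
        body = conn.getresponse().read()
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()
```

`fetch("/")` always returns `STATIC_PAGE`, whereas `fetch("/time")` returns a page whose contents depend on when it was requested.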

HTTP Overview – 1

Transaction-oriented client/server protocol
Usually between Web browser (client) and Web server
Uses TCP connections
Stateless
Each transaction treated independently
Typically a new TCP connection for each transaction
Connection terminated when transaction complete

HTTP Overview – 2

As already implied, the most typical use of HTTP is
between a Web browser and a Web server.
In a typical scenario, a new TCP connection is created
between client and server for each transaction and then
terminated as soon as the transaction completes.
Note that HTTP does not specify this one-to-one
relationship between transaction and connection
lifetimes.
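
The point about connection lifetimes can be made concrete with a local experiment. In the sketch below, the server reports the client's TCP port for every request, so equal ports mean the transactions shared one connection. The names `PeerHandler` and `client_ports` are hypothetical, and the server is set to HTTP/1.1 only so that it is willing to keep a connection open.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PeerHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep a connection open

    def do_GET(self):
        # Report the client's TCP port: same port means same connection.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def client_ports(reuse_connection: bool, transactions: int = 3) -> list[str]:
    """Run several transactions and return the client port seen for each."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PeerHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    ports = []
    try:
        conn = None
        for _ in range(transactions):
            if conn is None:
                conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
            conn.request("GET", "/")
            ports.append(conn.getresponse().read().decode())
            if not reuse_connection:
                conn.close()  # one connection per transaction
                conn = None
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return ports
```

With `reuse_connection=True` all reported ports are identical; with `reuse_connection=False` each transaction opens its own connection, so the ports differ. Both patterns are valid HTTP.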

HTTP Operation

Typically there are three examples of HTTP operation:


Direct connection
Intermediate systems
A cache

HTTP Operation – Direct Connection

[Figure: User Agent and Origin Server joined by a TCP connection, with a request chain and a response chain]

This is the simplest case, in which a user agent or client
(e.g., a Web browser) establishes a direct connection
with the origin server (e.g., Web server).
First, the client opens an end-to-end TCP connection
between the client and server.


Then the client issues a request that consists of a URL
and a MIME-like message containing request
parameters, information about the client, and perhaps
some additional content information.
When the server receives the request, it attempts to
complete the request and returns an HTTP response
containing status information, a success/error code, and
a MIME-like message containing information about the
server, information about the response itself, and
possible body content.
The TCP connection is then closed.
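
The steps of a direct connection can be followed with a raw socket. The sketch below runs against a throwaway local server rather than a real origin server; `HelloHandler`, `direct_transaction` and the `User-Agent` value are hypothetical names chosen for illustration. The server uses the `http.server` default of HTTP/1.0, so it closes the connection after the response, as in the scenario above.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def direct_transaction(path: str = "/"):
    """Perform one request/response exchange and return its parsed parts."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # 1. Open an end-to-end TCP connection to the origin server.
        with socket.create_connection(("127.0.0.1", server.server_port)) as sock:
            # 2. Send the request: request line, then MIME-like header lines.
            request = (
                f"GET {path} HTTP/1.0\r\n"
                "Host: 127.0.0.1\r\n"
                "User-Agent: sketch-client\r\n"
                "\r\n"
            )
            sock.sendall(request.encode("ascii"))
            # 3. Read until the server closes the connection.
            chunks = []
            while chunk := sock.recv(4096):
                chunks.append(chunk)
        raw = b"".join(chunks)
    finally:
        server.shutdown()
        server.server_close()
    # 4. Split the response into status line, header lines, and body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    return int(code), reason, header_lines, body
```

The returned tuple mirrors the response described above: a success/error code with its reason phrase, header lines carrying information about the server and the response, and the body content.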

HTTP Operation – Intermediate Systems

[Figure: request chain and response chain between User Agent and Origin Server via intermediate systems A, B and C]

In this scenario, there are one or more intermediate
systems with TCP connections between logically
adjacent systems.
Each intermediate system acts as a relay, so that a
request that is initiated by the client is relayed through
the intermediate systems to the server, and the
response from the server is relayed back to the client.


There are three forms of intermediate systems defined in
the HTTP specification:
Proxy
Gateway
Tunnel

Intermediate Systems - Proxy

[Figure: Proxy as intermediary between User Agent and Origin Server. Labels: HTTP request; HTTP request over authenticated connection; TCP connection]

A proxy acts on behalf of other clients and presents
requests from other clients to a server.
The proxy acts as a server in interacting with a client
and as a client in interacting with a server.


There are two scenarios that call for the use of a proxy:
Security intermediary: the client and server may be separated by a
security intermediary such as a firewall, with the proxy on the client
side of the firewall. Typically, the client is part of a network secured by
a firewall and the server is external to the secured network. To set
up a connection with the proxy, the server has to authenticate itself to
the firewall. The proxy accepts responses after they have passed
through the firewall.
Different versions of HTTP: if the client and server are running different
versions of HTTP, then the proxy can implement both versions and
perform the required mapping.
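
The dual role of a proxy (server towards the client, client towards the origin server) can be sketched with two local servers. This is a minimal illustration that handles GET only and does no authentication or version mapping; `OriginHandler`, `ProxyHandler`, `start`, `fetch_via_proxy` and the `Via` value are hypothetical names.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


class OriginHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = f"origin saw {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Server role: the client sent us a request naming the full URL.
        target = urlsplit(self.path)
        # Client role: open our own connection to the origin server.
        upstream = http.client.HTTPConnection(target.hostname, target.port or 80)
        try:
            upstream.request("GET", target.path or "/")
            reply = upstream.getresponse()
            body = reply.read()
            status = reply.status
        finally:
            upstream.close()
        # Server role again: hand the origin's answer back to the client.
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Via", "1.0 sketch-proxy")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start(handler):
    """Start a local server on any free port and return it."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_via_proxy(path: str):
    """Ask the proxy for a document that lives on the origin server."""
    origin, proxy = start(OriginHandler), start(ProxyHandler)
    try:
        conn = http.client.HTTPConnection("127.0.0.1", proxy.server_port)
        conn.request("GET", f"http://127.0.0.1:{origin.server_port}{path}")
        reply = conn.getresponse()
        result = reply.status, reply.getheader("Via"), reply.read()
        conn.close()
        return result
    finally:
        for server in (proxy, origin):
            server.shutdown()
            server.server_close()
```

The client only ever opens a TCP connection to the proxy; the connection to the origin server belongs to the proxy itself.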

Intermediate Systems - Gateway

[Figure: Gateway as intermediary between User Agent and Origin Server over TCP connections. Case 1: HTTP request in, non-HTTP request out. Case 2: HTTP request that must be filtered in, authenticated HTTP request out]

A gateway is a server that appears to the client as if it
were an origin server.
It acts on behalf of other servers that may not be able to
communicate directly with a client.


There are two scenarios that call for the use of a gateway:
Security intermediary: the client and server may be separated by a
security intermediary such as a firewall, with the gateway on the server
side of the firewall. Typically, the server is connected to a network
protected by a firewall, with the client external to the network. In this
case, the client must authenticate itself to the gateway, which can then
pass the request on to the server.
Non-HTTP server: useful when Web browsers need to contact servers
for protocols other than HTTP, such as FTP. The client makes an HTTP
request to a gateway server. The gateway server then contacts the
relevant FTP server to obtain the desired result. This result is then
converted to a form suitable for HTTP and transmitted back to the
client.

Intermediate Systems - Tunnel

[Figure: Tunnel relaying an HTTP request between User Agent and Origin Server over TCP connections]

The tunnel performs NO operations on HTTP requests and responses.
It is a relay point between two TCP connections.
Tunnels are used when there must be an intermediary between the
client and server, e.g., a firewall in which a client or server external to a
protected network can establish an authenticated connection and then
maintain that connection for purposes of HTTP transactions.
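
A tunnel can be sketched as a plain byte relay between two TCP connections. The example below handles a single client and leaves out the authentication step mentioned above; `pump`, `tunnel_once`, `fetch_through_tunnel` and `OriginHandler` are hypothetical names. Nothing in `pump` looks at the bytes it copies, which is the defining property of a tunnel.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class OriginHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"via tunnel"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def pump(source: socket.socket, sink: socket.socket) -> None:
    """Copy bytes one way until the source closes; never look inside them."""
    try:
        while data := source.recv(4096):
            sink.sendall(data)
    except OSError:
        pass
    finally:
        try:
            sink.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
        except OSError:
            pass


def tunnel_once(listener: socket.socket, origin_addr) -> None:
    """Accept one client and relay between it and the origin server."""
    client, _ = listener.accept()
    upstream = socket.create_connection(origin_addr)
    with client, upstream:
        back = threading.Thread(target=pump, args=(upstream, client))
        back.start()             # origin -> client direction
        pump(client, upstream)   # client -> origin direction
        back.join()


def fetch_through_tunnel() -> bytes:
    origin = HTTPServer(("127.0.0.1", 0), OriginHandler)
    threading.Thread(target=origin.serve_forever, daemon=True).start()
    listener = socket.create_server(("127.0.0.1", 0))
    relay = threading.Thread(
        target=tunnel_once, args=(listener, origin.server_address), daemon=True
    )
    relay.start()
    try:
        # The client talks HTTP to the tunnel's port as if it were the origin.
        conn = http.client.HTTPConnection("127.0.0.1", listener.getsockname()[1])
        conn.request("GET", "/")
        body = conn.getresponse().read()
        conn.close()
        relay.join(timeout=5)
        return body
    finally:
        listener.close()
        origin.shutdown()
        origin.server_close()
```

Unlike a proxy, the tunnel here never parses a request line or header; it would relay any byte stream equally well.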

HTTP Operation - Cache

[Figure: request chain and response chain between User Agent and Origin Server via intermediate systems A, B and C]

A cache is a facility that may store previous requests
and responses for handling new requests.
If a new request arrives that is the same as a stored
request, then the cache can supply the stored response
rather than accessing the resource indicated by the
URL.


The cache can operate on a client or server or an
intermediate system other than a tunnel.
Not all transactions can be cached, and a client or server
can dictate that a certain transaction may be cached
only for a given time limit.
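
The caching behaviour described above can be sketched as a small class keyed by URL. It is a simplification that assumes the time limit is passed in directly as `max_age` seconds rather than read from HTTP headers; `ResponseCache`, `fetch` and `clock` are hypothetical names.

```python
import time
from typing import Callable, Optional


class ResponseCache:
    """Stores responses by URL and serves them until their time limit passes."""

    def __init__(self, fetch: Callable[[str], bytes],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch   # how to reach the origin server on a miss
        self._clock = clock   # injectable so expiry can be demonstrated
        self._store = {}      # url -> (expires_at, response)

    def get(self, url: str, max_age: Optional[float] = None) -> bytes:
        entry = self._store.get(url)
        if entry is not None and self._clock() < entry[0]:
            return entry[1]              # fresh hit: origin not contacted
        response = self._fetch(url)      # miss or expired: ask the origin
        if max_age is not None:          # None means "do not cache"
            self._store[url] = (self._clock() + max_age, response)
        else:
            self._store.pop(url, None)
        return response
```

A repeated request within `max_age` seconds is answered from the store; once the limit has passed, or when no limit is given, the request goes to the origin again.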
