
Basic Web Architecture

Web Architecture Extensibility

Other Transfer Protocols

This section describes the current architecture of the World Wide Web (WWW). The following sections describe: the basic two-tier architecture of the Web, in which static web pages (documents) are transferred from information servers to browser clients worldwide; extensions that permit three-tier architectures, in which content pages can be constructed dynamically and programs as well as data can be transferred; and other information transfer protocols and related standards.

Basic Web Architecture

The basic web architecture is two-tiered and characterized by a web client that displays information content and a web server that transfers information to the client. This architecture depends on three key standards: HTML for encoding document content, URLs for naming remote information objects in a global namespace, and HTTP for staging the transfer.

HTML (HyperText Markup Language): The common representation language for hypertext documents on the Web. HTML was first publicly released as HTML 0.0 in 1990, became the Internet draft HTML 1.0 in 1993, and was followed by HTML 2.0 in 1994. HTML 3.0 and Netscape HTML are competing successors to HTML 2.0. Proposed features in HTML 3.0 include forms, style sheets, mathematical markup, and text flow around figures.

HTML is an application of the Standard Generalized Markup Language (SGML, ISO 8879), an international standard approved in 1986 that specifies a formal meta-language for defining document markup systems. An SGML Document Type Definition (DTD) specifies the valid tag names and element attributes. In addition, documents can be linked to one another or internally by establishing source and target anchor points. Many HTML documents are produced by manual authoring or by converters that turn word-processor files into HTML, but several WYSIWYG editors now support HTML styles.

HTML files are viewed using a WWW client browser (software), the primary user interface to the Web. HTML allows images, sounds, video streams, form fields, and simple text formatting to be embedded in a document. References to other objects, called hyperlinks, are embedded using URLs. When the user selects a hyperlink, the browser takes an action based on the URL's type, e.g., retrieving a file, connecting to another Web site and displaying an HTML file stored there, or launching an application such as an e-mail or newsgroup reader.

URIs (Uniform Resource Identifiers): An addressing scheme for objects in the WWW. There are two types of URIs: Uniform Resource Names (URNs) and Uniform Resource Locators (URLs).

Both URNs (names) and URLs (locators) are URIs, and a particular URI may be a name and a locator at the same time. URIs are part of a larger Internet information architecture composed of URNs and URLs, each of which plays a specific role: URNs are used for identification, URLs for locating or finding resources.

A URN is like a person's name, while a URL is like their street address. The URN defines something's identity, while the URL provides a method for finding something. URNs are often compared to the ISBN system for uniquely identifying books (and in fact you can encode an ISBN as a URN). Having a book's unique identifier lets you discuss the book, such as whether you've read it, enjoyed it, etc. To actually read the book, however, you need its location.
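The name/locator distinction can be illustrated with a minimal Python sketch. It assumes a simple rule that is not stated in the text above: a URI whose scheme is "urn" is treated as a name, and any other URI with a scheme is treated as a locator. The function name classify_uri and the sample ISBN are illustrative only.

```python
from urllib.parse import urlsplit


def classify_uri(uri):
    """Classify a URI as a name (URN) or a locator (URL) by its scheme.

    Assumption for this sketch: scheme "urn" means URN; any other
    non-empty scheme means URL.
    """
    scheme = urlsplit(uri).scheme.lower()
    if not scheme:
        return "unknown"
    return "URN" if scheme == "urn" else "URL"


# A book identified by name (illustrative ISBN) versus a page found by location.
book_name = "urn:isbn:0451450523"
page_location = "http://some.machine/cgi-bin/name.pl?fortune"
print(classify_uri(book_name), classify_uri(page_location))
```

The URN says which book is meant; only the URL tells a client where to fetch something.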

HTTP (HyperText Transfer Protocol): An application-level network protocol for the WWW. Tim Berners-Lee, father of the Web, describes it as a "generic stateless object-oriented protocol."

In HTTP, commands (request methods) can be associated with particular types of network objects (files, documents, network services). Commands are provided for:

1) establishing a TCP/IP connection to a WWW server,

2) sending a request to the server (containing a method to be applied to a specific network object identified by the object's identifier, and the HTTP protocol version, followed by information encoded in a header style),

3) returning a response from the server to the client (consisting of three parts: a status line, a response header, and response data), and

4) closing the connection.
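The exchange described above can be sketched in Python as follows. This is a minimal illustration rather than a full HTTP client: it assumes HTTP/1.0 (so the server closes the connection once the response is sent), and the function names build_request, parse_response, and fetch are invented for this example.

```python
import socket


def build_request(method, path, host, version="HTTP/1.0"):
    """Build the request: method, object identifier, protocol version, headers."""
    lines = [f"{method} {path} {version}", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw response into its three parts: status line, headers, data."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": parts[0],
        "status": int(parts[1]),
        "reason": parts[2] if len(parts) > 2 else "",
        "headers": headers,
        "body": body,
    }


def fetch(host, path="/", port=80):
    """Open a TCP connection, send one request, read the response, close."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request("GET", path, host))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))
```

Calling fetch("some.machine", "/index.html") would perform all four steps against a live server; build_request and parse_response can be exercised without any network access.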

Web Architecture Extensibility

This basic web architecture is evolving rapidly to serve a wider variety of needs beyond static document access and browsing. The Common Gateway Interface (CGI) extends the architecture to three tiers by adding a back-end server that provides services to the Web server on behalf of the Web client, permitting dynamic composition of web pages. Helpers/plug-ins and Java/JavaScript provide other interesting extensions to the Web architecture.

CGI is a standard for interfacing external programs with Web servers. The server hands client requests encoded in URLs to the appropriate registered CGI program, which executes and returns its results, encoded as MIME (Multipurpose Internet Mail Extensions) messages, to the server. CGI programs are executable programs that run on the Web server. They can be written in any scripting (interpreted) language or compiled programming language. Security precautions typically require that CGI programs be run from a specified directory (e.g., /cgi-bin) under the control of the webmaster (Web system administrator); that is, they must be registered with the system.

Arguments to CGI programs travel from client to server encoded in the URL; the server then passes them to the CGI program through environment variables such as QUERY_STRING. The CGI program typically returns HTML pages that it constructs on the fly.
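As a rough sketch of such a program, the Python script below reads its arguments from the QUERY_STRING environment variable and builds an HTML page on the fly. The parameter name "name" and the function handle_request are assumptions made for this example; they are not part of the CGI standard.

```python
import html
import os
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI reply: a MIME header, a blank line, then the HTML page."""
    # The server places the part of the URL after "?" in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; default to "world" if absent.
    name = query.get("name", ["world"])[0]
    # Escape user input before embedding it in the page.
    body = f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by the Web server, the reply goes to standard output.
    print(handle_request(os.environ), end="")
```

Passing the environment in as an argument keeps the page-building logic testable without a Web server.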

[Figure: CGI request flow. The Client sends an HTTP REQUEST to the Server; the Server passes the HTTP REQUEST through CGI to the CGI Application; the CGI Application returns a MIME MESSAGE to the Server; the Server returns an HTTP HEADER and the MIME MESSAGE to the Client.]

One way to send form data to a CGI program is by appending the form information to the URL, after a question mark. You may have seen URLs like the following: http://some.machine/cgi-bin/name.pl?fortune
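On the client side, appending form data to a URL might look like the following Python sketch. The field names "first" and "last" and the helper form_url are hypothetical; only the base URL comes from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def form_url(base, fields):
    """Append URL-encoded form fields to a base URL after a question mark."""
    return base + "?" + urlencode(fields)


# Hypothetical form fields; spaces and special characters are encoded.
url = form_url("http://some.machine/cgi-bin/name.pl",
               {"first": "Ada", "last": "Love lace"})
print(url)

# The receiving side can recover the fields from the query part of the URL.
print(parse_qsl(urlsplit(url).query))
```

Encoding matters because characters such as spaces, "&", and "=" would otherwise be confused with the separators between fields.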

Helpers/Plug-ins - When a client browser retrieves a file, it launches an installed helper application or plug-in to process the file, chosen according to the file's MIME type. For example, it may launch a PostScript or Acrobat reader, or an MPEG or QuickTime player. A helper application runs outside the browser, while a plug-in runs within it.

Java/JavaScript - Java, from Sun Microsystems, is a cross-platform programming language for the WWW modeled after C++. Java programs embedded in HTML documents are called applets and are specified using <APPLET> tags. The HTML for an applet contains a code attribute that specifies the URL of the compiled applet file. Applets are compiled to a platform-independent bytecode that can be safely downloaded and executed by the Java interpreter embedded in the Web browser. Browsers that support Java are said to be Java-enabled.

JavaScript is a scripting language designed for creating dynamic, interactive Web applications that link together objects and resources on both clients and servers. A client JavaScript can recognize and respond to user events such as mouse clicks, form input, and page navigation, and query the state or alter the performance of an applet or plug-in. A server JavaScript can exhibit behavior similar to common gateway interface (CGI) programs. JavaScript scripts are embedded in HTML documents using <SCRIPT> tags. Similar to Java applets, JavaScript scripts are directly interpreted within the client's browser and are therefore platform-independent.

Other Transfer Protocols

The Web also relies on other protocols and standards related to HTTP for transferring and representing information, including:

Transmission Control Protocol/Internet Protocol (TCP/IP) - the fundamental protocol suite that provides for the reliable delivery of streams of data from one host to another.

File Transfer Protocol (FTP) - a common method of moving files between two Internet sites. It runs over TCP/IP.

Secure Sockets Layer (SSL) - a security protocol developed by Netscape that uses encryption to protect information as it is sent and received.

Multipurpose Internet Mail Extensions (MIME) - the standard for multimedia e-mail and a building block of HTTP. The first information a client receives (the content type in the response header) identifies the type of file the server has sent, e.g., binary, audio, video, movie, formatted word-processor documents, graphics, spreadsheets, etc. When multimedia files are sent using the MIME standard, they are encoded as text that is not human-readable. The Web browser maintains a list of pairs of MIME types and helper applications for handling each type.
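The browser's list of MIME-type and helper pairs can be pictured as a simple lookup table. The Python sketch below is illustrative only: the helper program names, the HELPERS table, and the "save-to-disk" fallback are assumptions, not the configuration of any real browser.

```python
# Hypothetical table pairing MIME types with helper applications.
HELPERS = {
    "application/pdf": "acrobat-reader",
    "application/postscript": "postscript-viewer",
    "video/mpeg": "mpeg-player",
    "video/quicktime": "quicktime-player",
}


def choose_helper(content_type, default="save-to-disk"):
    """Pick a helper for the MIME type announced in the response header."""
    # Drop any parameters (e.g. "; charset=...") and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return HELPERS.get(mime, default)


print(choose_helper("application/pdf"))
print(choose_helper("text/x-unknown"))
```

A type with no registered helper falls through to the default action, which here stands in for offering to save the file.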