Communications of the ACM, August 1994, Vol. 37, No. 8

The World-Wide Web (W3) was developed to be a pool of human knowledge, which would allow collaborators in remote sites to share their ideas and all aspects of a common project. Physicists and engineers at CERN, the European Particle Physics Laboratory in Switzerland, collaborate with many other institutes to build the software and hardware for high-energy physics research. The idea of the Web was prompted by positive experience of a small "home-brew" personal hypertext system used for keeping track of personal information on a distributed project. The Web was designed so that if it was used independently for two projects, and later relationships were found between the projects, then no major or centralized changes would have to be made, but the information could smoothly reshape to represent the new state of knowledge. This property of scaling has allowed the Web to expand rapidly from its origins at CERN across the Internet irrespective of boundaries of nations or disciplines.

If you haven't yet experienced the Web, the best way to find out about it is to try it. An Appendix to this article gives some recipes for getting hold of W3 clients. Given one of these, you will quickly find out all you need to know, and much more. For hard copy to read on the plane, or if you don't have Internet access from your desktop machine, refer to our paper in Electronic Networking (see "Glossary and Further Reading") for an overview of the project, material which we will not repeat but will summarize here.

A W3 "client" program runs on your computer. It displays an object, normally a document with text and possibly images. Some of the phrases and images are highlighted: in blue, or boxed, or perhaps numbered, depending on what sort of a display you have and how your preferences have been set. Clicking the mouse on the highlighted area causes the client program to retrieve another object from some other computer, a "server." The retrieved object is normally also in a hypertext format, so the process of navigation continues (see Figure 1).

When viewing some documents the reader can request a search, by typing in plain text (or complex commands) to send to the server, rather than following a link. In either case, the client sends a request off to the server, often a completely different machine in some other part of the world, and within (typically) a second, the related information, in either hypertext, plain text or multimedia format, is presented. This is done repeatedly, and by a sequence of selections and searches one can find anything that is "out there." Some important things to note are:

• Whatever type of server, the user interface is the same, so users do not need to understand the differences between the many protocols in common use. Before W3, access to networked information typically involved knowledge of many different access "recipes" for different systems, and a different command language for each. The model of hypertext with text input has proved sufficiently powerful to express all the user interfaces, while being sufficiently simple to require no training for a computer user.
• Links can point to anything that can be displayed, including search result lists. (When a query is applied to an object, the resulting object has an address, defined to be the address of the queried object concatenated with the text of the query. As the result object has an address, one can make links to it. Following the link later leads to a reevaluation of the query.)
• While menus and directories are available, the extra option of hypertext provides a more powerful communications tool. In simple cases, the server program can generate a hypertext view representing (for example) the directory structure of an existing file store.
  This allows existing data to be put "on the Web" without further human effort.
• There is a very extendable system for introducing new formats for multimedia data.
• There are many W3 client programs. As hypertext information is transmitted in logical (mark-up) form, each client can interpret this in a way natural for the given platform, making optimal use of fonts, colors, and other human interface resources available on that platform.

What Does W3 Define?

W3 has come to stand for a number of things, which should be distinguished. These include:

• The idea of a boundless information world in which all items have a reference by which they can be retrieved;
• The address system (URI) which the project implemented to make this world possible, despite many different protocols;
• A network protocol (HTTP) used by native W3 servers, giving performance and features not otherwise available;
• A markup language (HTML) which every W3 client is required to understand, and which is used for the transmission of basic things such as text and help information;
• The body of data available on the Internet using all or some of the preceding listed items.

The client-server architecture of the Web is illustrated in Figure 2.

Universal Resource Identifiers

Universal Resource Identifiers [1] (URIs) are the strings used as addresses of objects (e.g., menus, documents, images) on the Web. For example, the URI of the main page for the WWW project happens to be

http://info.cern.ch/hypertext/WWW/TheProject.html

[1] The Internet Engineering Task Force (IETF) is currently defining a similar and derived syntax known as a Uniform Resource Locator (URL). As this work is not complete, and there is no guarantee that URLs will have the same syntax or properties as URIs, we use the term URI here.

URIs are "Universal" in that they encode members of the universal set of network addresses. For a new network protocol that has some concept of object, one can form an address for any object as the set of protocol parameters necessary to access the object. If these parameters are encoded into a concise string, with a prefix to identify the protocol and encoding, one has a new URI scheme. There are URIs for Internet news articles and newsgroups (the NNTP protocol), and for FTP archives, for telnet destinations, email addresses, and so on. The same can be done for names of objects in a given name space.

The prefix "http" in the preceding example indicates the address space, and defines the interpretation of the rest of the string. The HTTP protocol is to be used, so the string contains the address of the server to be contacted and a substring to be passed to the server.

Different protocols use different syntaxes, but there is a small amount of common syntax. For example, the common URI syntax reserves the "/" as a way of representing a hierarchical space, and "?" as a separator between the address of an object and a query operation applied to it. As these forms recur in several information systems, allowing them to be expressed in the common syntax lets the features be retained in the common model, where appropriate.

Hierarchical forms are useful for hypertext, where one "work" may be split up into many interlinked documents. Relative names exploit the hierarchical structure and allow links to be made within the work independent of the higher parts of the URI such as the server name.

URI syntax allows objects to be addressed not only using HTTP, but also using the other common networked information protocols in use today (FTP, NNTP, Gopher, and WAIS), and will allow extension when new protocols are developed.

URIs are central to the W3 architecture. The fact that it is easy to address an object anywhere on the Internet is essential for the system to scale, and for the information space to be independent of the network and server topology.

Hypertext Transfer Protocol

Rather than being a protocol for transferring hypertext, HTTP is a protocol for transferring information with the efficiency necessary for making hypertext jumps. The data transferred may be plain text, hypertext, images, or anything else.

When a user browses the Web, objects are retrieved in rapid succession from often widely dispersed servers. For small documents, the limitations to the response time stem mainly from the number of round trip delays across the network necessary before the rendition of the object can be started. HTTP is therefore a simple request/response protocol.

HTTP does not only transfer HTML documents. Although HTML comprehension is required of W3 clients, HTTP is used for retrieving documents in an unbounded and extensible set of formats. To achieve this, the client sends a (weighted) list of the formats it can handle, and the server replies with data in any of those formats that it can produce. This allows proprietary formats to be used between consenting programs in private, without the need for standardization of those formats. This is important both for high-end users who share data in sophisticated forms, and also as a hook for formats that have yet to be invented.

The same negotiation system is used for natural language (English, French, for example) where available, as well as for compression forms.

HTTP is an Internet protocol. It is similar in its readable, text-based style to the File Transfer (FTP) and Network News (NNTP) protocols that have been used to transfer files and news on the Internet for many years. Unlike these protocols, however, HTTP is stateless. (That is, it runs over a TCP connection that is held only for the duration of one operation.) The stateless model is efficient when a link from one object may lead equally well to an object stored on the same server, or to another distant server. The purpose of a reference such as a URI is that it should always refer to the "same" (in some sense) object.
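The request/response exchange and the weighted format negotiation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CERN implementation: the function names `build_get_request`, `parse_accept` and `choose_format` are hypothetical, the format name `application/x-private` is invented for the example, and the `Accept` header with `q` weights follows the syntax of later HTTP specifications rather than anything quoted in this article.

```python
from urllib.parse import urlsplit


def build_get_request(uri, accepted):
    """Return (server address, request text) for a GET of `uri`.

    `accepted` maps a format name to a weight between 0 and 1
    (hypothetical helper; header syntax is an assumption).
    """
    parts = urlsplit(uri)
    # The server address says where to connect; only the remaining
    # substring of the URI is passed to the server in the request.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    accept = ", ".join(f"{fmt};q={weight}" for fmt, weight in accepted.items())
    return parts.netloc, f"GET {target} HTTP/1.0\r\nAccept: {accept}\r\n\r\n"


def parse_accept(header):
    """Parse 'text/html;q=1.0, text/plain;q=0.5' into {format: weight}."""
    weights = {}
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        fmt, weight = fields[0], 1.0  # a missing weight counts as 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                weight = float(value)
        if fmt:
            weights[fmt] = weight
    return weights


def choose_format(header, available):
    """Pick the format the server can produce that the client weights highest."""
    weights = parse_accept(header)
    candidates = [fmt for fmt in available if weights.get(fmt, 0.0) > 0.0]
    if not candidates:
        return None  # nothing the client accepts can be produced
    return max(candidates, key=lambda fmt: weights[fmt])


if __name__ == "__main__":
    host, request = build_get_request(
        "http://info.cern.ch/hypertext/WWW/TheProject.html",
        {"text/html": 1.0, "text/plain": 0.5},
    )
    print(host)
    print(request)
    print(choose_format("text/html;q=1.0, application/x-private;q=0.8",
                        ["text/plain", "application/x-private"]))
```

Because each request carries everything the server needs (the object's address and the acceptable formats), nothing has to be remembered between operations, which is what lets the connection be dropped after each one.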
Because a URI should always refer to the same object, a stateless protocol is also appropriate: it returns results based on the URI, irrespective of any previous operations performed by the client.

The HTTP request from the client starts with an operation code (known as the method, in conformance with object-oriented terminology) and the URI of the object. The "GET" method used by all browsers is defined to be idempotent in that it should preserve the state of the Web (apart from billing for the information transfer, and statistics). A "PUT" method is defined for front-end update, and a "POST" method for the attachment of a new document to the Web, or submission of a filled-in form or other object to some processor. Use of PUT and POST is currently limited, partly due to scarcity of hypertext editors. The extension to other methods is a subject of study.

When objects are transferred over the network, information about them ("metainformation") is transferred in HTTP headers. The set of headers is an extension of the Multipurpose Internet Mail Extensions (MIME) set. This design decision was taken to open the door to integration of hypermedia mail, news, and information access. Transfer in binary, and transfer in nonstandard but mutually agreed document formats, is possible. This allows, for example, servers to indicate links and titles of documents.

The convention that unrecognized HTTP headers and parameters are ignored has made it easy to try new ideas on working production servers. This has allowed the protocol definition to evolve in a controlled way by the incorporation of tested ideas.

Hypertext Markup Language (HTML)

Despite the ability of HTTP to negotiate formats, W3 needed a common basic language of interchange for hypertext.
HTML is that language, and much of the fabric of the Web is constructed out of it. It was designed to be sufficiently simple so as to be easily produced by both people and programs, but also to adhere to the SGML standard in that a valid HTML document, if attached to SGML declarations including the HTML "DTD," may be parsed by an SGML parser. HTML is a markup language that does not have to be used with HTTP. It can be used in hypertext email (it is proposed as a format for MIME), and wherever basic hypertext is needed. It includes simple structure elements such as several levels of headings, bulleted lists, menus and compact lists, all of which are useful when presenting choices, and in on-line documents.

Under development is a much enriched version of HTML known as HTML+. This includes features for more sophisticated on-line documentation, form templates for the entry of data by users, tables and mathematical formulae. Currently many browsers support a subset of the HTML+ features in addition to the core HTML set.

HTML is defined to be a language of communication, which actually flows over the network. There is no requirement that files are stored in HTML. Servers may store files in other formats, or in variations on HTML that include extra information of local interest only, and then generate HTML on the fly with each request.

Figure 1. Using the World-Wide Web. Shown here is the authors' prototype World-Wide Web application for NeXTStep machines. The application initially displays the user's "home" page (top) of personal notes and links. Clicking on underlined text takes the reader to new documents. In this case, the user visited the Virtual Library, and, in the high energy physics department, found a link to CERN. Linked to CERN was the "Atlas" collaboration's web, including an engineering drawing (background). To save having to follow the same path again, the link menu (shown) allows a new link to be made, for example from text typed into the home page, directly to the Atlas information.

W3 and Other Systems

Two other systems, WAIS (from Thinking Machines Corporation and now WAIS, Inc.) and Gopher (from the University of Minnesota), share W3's client-server architecture and some of its functionality.

Table 1. A comparison of three popular network information projects. Registered server figures taken April 27, 1993 and April 15, 1994. WAIS: from Thinking Machines Corporation directory, number of distinct hosts. Gopher: from the "All the Gophers in the world" register at the University of Minnesota. W3: from Geographical registry at CERN. In all cases many more servers exist which are not directly registered, so these are a very rough guide with no indication of quantity or quality of information at each host.

                              WAIS                              Gopher                          World-Wide Web
Original target application   Text-based information retrieval  Campus-wide information (CWIS)  Collaborative work
Typical objects
  Text                        YES                               YES                             YES
  Menus, graphics             NO                                YES                             YES
  Hypertext                   NO                                NO                              YES
Search functions
  Text search                 YES                               YES                             YES
  Relevance feedback          YES                               NO                              NO
  Reference to other servers  NO                                YES                             YES
Registered servers
  April 1993                  (illegible)                       455                             62
  April 1994                  137                               (illegible)                     829

The WAIS protocol is influenced largely by the Z39.50 protocol designed for networking library catalogs. It allows a text-based search, and retrieval following a search. Indexes to be searched are found by searching in a master index. This two-stage search has been demonstrated to be sufficiently powerful to cover the current world of WAIS data. There are no navigational tools, however, to allow the reader to be shown what is available or guided through the data: the reader is "parachuted in" to a hopefully relevant spot in the information world, but left without context.

Gopher provides a free text search, but principally menus. A menu is a list of titles, from which the user may pick one. While gopher space is in fact a web containing many loops, the menu system gives the user the impression of a tree. The Veronica server provides a master index for gopher space.

The W3 data model is similar to the gopher model, except that menus are generalized to hypertext documents. In both cases, simple file servers generate the menus or hypertext directly from the file structure of a server. The W3 hypertext model gives the program more power to communicate the options available to the reader, as it can include headings and various forms of list structure, for example, within the hypertext.

All three systems allow for the provision of graphics, sound and video, although because the WAIS system only has access by text search, text has to be associated with graphics files to allow them to be found.

W3 clients provide access to servers of all types, as a single simple interface to the whole Web is considered very important. Unknown to the user, several protocols are in use behind the scenes. A common code library, "libwww," put into the public domain by CERN has promoted this uniformity.

Whereas one would not wish to see a proliferation of protocols, the existence of more than one protocol probably allows for the most rapid progress during this phase in the development of the field. It also allows a certain limited confidence that, if an architecture can encompass older systems and allow transition to current systems, it will, by induction, be able to provide a transition to newer and better ideas as they are invented.

Recent W3 Developments

This article, like others in this issue, was derived from material written in April 1993 for the INET'93 conference. Growth of the Web since that time has been so great that this section has been completely rewritten. There are 829 (May: 1248) rather than 62 registered HTTP servers, and many more client programs available than then.

The initial prototype W3 client was a "wysiwyg" hypertext browser/editor using NeXTStep. We developed a line mode browser, and were encouraging the development of a good browser for X workstations. One year ago, NCSA's Mosaic W3 browser was in wide use on X workstations. Its easy installation and use was a major reason for the spread of the Web. Today there are many browsers available for workstations, Macintosh and IBM/PC compatible machines, and for users with character-based terminals. Of the latter category, "Lynx" from the University of Kansas provides full-screen access to the Web for users with terminals or emulators running on personal computers. Since new software is appearing frequently, readers are advised to check the lists on the Web for those most suited to their needs.

The availability of browsers and the availability of quality information have provoked each other. One available indicator of growth has been Merit Inc.'s count of the traffic of various different protocols across the NSF T3 backbone in the US (see Figure 3). An indicator of the uptake rate of clients is the load on the info.cern.ch W3 server at CERN, which provides information about the Web itself, and which more than doubled every 4 months over the three years between April 1991 and April 1994. Information providers have also blossomed.
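The growth figures quoted above can be turned into concrete factors with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the 12-month interval between the two server counts is an approximation of the April 1993 to April 1994 span rather than a figure from the article. Since the load "more than doubled" every 4 months, the first result is a lower bound.

```python
import math


def growth_factor(months, doubling_months):
    """Overall growth after `months` if a quantity doubles every `doubling_months`."""
    return 2 ** (months / doubling_months)


def doubling_time(months, start, end):
    """Months per doubling implied by growth from `start` to `end` over `months`."""
    return months / math.log2(end / start)


# Server load at info.cern.ch: doubling every 4 months for 36 months,
# i.e. 9 doublings, so at least a 512-fold increase.
load_factor = growth_factor(36, 4)

# Registered HTTP servers: 62 rising to 829 in roughly 12 months (assumed
# interval), which implies a doubling time of a little over 3 months.
server_doubling = doubling_time(12, 62, 829)

if __name__ == "__main__":
    print(load_factor)
    print(round(server_doubling, 1))
```

On these assumptions the number of registered servers grew somewhat faster than the load on the CERN server, which is consistent with traffic spreading out over many new sites.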
Some information providers offer simple overviews of what is available at particular institutes or in particular fields. Others use the power of the W3 model to provide a virtual world of great richness. Examples of servers that use hypertext in interesting ways are the RAL-Durham Particle Database and the Legal Information Institute's hypertexts of several great tomes of American law. Frans van Hoesel's hypertext version of the Vatican's Renaissance Culture exhibit at the Library of Congress set an example that was followed by many collections of art, history and other fields. The Palo Alto town hall runs a server with everything from building regulations to restaurants. As an example of the increasing use of the Web for commerce, a user-friendly virtual clothing store prompts for one's size, showing only those clothes that are the right size and also in stock.

The Future

The W3 initiative occupies the meeting point of many fields of technology. Users put pressure and effort into bringing about the adoption of W3 in new areas. Apart from being a place of communication and learning, and a new market place, the Web is a show ground for new developments in information technology. Some of the developments that we look forward to in the next few years include:

• The implementation of a name service that will allow documents to be referenced by name, independent of their location;
• Hypertext editors allowing nonexpert users to make hypertext links to organize published information. This will bring the goal of computer-supported collaboration closer, with front-end update, and annotation;
• More sophisticated document type definitions providing for the needs of on-line publishers;
• The development of a common format for hypertext links from two- and three-dimensional images, allowing more exciting interface possibilities;
• Integration with concurrent editors and other real-time features such as teleconferencing and virtual reality;
• Easy-to-use servers for low-end publication of information by groups and individuals;
• Evolution of objects from being principally human-readable documents to contain more machine-oriented semantic information, allowing more sophisticated processing;
• Conventions on the Internet for charging and commercial use to allow direct access to for-profit services.

Figure 2. The World-Wide Web client-server architecture. For published information to be universally available, W3 relies on a common addressing syntax, a set of common protocols, and negotiation of data formats.

Figure 3. Traffic in bytes per month across the NSF T3 backbone in the US. File Transfer Protocol (FTP) was traditionally used to access archives of software. FTP uses separate connections for control and data flow. WAIS arose as an interface to text retrieval systems, the Gopher protocol with menu-style interfaces, and W3's HTTP with hypertext and multimedia. W3 clients handle many protocols to access all these worlds of data as a seamless continuum, but new W3 servers use HTTP by preference. Each vertical division represents a tenfold increase in traffic. The horizontal divisions are months. Data: Merit, ftp://ftp.merit.edu/statistics/nsfnet

Glossary and Further Reading

FTP: File Transfer Protocol. Postel, J. and Reynolds, J. File Transfer Protocol. Internet RFC 959, October 1985.
Gopher: The Internet Gopher. Anklesaria, F. et al. The Internet Gopher Protocol. Internet RFC 1436, March 1993.
HTML: Hypertext Markup Language. Berners-Lee, T. and Connolly, D. Hypertext Markup Language Protocol.
MIME: Multipurpose Internet Mail Extensions. Borenstein, N. and Freed, N. MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies. Internet RFC 1341, June 1992.
NNTP: Network News Transfer Protocol. Kantor, B. and Lapsley, P. A proposed standard for the transmission of news. Internet RFC 977, 1986.
URI: Universal Resource Identifier. Berners-Lee, T. Universal Resource Identifiers for the World-Wide Web. Submitted as an Internet RFC, as yet unnumbered. See [...] for pointers to information on this area.
WAIS: Wide Area Information Servers. See Addyman, T. WAIS: Strengths, Weaknesses and Opportunities. In Proceedings of Information Networking 93 (London, May 1993), Meckler, London.
W3: Berners-Lee, T.J., Cailliau, R., Groff, J.-F., Pollermann, B. World-Wide Web: The Information Universe. Electronic Networking: Research, Applications and Policy (Spring 1992), 52-58. See also documents in [...] and information referenced by [...].

Conclusion

It is intended that after reading this article you will have an idea of what W3 is, where it fits in with other systems in the field, and where it is going. There is much more to be said, especially about providing information, but this is described on the Web itself. Also in the "Web about the Web" are lists of contributed research and development work and ideas, and pointers to work in progress, so that those interested can work together.

The Web does not yet meet its design goal as being a pool of knowledge that is as easy to update as to read. That level of immediacy of knowledge sharing waits for easy-to-use hypertext editors to be generally available on most platforms. Most information has in fact passed through publishers or system managers of one sort or another. However, the incredible diversity of information available gives great credit to the creativity and ingenuity of information providers, and points to a very exciting future.

Appendix. Getting Started

If you have a vt100 terminal, you can try out a full-screen interface by telnet to ukanaix.cc.ukans.edu and logging in as www. With any terminal, you can telnet to info.cern.ch for the simplest interface. These browsers are also available in source and in some cases binary form. Details of status and coordinates of about 20 different browsers are available on the Web: just follow a link to World-Wide Web, and select "software available."

The kernel W3 code (a common code library, and basic server and clients) from CERN is in the public domain. (All protocols and specifications are public domain.) It is available by anonymous FTP from info.cern.ch.

NCSA's "Mosaic" browser for W3 is available for X, Mac or PC/Windows by anonymous FTP from ftp.ncsa.uiuc.edu, currently without charge for academic users.

About the Authors:

TIM BERNERS-LEE originated the World-Wide Web in 1990 to enable the sharing of knowledge by complex distributed teams. At CERN he coordinates W3 development by collaborating with institutes around the world. Current research interests include text processing, graphics, communications software, and system design. email: timbl@info.cern.ch

ROBERT CAILLIAU coordinates the use of W3 by CERN experiments and physics institutes. He is a long-time user of HyperCard, and has been working on W3 since 1991, contributing many ideas, and some software for the Macintosh. email: cailliau@www.cern.ch

ARI LUOTONEN is a member of CERN's technical student program in conjunction with his studies at Tampere University of Technology, Finland. Current research interests include developing CERN's "httpd" HTTP server for Unix and VMS systems. email: luotonen@www.cern.ch

HENRIK FRYSTYK NIELSEN, of Aalborg University, Denmark, is also a CERN technical student. He is working on the kernel code, with research interests in enhanced networking protocols.
email: frystyk@info.cern.ch

ARTHUR SECRET wrote the first gateway giving W3 access to a relational database in 1992, while studying Computer Science at Ecole Internationale des Sciences du Traitement de l'Information in Paris, France, as a CERN technical student. Among other tasks in the CERN W3 team, he currently organizes the cataloging of new W3 material in the "virtual library."

Authors' Present Address: CERN, 1211 Geneva 23, Switzerland

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
