
XHTML & CSS I

Chapter 1
Introduction to HTML and XHTML
Presented by Thomas Powell
Slides adapted from
HTML & XHTML: The Complete Reference, 4th Edition
© 2003 Thomas A. Powell

Markup and (X)HTML


What is a markup language?
Essay markup
Special symbols that indicate what to do or how to present content
Older word processing and typesetting systems (WordStar, troff, etc.)
HTML and XHTML are the not-so-behind-the-scenes markup
languages used to tell Web browsers (user agents)
how to structure and, some may say, display Web pages.
HTML: Hypertext Markup Language
XHTML: Extensible Hypertext Markup Language
The simple difference: XHTML is a stricter version of HTML based
upon the rules of XML. We'll see more later.


Markup Quickstart
An HTML document is a structured text document composed of
elements, entities, and text fragments:
<b>This is important text! &copy; 2002</b>
Markup elements are made up of a start tag (e.g. <strong>) and
may include an end tag, which contains a closing slash character
(e.g. </strong>).
The browser applies the meaning of the element to the enclosed
content.
Under traditional HTML some elements are empty: they enclose no
content and thus have no close tag (e.g. <hr>). In XHTML all
tags must close, so we use <hr></hr> or, more appropriately, <hr />.
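
As a rough illustration of the three building blocks named above (elements, entities, and text), the following Python sketch feeds the sample line to the standard-library html.parser module and records what it sees. The names EventCollector and tokenize are made up for this example, and a real browser does far more than this.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Records each start tag, end tag, entity, and text run in order."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &copy; as separate
        # events instead of silently turning them into characters.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))

    def handle_entityref(self, name):
        self.events.append(("entity", name))


def tokenize(markup):
    """Return the list of (kind, value) events found in a markup string."""
    collector = EventCollector()
    collector.feed(markup)
    collector.close()
    return collector.events


if __name__ == "__main__":
    for event in tokenize("<b>This is important text! &copy; 2002</b>"):
        print(event)
```

Run on the sample line, this reports a start tag, a text run, the copy entity, a second text run, and the end tag. Feeding it <hr /> produces both a start and an end event, while the traditional <hr> produces only a start event.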

Markup Quickstart Contd.


The start tag of an HTML element may contain attributes that modify
the meaning of the tag.
In traditional HTML some attributes affected a tag simply by their
existence: <hr noshade>
Under XHTML attribute values are always required, so
<hr noshade="noshade" /> would be the correct XHTML syntax.
However, even under traditional HTML most attributes have a value:
<p align="center">
Attribute values should always be quoted with either single or
double quotes.
Stylistically, double quotes tend to be more common.
Traditional HTML allowed quotes to be omitted on simple (ordinal)
attribute values, so <p align=center> was allowed.
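
The attribute forms described above can be compared with a small Python sketch. It uses the tolerant standard-library html.parser, which accepts both traditional and XHTML-style syntax. AttributeCollector and attributes_of are illustrative names, not part of any standard.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Records (tag, attributes) for every start tag it encounters."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a minimized attribute
        # such as noshade arrives with the value None.
        self.seen.append((tag, dict(attrs)))


def attributes_of(markup):
    """Return a list of (tag, attribute dict) pairs for a markup string."""
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


if __name__ == "__main__":
    print(attributes_of("<hr noshade>"))
    print(attributes_of('<hr noshade="noshade" />'))
    print(attributes_of("<p align=center>"))
    print(attributes_of('<p align="center">'))
```

The minimized form yields no value at all for noshade, whereas the XHTML form yields the string noshade. The quoted and unquoted align values come out identical, which is why traditional HTML could tolerate the missing quotes.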


HTML 4 Transitional Full Example


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>First HTML Example</title>
</head>
<body>
<h1>Welcome to the World of HTML</h1>
<hr>
<p>HTML <b>really</b> isn't so hard!</p>
<p>You can put in lots of text if you want to. In
fact, you could keep on typing and make up more
sentences and continue on and on.</p>
</body>
</html>

XHTML 1.0 Transitional Full Example


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml"
lang="en">
<head>
<title>First XHTML Example</title>
</head>
<body>
<h1>Welcome to the World of XHTML</h1>
<hr />
<p>XHTML <b>really</b> isn't so hard!</p>
<p>You can put in lots of text if you want to. In
fact, you could keep on typing and make up more
sentences and continue on and on.</p>
</body>
</html>

Example Overview

The preceding examples use some of the most common elements found in (X)HTML
documents:
The <!DOCTYPE> statement indicates the particular version of HTML or XHTML
being used in the document. In the first example, the transitional HTML 4.01
specification was used, while in the second the transitional XHTML 1.0
specification was employed.
The <html>, <head>, and <body> tag pairs specify the general
structure of the document. Notice that under XHTML the <html> tag carries a little
more information (the xmlns and lang attributes) about the language you are using.
The <title> and </title> tag pair specifies the title of the document, which generally
appears in the title bar of the Web browser.
The <h1> and </h1> heading tag pair creates a headline indicating some
important information.
The <hr> tag, which has no end tag and is therefore written as <hr /> in XHTML,
inserts a horizontal rule, or bar, across the screen.
The <p> and </p> paragraph tag pair indicates a paragraph of text.
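
One practical consequence of the <hr> versus <hr /> difference is that the XHTML version can be read by a generic XML parser while the HTML version cannot. The Python sketch below shows this with the standard-library xml.etree.ElementTree. The helper name is_well_formed_xml is an assumption of this example, and it checks only XML well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup):
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Body of the XHTML example: every element is closed.
XHTML_BODY = """<body>
<h1>Welcome to the World of XHTML</h1>
<hr />
<p>XHTML <b>really</b> isn't so hard!</p>
</body>"""

# Same content with the traditional, unclosed <hr>.
HTML_BODY = XHTML_BODY.replace("<hr />", "<hr>")

if __name__ == "__main__":
    print(is_well_formed_xml(XHTML_BODY))
    print(is_well_formed_xml(HTML_BODY))
```

The first call succeeds and the second fails, because an XML parser treats the unclosed <hr> as an element that is still open when </body> arrives.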

Your First Example


Notice that (X)HTML files are just text files, so you can
type them in using Notepad, SimpleText, etc.
Type in the previous example and save the file as
first.html or first.htm.
Open the file with a browser using "Open file" and the document
will display in the browser window.
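
If you would rather script the save step than use a text editor, a minimal Python sketch follows. It writes an abbreviated copy of the HTML 4 example to first.html. The function name save_first_page and the constant FIRST_PAGE are made up for this illustration.

```python
from pathlib import Path

# Abbreviated copy of the HTML 4 Transitional example from the earlier slide.
FIRST_PAGE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>First HTML Example</title>
</head>
<body>
<h1>Welcome to the World of HTML</h1>
<hr>
<p>HTML <b>really</b> isn't so hard!</p>
</body>
</html>
"""


def save_first_page(directory="."):
    """Write the example as first.html in the given directory and return its path."""
    path = Path(directory) / "first.html"
    path.write_text(FIRST_PAGE, encoding="utf-8")
    return path


if __name__ == "__main__":
    print(save_first_page())
```

The saved file can then be opened with the browser's "Open file" command, or from Python with webbrowser.open(path.resolve().as_uri()) from the standard library.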

Your First Example


If you make a mistake, correct it in the editor
and reload the document, either using
"Open file" or by pressing the reload
button in the browser.
Make sure you do not save the file
as first.txt or in a format like .doc;
otherwise the browser may not render
the content correctly on screen.
Also be aware that the browser will
cache pages!

Do the example yourself!

Example Wrap-up
From the previous example you might surmise that learning
(X)HTML is merely a matter of learning the multitude of markup
tags, such as <b>, <i>, <p>, and so on, that specify the format
and/or structure of documents to browsers.
This is partially true, but knowing how Microsoft Word
commands work does not make one a writer.
It should be obvious from the preceding example that creating
(X)HTML in such a manual fashion is not always appropriate.
We'll study tools to produce markup in a bit, but regardless of the
tool being used to create a page we should know how markup
works.

(X)HTML: A Structured Language

HTML has a very well-defined syntax and all HTML documents should
follow a formal structure.

The World Wide Web Consortium (www.w3.org) defines the HTML and
XHTML standards.

HTML was defined as an application of the Standard Generalized Markup
Language (SGML).
SGML is a language for defining other languages: a meta or grammar
language, if you like.
In SGML you define a Document Type Definition, or DTD, which
represents the grammar or rules of the language being defined.

In 1999 the definition of HTML was rewritten using XML (Extensible Markup
Language) and renamed XHTML.
In XML you may also use a DTD, but an emerging grammar form called
a schema can also be used.

(X)HTML: A Structured Language

Looking at the XHTML specification (http://www.w3.org/TR/xhtml1/) you
see that the DTD defines the grammar of the language. Here is a small
excerpt:
<!ELEMENT html (head, body)>
<!ATTLIST html
%i18n;
xmlns %URI; #FIXED
'http://www.w3.org/1999/xhtml' >

In this fragment we see the definition of the root element html, which
encloses a head element followed by a body element. The html
element has an xmlns attribute as well as something called %i18n;, which
is just a macro (a parameter entity) that expands to more attributes such as lang and
dir, which specify aspects of the language in use.

Reading the DTD we can define the structure of an HTML or XHTML
document as shown on the next few slides.
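
The content model (head, body) in the excerpt above can be checked by hand with a few lines of Python. This is only a sketch: it hard-codes that single rule and the fixed xmlns value rather than reading the DTD, and the names local_name and follows_html_content_model are invented for the example.

```python
import xml.etree.ElementTree as ET

# The fixed namespace value from the ATTLIST declaration.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def local_name(element):
    """Strip the '{namespace}' prefix ElementTree puts on namespaced tags."""
    return element.tag.rsplit("}", 1)[-1]


def follows_html_content_model(document):
    """Check that the root is an XHTML html element holding head, then body."""
    root = ET.fromstring(document)
    if root.tag != "{%s}html" % XHTML_NS:
        return False
    return [local_name(child) for child in root] == ["head", "body"]


if __name__ == "__main__":
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml" lang="en">'
        "<head><title>Document Title Goes Here</title></head>"
        "<body><p>Content</p></body>"
        "</html>"
    )
    print(follows_html_content_model(sample))
```

A validating parser does the same kind of check for every element declared in the DTD, which is why a document with a missing head, or with body placed before head, is rejected.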

HTML Structure
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title>Document Title Goes Here</title>
...Head information describing the document and providing
supplementary information goes here....
</head>
<body>
...Document content and markup go here....
</body>
</html>

XHTML Structure
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" lang="en">


<head>
<title>Document Title Goes Here</title>
...Head information describing the document and providing
supplementary information goes here....
</head>
<body>
...Document content and markup go here....
</body>
</html>

Document Types
All documents should begin with a <!DOCTYPE> declaration.
In the basic sense it identifies the HTML dialect used in a document by
referencing an external DTD.
A DTD defines the actual elements, attributes, and element
relationships that are valid in documents.

Modern browsers are aware of the <!DOCTYPE> and will examine it
to determine what rendering mode to enter (standards vs. quirks).
This process is often dubbed the "doctype switch".

Using the <!DOCTYPE> declaration allows validation software to
identify the DTD being followed in a document, and verify that the
document is syntactically correct; in other words, that all tags used
are part of a particular specification and are being used correctly.

Document Types Contd.

A <!DOCTYPE> statement often looks like

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

or

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

or

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Notice that the latter two examples are more appropriate because they provide the
actual URL of the DTD in question.
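
To make the two parts of a <!DOCTYPE> concrete, here is a hedged Python sketch that pulls the public identifier and the optional DTD URL out of a document. It relies on the standard-library html.parser; DoctypeFinder and doctype_identifiers are illustrative names. It does not attempt to reproduce how any particular browser picks its rendering mode.

```python
import re
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Remembers the first DOCTYPE declaration seen in a document."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. 'DOCTYPE HTML PUBLIC "..."'.
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def doctype_identifiers(markup):
    """Return (public identifier, DTD URL or None), or None if no DOCTYPE exists."""
    finder = DoctypeFinder()
    finder.feed(markup)
    finder.close()
    if finder.doctype is None:
        return None
    quoted = re.findall(r'"([^"]*)"', finder.doctype)
    public_id = quoted[0] if quoted else None
    system_id = quoted[1] if len(quoted) > 1 else None
    return public_id, system_id


if __name__ == "__main__":
    page = (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
        '"http://www.w3.org/TR/html4/loose.dtd">\n'
        "<html><head><title>t</title></head><body></body></html>"
    )
    print(doctype_identifiers(page))
```

For the first form shown above, the second item of the result is None, which is exactly the missing DTD URL that the later forms supply.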

Common HTML Doctypes


HTML Version        <!DOCTYPE> Declaration

2.0                 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">

3.2                 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

4.0 Transitional    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
                    "http://www.w3.org/TR/html4/loose.dtd">

4.0 Frameset        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN"
                    "http://www.w3.org/TR/html4/frameset.dtd">

4.0 Strict          <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
                    "http://www.w3.org/TR/html4/strict.dtd">

4.01 Transitional   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
                    "http://www.w3.org/TR/html4/loose.dtd">

4.01 Frameset       <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
                    "http://www.w3.org/TR/html4/frameset.dtd">

4.01 Strict         <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                    "http://www.w3.org/TR/html4/strict.dtd">

Common XHTML Doctypes


XHTML Version                   Doctype

XHTML 1.0 Transitional          <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
                                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

XHTML 1.0 Strict                <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

XHTML 1.0 Frameset              <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
                                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

XHTML 1.1                       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
                                "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

XHTML 2.0 (still in progress)   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN"
                                "http://www.w3.org/TR/xhtml2/DTD/xhtml2.dtd">

HTML Version Summary


HTML Version        Description

2.0                 Classic HTML dialect supported by browsers such as Mosaic.
                    This form of HTML supports core HTML elements and features
                    such as tables and forms but does not consider any of the
                    browser innovations or advanced features such as style
                    sheets, scripting, or frames.

3.0                 The proposed replacement for HTML 2.0 that was never widely
                    adopted, most likely due to the heavy use of
                    browser-specific markup.

3.2                 A version of HTML finalized by the W3C in early 1997 that
                    standardized most of the HTML features introduced in
                    browsers such as Netscape 3. This version of HTML supports
                    many presentation elements, such as fonts, as well as early
                    support for some scripting features.

4.0 Transitional    The 4.0 transitional form, finalized by the W3C in December
                    of 1997, preserves most of the presentation elements of
                    HTML 3.2. It provides a basis for transition to CSS as well
                    as a base set of elements and attributes for multiple
                    language support, accessibility, and scripting.

4.0 Strict          The strict version of HTML 4.0 removes most of the
                    presentation elements from the HTML specification, such as
                    fonts, in favor of using Cascading Style Sheets (CSS) for
                    page formatting.

4.0 Frameset        The frameset specification provides a rigorous syntax for
                    framed documents that was lacking in previous versions of
                    HTML.

4.01 Transitional/  A minor update to the 4.0 standard that corrects some of
Strict/Frameset     the errors in the original specification.

XHTML Version Summary


XHTML Version       Description

1.0 Transitional    A reformulation of HTML as an XML application. The
                    transitional form preserves many of the basic presentation
                    features of HTML 4.0 transitional but applies the strict
                    syntax rules of XML to HTML.

1.0 Strict          A reformulation of HTML 4.0 strict using XML. This language
                    is rule-enforcing and leaves all presentation duties to
                    technologies such as Cascading Style Sheets (CSS).

1.1                 A minor change to XHTML 1.0 that restructures the definition
                    of XHTML 1.0 to modularize it for easy extension. It is not
                    commonly used at the time of this writing and offers minor
                    gains over XHTML 1.0.

2.0                 A new implementation of XHTML, circa 2003, that may not
                    provide backward compatibility with XHTML 1.0 and
                    traditional HTML. XHTML 2 will likely remove most or all
                    presentational tags left in HTML and will introduce even
                    more logical ideas to the language.

Given that there are numerous versions of HTML and XHTML, it is important
to know which browsers support which technologies. A brief overview is given on
the next few slides.

<html> tag
Looking deeper at the document we see that the <html> tag
delimits the beginning and the end of an HTML document.
Given that <html> is the common ancestor of every element in an HTML
document, it is often called the root element, as it is the root of
an inverted tree structure containing the tags and content of a
document.
The <html> tag, however, directly contains only the <head>
tag and the <body> tag, or potentially the <frameset> tag
instead of the <body> tag.
Interestingly, the <html> tag is not required under standard HTML.
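
The inverted tree rooted at <html> can be made visible with a short recursive walk. The sketch below assumes well-formed markup without a namespace so that tag names print plainly, and the function name outline is made up for this example.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    document = (
        "<html>"
        "<head><title>First XHTML Example</title></head>"
        "<body><h1>Welcome</h1><hr /><p>XHTML <b>really</b> works</p></body>"
        "</html>"
    )
    print("\n".join(outline(ET.fromstring(document))))
```

The output starts with html at the left margin, with head and body indented one level beneath it and their own children one level further in, which is the tree the slide describes.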

<head> and <body>

The head of a document, delimited by <head>, includes supplementary
information about the document including the document title, scripts, styles, meta
information, etc.

The most important head element is <title>
<title> is mandatory even under older HTML specifications
<title> should be the first tag in the <head> under traditional HTML and
must be the first tag under XHTML
<title> is used for bookmarking, navigation, searching, etc.
<title> will not render markup: <title><b>Yow!</b></title>
The title may, however, contain entities: <title>PINT &copy; 2003</title>

The <body> of a document contains the actual content and appropriate markup
to render the page.

There should be only one head section (<head>) and one body section
(<body>) in a document.

Under old HTML, both <head> and <body> are actually optional.

Within the <body>

HTML follows a content enclosure model of large structures containing
smaller structures.
Within the body you have block-level elements which define structural content
blocks like paragraphs (<p>) or headings (<h1>).
Within block structures we see inline elements like bold (<b>), emphasis (<em>),
and so on, as well as straight text content and entities such as &lt; or &#060;,
which insert the < symbol.
Typically block elements create formatting boxes and cause returns, and inline
elements do not cause returns.

Further structures like lists (<ul>), images (<img>), scripts (<script>), and
multimedia objects (<object>) are also found in the <body> but may fall
outside the hierarchy you might expect.

The concept of tags enclosing only certain types of other tags is dubbed the
content model.
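
The two entity spellings mentioned above for the < symbol can be tried out with Python's standard html module. This is a small sketch; the wrapper names decode_entities and encode_markup are assumptions of the example.

```python
import html


def decode_entities(text):
    """Replace references such as &lt; or &#060; with the characters they name."""
    return html.unescape(text)


def encode_markup(text):
    """Escape <, >, and & so the text can sit safely inside element content."""
    return html.escape(text, quote=False)


if __name__ == "__main__":
    print(decode_entities("&lt;"))
    print(decode_entities("&#060;"))
    print(decode_entities("PINT &copy; 2003"))
    print(encode_markup("Use <p> for paragraphs"))
```

Both the named form &lt; and the numeric form &#060; decode to the same < character, and encoding goes the other way when literal markup characters need to appear as text.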

The Rules of (X)HTML

HTML is not case sensitive; XHTML is
<b> and <B> are the same under HTML
<p ALIGN="center"> and <p align="center"> are also the same
XHTML forces lowercase, so always use lowercase even in HTML

HTML/XHTML attribute values may be case sensitive
Mostly related to URL values
<img src="test.gif"> is the same as <img SRC="test.gif"> under HTML
<img src="test.gif"> may not be the same as <img src="TEST.GIF">

(X)HTML collapses runs of white space to a single white space character
<b>This is a test</b>
renders the same as
<b>This    is    a    test</b>
Under some elements like <pre> or <textarea> whitespace rules may be different
Lack of whitespace understanding can create visual problems and result in
wasted bandwidth.
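
The case and whitespace rules above can be approximated in a few lines of Python. The standard-library html.parser lowercases tag and attribute names but leaves attribute values alone, much as the slide describes for HTML. The whitespace helper is a deliberately crude stand-in for what a browser does in normal text flow (it also trims the ends, and it ignores special cases such as <pre>). All names here are illustrative.

```python
from html.parser import HTMLParser


class StartTagRecorder(HTMLParser):
    """Records every start tag with its attribute list as the parser reports it."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, attrs))


def parsed_start_tags(markup):
    """Return (tag, attributes) pairs with names normalized by the parser."""
    recorder = StartTagRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.tags


def collapse_whitespace(text):
    """Reduce every run of spaces, tabs, and newlines to a single space."""
    return " ".join(text.split())


if __name__ == "__main__":
    print(parsed_start_tags('<P ALIGN="center">'))
    print(parsed_start_tags('<img src="TEST.GIF">'))
    print(collapse_whitespace("This    is\n\n   a    test"))
```

Tag and attribute names compare equal regardless of case, while test.gif and TEST.GIF stay distinct values, and whether they name the same file depends on the server.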

The Rules of (X)HTML Contd.

(X)HTML elements should be nested, not crossed
Nested = Good
<b><i>This is bold and italic </i></b>
Crossed = Bad
<b><i>Don't do this </b></i>

(X)HTML follows a content model
Some tags are only allowed within others
For example, unordered lists (<ul>) should only contain list items (<li>), so
the common markup <ul><p>test</p></ul> is actually illegal.

Elements should have close tags unless empty
<p> should have </p> even though under HTML it is optional
Empty elements should self-close (e.g. <hr />)

Unused (empty) elements may be minimized by the browser
<p></p><p></p><p></p> is not going to do what you expect
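
The "nested, not crossed" rule amounts to a stack discipline: each end tag must match the most recently opened element. The Python sketch below applies that idea using the standard-library html.parser. NestingChecker, nesting_problems, and the short VOID_ELEMENTS list are assumptions made for the example, and the checker follows the stricter XHTML habit of expecting every non-empty element to be closed.

```python
from html.parser import HTMLParser

# A few empty elements that never enclose content (not an exhaustive list).
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


class NestingChecker(HTMLParser):
    """Tracks open elements on a stack and notes end tags that do not match."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append("unexpected </%s>" % tag)


def nesting_problems(markup):
    """Return a list of nesting problems; an empty list means properly nested."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems + ["unclosed <%s>" % tag for tag in checker.open_tags]


if __name__ == "__main__":
    print(nesting_problems("<b><i>This is bold and italic </i></b>"))
    print(nesting_problems("<b><i>Don't do this </b></i>"))
```

The nested example yields no problems, while the crossed one is reported because </b> arrives while <i> is still the innermost open element.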

The Rules of (X)HTML Contd.

Attributes should be quoted
Under standard HTML ordinals did not have to be: <p align=center>
Under XHTML you always quote: <p align="center">

Browsers ignore unknown attributes and elements
<bogus>Look at me!</bogus>
<p today="Tuesday">What!?</p>

Despite all these rules, you will find that browsers allow just about anything to
render.
Beware: "tag soup" HTML, common or not, does not lend itself to
maintenance and is not future-proof!
With the rise of XHTML we do actually need to know what is going on!
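
A tolerant parser behaves much like the browsers described above: unknown tags and attributes are simply passed over and the enclosed text survives. The following Python sketch shows this with the standard-library html.parser. TextCollector and visible_text are invented names, and this is only an analogy for browser behavior, not a model of it.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Keeps the text content and pays no attention to which tags surround it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Skip runs that contain only whitespace (e.g. line breaks between tags).
        if data.strip():
            self.chunks.append(data)


def visible_text(markup):
    """Return the non-blank text runs found in a markup string."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks


if __name__ == "__main__":
    sample = '<bogus>Look at me!</bogus>\n<p today="Tuesday">What!?</p>'
    print(visible_text(sample))
```

Neither the made-up <bogus> element nor the today attribute causes an error; both text runs come through untouched.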

Major Themes
Logical and Physical markup
Logical markup says what something means; physical markup
describes how something looks.
<b> is physical markup and <strong> is logical markup.
What are <p>, <h1>, <head>, <body>?
Standards vs. Practice
Question: How do most people think about HTML?
Answer: Physically
Consider a WYSIWYG editor: does it encourage logical markup?
Consider the value of logical markup but be pragmatic!

HTML is the English of the Web: poorly spoken but well
understood.

Myths about (X)HTML


(X)HTML is a WYSIWYG design language
(X)HTML is a programming language
Traditional HTML is going away soon
XHTML will take the public by storm
XHTML is not useful
Hand coding of (X)HTML is always the way to go
(X)HTML is all you need to know
