
Class Work

Web Technologies

Topic I
There are many Web technologies, from simple to complex, and explaining each in detail is
beyond the scope of this course.

Why HTML is Not a Programming Language


HTML is great. It defines the structure of web pages and it determines how data is displayed
online. What you’re looking at right now is HTML code, read and interpreted by your browser.
But this doesn’t make HTML a programming language.

A Markup Language

HTML is a type of markup language. It encapsulates, or “marks up” data within HTML tags,
which define the data and describe its purpose on the webpage. The web browser then reads the
HTML, which tells it things like which parts are headings, which parts are paragraphs, which
parts are links, etc. The HTML describes the data to the browser, and the browser then displays
the data accordingly.

That’s how the browser knows that

This is a heading
This is a paragraph, and

This is a link

However, this is not programming. The above is not an example of an executable script. The
HTML was only used in the above to mark up the text for the browser to read and interpret as
web page content. It told the browser which parts were headings, which were paragraphs, and
which were links, and the browser displayed them as such. HTML is used for structural purposes
on a web page, not functional ones.
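
The three example lines above appear here without the tags that produced them. As a minimal sketch, assuming hypothetical markup for them (an h1, a p, and an a element with a placeholder URL), the following Python shows how a reading program such as a browser might pair each piece of text with the role its tag declares. The class name RoleReader and the ROLES table are illustrative, not part of any browser.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for the three example lines.
SAMPLE = (
    "<h1>This is a heading</h1>"
    "<p>This is a paragraph, and</p>"
    '<a href="https://example.com/">This is a link</a>'
)

# Human-readable roles for the tags used in SAMPLE.
ROLES = {"h1": "heading", "p": "paragraph", "a": "link"}


class RoleReader(HTMLParser):
    """Pairs each piece of text with the role its enclosing tag declares."""

    def __init__(self):
        super().__init__()
        self.current = None  # role of the tag we are inside, if any
        self.found = []      # (role, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.current = ROLES.get(tag)

    def handle_endtag(self, tag):
        self.current = None

    def handle_data(self, data):
        if self.current:
            self.found.append((self.current, data))


reader = RoleReader()
reader.feed(SAMPLE)
reader.close()
for role, text in reader.found:
    print(f"{role}: {text}")
```

All of the logic lives in the Python program. The markup only supplies the labels that the program reads.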

Not a Programming Language

Programming languages have functional purposes. HTML, as a markup language, doesn’t really
“do” anything in the sense that a programming language does. HTML contains no programming
logic. It doesn’t have common conditional statements such as If/Else. It can’t evaluate
expressions or do any math. It doesn’t handle events or carry out tasks. You can’t declare
variables and you can’t write functions. It doesn’t modify or manipulate data in any way. HTML
can’t take input and produce output. Think of it this way: you can’t compute the sum of 2 + 2 in
HTML; that’s not what it’s for. This is because HTML is not a programming language.
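
To make the 2 + 2 point concrete, here is a small sketch in Python, using the standard html.parser module as a stand-in for whatever reads the markup. The class name TextCollector is made up for this illustration. The text inside the paragraph element comes out exactly as it went in, while the same characters written as a Python expression are evaluated.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text content of a document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


collector = TextCollector()
collector.feed("<p>2 + 2</p>")
collector.close()

html_result = "".join(collector.chunks)  # the text, passed through untouched
python_result = 2 + 2                     # an expression, actually evaluated

print(repr(html_result))  # '2 + 2'
print(python_result)      # 4
```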

Still Awesome

Unfortunately, coding only in HTML doesn’t make you a programmer. In fact, HTML really
shines when you use it in conjunction with an actual programming language, such as when using
a web framework. That’s when you can start serving up dynamically created web pages and
database applications. But don’t worry, even with pure HTML, you’re still a coder. You’re
writing lines of code in a (markup, not programming) language. You’re essentially codifying
information for the web. So while you might not want to put HTML on the “Programming
Languages” part of your resume, you should definitely have it under “Skills”, or simply
“Languages”.

Knowledge of web page structure is a valuable asset for anyone to have, in IT as well as in other
fields, and I’m definitely not trying to discredit anyone’s knowledge on the awesomeness that is
HTML. HTML is a core tenet of front end web development and is obviously a major aspect of
what the user winds up seeing on their computer screen. With the emergence of HTML5,
HTML’s capabilities and opportunities to define and structure web page data have soared to new
heights, with a greater emphasis on multimedia, mobile web, geolocation, and more. This makes
a solid understanding of HTML even more useful to have. So keep rocking the HTML, get to
know it well, and by all means, code away! Just don’t call it programming, per se.

Still think HTML is a programming language? Think “programming” and “coding” is all just
semantics? Let me know in the comments.

Programs vs. markup, or why HTML authoring is not programming

The words "program" and "programming" are often used confusingly. This document tries to
characterize what computer programs and programming languages are and how they differ from
markup, both presentational and logical markup. This hopefully helps in understanding, for
example, the different roles of HTML and programming languages like JavaScript and Perl in
HTML authoring. The difference has some legal impact, too.

The simple (?) question: is HTML a programming language?

It is not uncommon to see people call HTML a programming language, or call HTML authoring
programming. In various classifications, HTML might be classified into a section titled
"Programming". Even documents purported to be HTML tutorials or references may say so.
However, few people are consistent in such usage and call HTML documents programs; this
might indicate that they don't really mean that HTML authoring is programming.

It's hard to tell what is behind this, but calling HTML a programming language might reflect the
use of the word "programming" to mean just writing something that will be processed by
computers rather than people; after all, we might (reasonably) say that writing HTML is coding.
Alternatively, perhaps the varying meanings of the word "program" in everyday language ("TV
program", "study program" etc.) confuse things. Or it might mean a misunderstanding caused by
the fact that HTML is used in conjunction with programming. In particular, an HTML document
might contain a program embedded into it, typically a JavaScript program, or an HTML
document might be generated by a program, typically a Perl program (script). People may miss
the essential distinction between HTML and constructs embedded into it. For example, the
HTML markup

<input name="x" size="30" style="width:100%" onclick="check()">

contains, as the value of the onclick attribute, a function invocation in a programming language.
The attribute itself is part of HTML syntax, but its value is something external to HTML, just as
image formats are. (Moreover, the markup contains an expression, width:100%, in yet another
language, a style sheet language. This is just to illustrate that HTML can be confused not only
with programming languages but with other things external to it.)
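
A rough way to see this is to hand the markup above to an HTML parser and look at what it reports. The sketch below uses Python's standard html.parser module, and the class name AttributeLister is invented for the example. To the parser, check() and width:100% are ordinary attribute values, that is, plain strings. Nothing in the HTML layer calls the function or applies the style.

```python
from html.parser import HTMLParser

MARKUP = '<input name="x" size="30" style="width:100%" onclick="check()">'


class AttributeLister(HTMLParser):
    """Records the attributes of the last start tag seen."""

    def __init__(self):
        super().__init__()
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; every value is a plain string.
        self.attributes = dict(attrs)


lister = AttributeLister()
lister.feed(MARKUP)
lister.close()

for name, value in lister.attributes.items():
    print(f"{name} = {value!r}")
```

Making sense of the onclick value takes a JavaScript engine, and making sense of the style value takes a CSS processor. Both are separate from HTML itself.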

What HTML specifications say about the language

No HTML specification has ever called HTML a programming language, or anything like that.

There are somewhat different views on what HTML is, or should be. The first (!) HTML
specification, HTML 2.0, characterized the language as follows:

The HyperText Markup Language (HTML) is a simple data format used to create hypertext
documents that are portable from one platform to another. HTML documents are SGML
documents with generic semantics that are appropriate for representing information from a wide
range of domains.

That specification is a great improvement in conceptual clarity over its successors, HTML 3.2,
HTML 4 and XHTML 1.0. But none of those specifications calls HTML a programming
language. Instead, they say it is a markup language.

Let's see how the American National Standard Dictionary of Information Technology (ANSDIT)
defines what a markup language is:

markup

Text added to the data of a document to convey information about the document; for example:
tags, processing instructions, and hyperlinks.

markup language

(1) A text-formatting language designed to transform raw text into structured documents, by
inserting procedural and descriptive markup into the raw text. (2) A language designed to
describe or transform in space or time data, text, or objects into structured data, text, or
objects, for example: SGML, HTML, VRML.

So markup is, to put it briefly, information, not instructions. While it could contain "processing
instructions", in the SGML sense for example, these wouldn't really be comparable to
programming. As the HTML 4 specification mentions, effectively among SGML features not
supported by HTML 4 user agents, "Processing instructions are a mechanism to capture
platform-specific idioms". The examples there suggest things like font face and page eject, i.e.
quite comparable to presentational markup.

But doesn't markup mean instructions to computers?

Markup can be divided into two major categories: descriptive (or logical, or structural) markup,
which describes the structure of a document in some way, and procedural (or physical, or
presentational) markup, which specifies how the document should be presented physically.
Obviously, procedural markup is inevitably device-dependent in some sense, or at least
dependent on some general properties of the presentation medium. Page eject does not make
sense in speech. On the other hand, descriptive markup which e.g. divides a document into major
sections could be mapped to different presentations (say, page eject or pause or a divider rule or
image between sections).

For more on this, see Dmitri Kirsanov's Procedural and Descriptive Markup and the deeper explanation by Robin
Cover in SGML: A Textual Representation for Information Structure; Part 2: The Axiological Foundations of
SGML.

Neither descriptive nor procedural markup is programming, though procedural markup
might be somewhat comparable to programming in some respect. And HTML is essentially
descriptive; attempts to use it for procedural markup can have rather limited success only, no
matter how popular such attempts might be.

To compare with natural language constructs, "This car has four doors" is descriptive; "open all
the doors!" is procedural. Neither implies the other. And HTML constructs are generally
descriptive, saying things like "this is a heading", instead of saying "show this in such-and-such a
way". A browser can be programmed to process descriptive markup in a particular way. And for
obvious reasons, a browser generally displays headings in some emphatic way; and similarities
between browsers in this respect may lead people into thinking that markup like <h1>...</h1>
means some particular font size etc. - but it really doesn't. A browser might just as well be
programmed or configured to display first level headings in normal font but distinctive color, and
this might actually be better in a very small (handheld) device. To take another example, a
browser could present a link as underlined blue text that can be clicked on so that a new page
then appears. But that's just one possibility. Another possibility is that an indexing robot has been
programmed to follow all links in a document for indexing purposes (without anyone clicking or
displaying anything).
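
The idea that one piece of descriptive markup can serve quite different consumers can be sketched in Python as follows. Everything here is hypothetical: the page, the placeholder URL, and the two classes, ScreenRenderer (one arbitrary choice of presentation) and IndexingRobot (which displays nothing and only collects link targets).

```python
from html.parser import HTMLParser

# Hypothetical page; the URL is a placeholder.
PAGE = '<h1>News</h1><p>Read the <a href="https://example.com/report">report</a>.</p>'


class ScreenRenderer(HTMLParser):
    """One possible presentation: headings in capitals, links in brackets."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.output = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_heading = True
        elif tag == "a":
            self.output.append("[")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_heading = False
            self.output.append("\n")
        elif tag == "a":
            self.output.append("]")

    def handle_data(self, data):
        self.output.append(data.upper() if self.in_heading else data)


class IndexingRobot(HTMLParser):
    """Another consumer: displays nothing, only records link targets."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


renderer = ScreenRenderer()
renderer.feed(PAGE)
renderer.close()
rendered = "".join(renderer.output)

robot = IndexingRobot()
robot.feed(PAGE)
robot.close()

print(rendered)
print(robot.links)
```

The markup is identical in both runs. What happens to a heading or a link is decided entirely by the program that reads it.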

It is true that there are some (currently deprecated) constructs in HTML that can be regarded as
"commands" or "instructions" in a sense. One might say that <font color="red"> is an
"instruction" to turn font color to red. I wouldn't say so - it is more natural to interpret it as a
suggestion, or hint, concerning presentation - but if you do, then you might say that a browser is
an interpreter that executes such instructions. But that would be very remotely if at all analogous
to, say, a Perl interpreter executing a Perl program (script), which is written in a full-blooded
programming language.

Note that for the width attribute in HTML, for example, the specifications explicitly say that it gives the
"suggested" or "recommended" width (of a table cell, for example). And experience shows that browsers actually treat
such values that way, often overriding the values suggested by authors, sometimes for good reasons, sometimes not.

And the categorization of a language is to be judged according to its characteristic and typical
constructs. As a whole, HTML is clearly a markup language which is declarative ("here's a block
quotation... here's a heading, ... "), not procedural/imperative ("indent so-and-so", "increase font
size", ...). Even if we regarded and used HTML as a procedural markup language (and for such a
purpose, HTML is remarkably limited), this wouldn't make it a programming language or turn
HTML documents into programs. An MS Word document contains procedural markup - in a
specific binary format - for document appearance. (The use of binary format is not essential
here; we might just as well consider the RTF format, which is a physical markup language based
on textual tags.) If HTML documents were programs, MS Word documents would be programs all
the more so - and I'm not even referring to macros. So would PDF documents, TeX
documents, nroff documents, etc. Even procedural markup is not programming; so surely
structural markup isn't either.

It might be argued that a presentational markup language is an interpreted programming
language. And it is true that the concepts "programming language" and "program" are somewhat
vague. A program in the strictest sense of the word, a binary program, is a sequence of machine
instructions directly executed by computer hardware. In a broader sense, a program could be a
"source program" written in a language like Fortran, C, or Cobol and compiled (translated) into a
binary program. In an even broader sense, we might dispense with the compilation, if there is a
program (in the strictest sense of the word!), an interpreter, that reads a "source" program and
executes it interpretively, i.e. performing the actions prescribed in the source program. This
makes things somewhat relative. The same source program could be "run" either via compilation
or by an interpreter. And since virtually anything can be interpreted, after assigning some
meaning to it, we could go to the extremes and say that, for example, any piece of text is a
program. After all, we could interpret the letter "a" as an instruction to print the letter "a" or, to
make it more exciting, the letter "b", or some image.
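
The extreme case is easy to act out. The sketch below is a toy written in Python, with a made-up function name, interpret. It "runs" any text once each character has been assigned a meaning. The assignments are arbitrary, which is exactly why calling the text a program says so little.

```python
def interpret(text, meaning):
    """'Executes' text by replacing each character with its assigned meaning.

    Characters without an assigned meaning are "printed" as themselves.
    """
    return "".join(meaning.get(ch, ch) for ch in text)


# Reading 1: every letter is an instruction to print itself.
as_itself = interpret("a cab", {})

# Reading 2: the letter "a" is an instruction to print "b" instead.
a_means_b = interpret("a cab", {"a": "b"})

print(as_itself)  # a cab
print(a_means_b)  # b cbb
```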

This reductio ad absurdum hopefully indicates that we need to draw the line between interpreted
programs and mere data structures somewhere. We might say that at the very minimum, a
programming language has some control structures for sequentiality, conditionality, and
repetition as well as some methods for storing and retrieving data during processing. Is there any
doubt of where markup languages belong then? In HTML, you cannot compute 1+1, or do much
branching, or repeat anything. HTML has tables, but only as static collections of data.
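
For contrast, here is a short Python sketch that uses each of the minimum ingredients just listed (sequence, a condition, a loop, and stored data) to produce an HTML table. The price list and the 10-unit threshold are invented for the illustration. The computing happens in the program. The table it emits is a static collection of data.

```python
from html import escape

# Hypothetical data, stored in a variable during processing.
prices = [("pen", 2), ("notebook", 5), ("bag", 30)]

rows = []
total = 0
for name, price in prices:                          # repetition
    total += price                                  # computation and storage
    label = "expensive" if price > 10 else "cheap"  # conditionality
    rows.append(
        f"<tr><td>{escape(name)}</td><td>{price}</td><td>{label}</td></tr>"
    )
rows.append(f"<tr><td>total</td><td>{total}</td><td></td></tr>")

table = "<table>\n" + "\n".join(rows) + "\n</table>"
print(table)
```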

Is HTML coding, then?

It is reasonable to say that HTML markup is code (and writing HTML markup is coding),
provided that people understand that it is comparable to using coded notations when talking or
writing. Think about the use of product codes, or using special code books when sending
telegraphs, so that short coded presentations stand for long statements, or using colors as codes
so that red means "stop" or "warning" or "hot". It's a matter of using some notational system
which has been specifically agreed upon. (Actually, natural languages are not completely
different from codes; they too are based on agreements, just more vague and implicit.)

Computer programs are often called "code" - we often say "source code" and "object code"
(i.e. program in machine language) - so care must be taken to avoid the idea that being code
means being program code. Even the phrase "source code" makes sense in conjunction with a
markup language: it can be used to clarify that we refer to an HTML document as containing
markup, rather than the way it might be displayed (or spoken). Some people also say that HTML
is compiled, but this is quite misleading. It cannot be compiled in the sense that programming
languages are compiled. Without going into details here, let's just say this: it's a matter of putting
an HTML document as data into a browser or some other program and perhaps saving the
program in such a state for efficiency. It's packaging, not compilation.

Programs and data

Considering the distinction between programs and data, where does HTML markup fall into the
categorization? Since the markup applies to some textual data, isn't it program rather than data?

The categorization, though often useful, can be misleading. Programs are just a special case of
data - they can be processed in various ways, like copied onto diskettes, sent over the Internet,
etc., just as other data can. But programs are data that can be executed as machine instructions,
or executed in interpretive mode by an interpreter, or compiled into machine instructions by a
compiler. (We can of course decide to use the word "data" in a limited meaning too, as 'any data
which is not a program'.)

In particular, "data" (in the general sense, or as opposed to programs) does not mean only the
simple constituents like characters and numbers on which data processing operates at the low
level. Some confusion may have been caused by the use of the term "data character" in
conjunction with markup, denoting the plain text content of a document as opposed to those
characters which are part of markup. In the HTML element <h2>XYZ</h2>, only XYZ are data
characters in this sense while the rest constitutes markup. But this does not turn markup into
programs. It's comparable to writing a margin note "this is a 2nd level heading". Similarly,
markup like <ins>the</ins> is comparable to using brackets, i.e. [the], in some styles of
writing to indicate inserted text. Surely you don't do any programming if you put brackets around
words that you add into a quotation for clarity.
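
The split between data characters and markup can be shown with a few lines of Python. This is only a sketch using the standard html.parser module, and the names DataCharacters and split_markup are invented here. Like the margin note or the brackets, the tags are set apart from the text as annotations. Nothing in them is executed.

```python
from html.parser import HTMLParser


class DataCharacters(HTMLParser):
    """Keeps the data characters and the tags in two separate lists."""

    def __init__(self):
        super().__init__()
        self.data = []
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tags.append(f"</{tag}>")

    def handle_data(self, data):
        self.data.append(data)


def split_markup(source):
    """Returns (data characters, list of tags) for a piece of HTML."""
    parser = DataCharacters()
    parser.feed(source)
    parser.close()
    return "".join(parser.data), parser.tags


print(split_markup("<h2>XYZ</h2>"))
print(split_markup("<ins>the</ins>"))
```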

Write down a hundred times:


HTML is a data format, not a programming language.
