
HTML 4 - User's Guide
The HTML global structure

Contents:
1. SAMPLES
1.1 A sample source page
1.1.1 The HTML page
1.1.2 The head section
1.1.3 The body section
1.2 Blocks, margins and padding
2. USAGE
2.0 Terminology
2.1 The HTML page
2.2 Document head
2.2.1 The HEAD element
2.2.2 Meta information
2.2.3 Profiles
2.2.4 The TITLE element
2.2.5 Other elements in the HEAD contents
2.3 Document body
2.3.1 The body contents and the BODY element
2.3.2 Inline and block elements
2.3.3 Data grouping: the DIV and SPAN elements
2.3.4 Headers: the H1, H2, H3, H4, H5, H6 elements
2.3.5 Paragraphs and lines: the P and BR elements
2.4 Common attributes
2.4.1 The id attribute
2.4.2 The class attribute
2.4.3 The title attribute
2.4.4 The style attribute
2.5 Comments
3. SYNTAX
3.1 The global structuring elements
3.2 The head contents elements

1. SAMPLES

1.1 A sample source page

Here is a sample page containing the elements most commonly used to give a page its structure.
<HTML>
<HEAD>
  <META http-equiv="Content-Type" content="text/html">
  <META name="keywords" content="HTML, user's guide, tutorial">
  <TITLE>HTML GLOBAL STRUCTURE</TITLE>
</HEAD>
<BODY style="background:#DDEEFF">
<H2 style="color:#00A000">Gulliver's Travels</H2>
<DIV style="margin:1em; text-align:center"><DIV style="font-style:italic">Articles of 
Impeachment against Quinbus Flestrin,</DIV>(the Man-Mountain)<BR>ARTICLE I</DIV>
<P style="text-indent:20; margin-top:0.3em; text-align:justify">WHEREAS, by a Statute 
made in the Reign of his Imperial Majesty 
<SPAN style="font-style:italic;">Calin Deffar Plune</SPAN>, it is enacted, That whoever 
shall make water within the Precincts of the Royal Palace, shall be liable to the pains 
and Penalties of High Treason: Notwithstanding, the said <SPAN style="font-style:italic">
Quinbus Flestrin</SPAN>, in open Breach of the said Law, under cover of extinguishing 
the Fire kindled in the Apartment of his Majesty's most dear Imperial Consort, did 
maliciously, traitorously, and devilishly, by discharge of his Urine, put out the said 
Fire kindled in the said Apartment, lying and being within the Precincts of the said 
Royal Palace; against the Statute in that Case provided, etc. against the Duty etc.
</P>
<DIV style="font-weight:bold; color:red"  title="real author: Jonathan Swift 
(1667-1745)">Lemuel Gulliver</DIV>From 
<SPAN style="font-style:italic">TRAVELS into Several Remote Nations of the WORLD</SPAN>
</BODY></HTML>


1.1.1 The HTML page

The HTML page is enclosed between the <HTML> and </HTML> tags. These are called the starting (or start) tag and the ending (or end) tag of the HTML element. Whatever lies between the starting and the ending tags makes up the contents of the element.

Internally, the page comprises 2 sections:

- the head section contained between the <HEAD> and </HEAD> tags; in other words, it is the contents of the HEAD element
- the body section that makes up the contents of the BODY element.

1.1.2 The head section

In the example, the head section contains 3 elements:
- 2 META elements
- 1 TITLE element

The first META element has the attributes:
- http-equiv="Content-Type"
- content="text/html"
'http-equiv' and 'content' are the names of the attributes; 'Content-Type' and 'text/html' are their values. This element declares the content type of the document to be text/html.

The second META element has the attributes:

- name="keywords"
- content="HTML, user's guide, tutorial"
The 'name' attribute defines a property of the page named 'keywords'. The 'content' attribute specifies the value of this property as 'HTML, user's guide, tutorial'. This element thus creates a list of 3 keywords: HTML, user's guide, and tutorial. The list is intended for the search engines that roam the Internet.

The TITLE element has the title "HTML GLOBAL STRUCTURE" as its contents. Internet Explorer and Netscape Navigator, like most user agents, display this title on the title bar of the window.

1.1.3 The body section

The body section constitutes the contents of the BODY element. This is the part of the document that is rendered in the user's window.

It is character text interspersed with HTML elements, contained between the <BODY> and </BODY> tags.

The BODY element here has the attribute style="background:#DDEEFF", which defines the background color of the document body.

Here are the elements included in the present BODY contents:

- H2: The H2 element contains a header, i.e. text that the browser renders with emphasis. Here the element has the attribute style="color:#00A000", which defines the foreground color, i.e. the color of the characters in the element.
- DIV: The DIV element delimits a sequence of text (there are 3 instances of it). Such a sequence is rendered as a block that starts on a new line and ends its own line (the subsequent text starts on a new line). The second instance is embedded in the first.
These instances of the DIV element have a style attribute that defines properties of the text contained in the element:
. the margin property defines the margin that surrounds the DIV block; 'em' is a unit of length equal to the font size of the element, so the first DIV block has a margin of 1em on every side, while the second DIV block has no margin (it sticks to the text above and below)
. the text-align property defines the text alignment; the text in the first DIV block is centered
. the font-style property defines the font style of the text as 'italic'
. the font-weight property defines the font weight of the text in the block as 'bold'
. the color property defines the text color in the block
The second block, which is embedded in the first, is also centered: it inherits the text-align property from the enclosing element. It does not inherit the margin property, so it has no margin (this behaviour is natural enough if you think of it).
- BR: The BR element causes a line break. It has no contents.
- P: The P element delimits a paragraph. Like the DIV element, it isolates a sequence of text in a block, except that by default it sets an empty line above and below the block. You can create these lines for a DIV block with the margin property, or more specifically the margin-top and margin-bottom properties. Similarly, you can suppress the empty lines above and/or below a P block by using these properties (e.g. style="margin-bottom:0"). In the sample, the style attribute of the element sets the properties:
. text-indent, which causes the first line in the block to be indented by 20 pixels
. margin-top, which defines the top margin of the block as 0.3em
. text-align, which specifies that the text be justified, i.e. that spaces be inserted between words so that the text is also aligned on the right margin
- SPAN: The SPAN element delimits a sequence of text that is rendered in line with the surrounding text. Here the SPAN element is used to assign the italic font style to the enclosed text.

1.2 Blocks, margins and padding

<DIV style="text-align:justify; padding:1em; text-indent:20; background:#DDCC44">
<DIV style="width:30%; float:left; text-align:justify; font-size:85%; border:thin solid 
blue; margin:1em; padding:0.5em; background:#DDFFDD;color:blue;">This address was 
delivered by President Abraham Lincoln, at the dedication of the cemetery at Gettysburg. 
The Gettysburg battle was fought from July 1st to July 3rd near the little village by 
that name, 35 miles southwest of Harrisburg, Pa. It claimed 23,000 casualties from the 
Federal forces and 20,000 from the Confederates. The Confederate invasion under 
Gen. Robert E. Lee was stopped there. This battle is generally considered the turning 
point of the war.</DIV>
<P style="text-align:center; font-weight:bold; color:#DD0000">
ADDRESS DELIVERED AT THE DEDICATION OF THE CEMETERY AT GETTYSBURG</P>
<DIV>Four score and seven years ago our fathers brought forth on this continent, a new 
nation, conceived in Liberty, and dedicated to the proposition that all men are 
created equal.</DIV>
<DIV>Now we are engaged in a great civil war, testing whether that nation, or any nation 
so conceived and so dedicated, can long endure. We are met on a great battle-field of 
that war. We have come to dedicate a portion of that field, as a final resting place for 
those who here gave their lives that that nation might live. It is altogether fitting and 
proper that we should do this.</DIV>
<DIV>But, in a larger sense, we can not dedicate - we can not consecrate - we can not 
hallow - this ground. The brave men, living and dead, who struggled here, have 
consecrated it, far above our poor power to add or detract. The world will little note, 
nor long remember what we say here, but it can never forget what they did here. It is for 
us the living, rather, to be dedicated here to the unfinished work which they who fought 
here have thus far so nobly advanced. It is rather for us to be here dedicated to the 
great task remaining before us - that from these honored dead we take increased devotion 
to that cause for which they gave the last full measure of devotion - that we here highly 
resolve that these dead shall not have died in vain - that this nation, under God, shall 
have a new birth of freedom - and that government of the people, by the people, for the 
people, shall not perish from the earth.
</DIV>
<P style="font-weight:bold;">ABRAHAM LINCOLN</P>
November 19, 1863 
</DIV>

The entire text is enclosed in a DIV element. This element has a style attribute that defines a number of properties common to all the enclosed text, where not otherwise redefined:

- text-align:justify - the text is justified (aligned on the right margin as well as on the left)
- padding:1em - a padding 1em thick (1em being the font size of the element) surrounds the contents of the block
- text-indent:20 - the first line of the text is indented by 20 pixels
- background:#DDCC44 - the background is of a yellowish color

The first embedded DIV element has a style attribute that defines the following properties:

- width:30% - the element block is allocated 30% of the page width (or more accurately, 30% of the width of the containing DIV block, which by default extends over the entire page width)
- float:left - the block is floated to the left and lets the surrounding text flow on its right
- font-size:85% - the font size is 85% of that of the surrounding text
- border:thin solid blue - the text block is surrounded by a thin, solid blue border
- margin:1em - the block is surrounded by a margin 1em thick (the margin is on the outer side of the border)
- padding:0.5em - the block is padded by a surrounding space 0.5em thick (the padding comes between the text and the border)
- background:#DDFFDD - the background is of a greenish color
- color:blue - the text characters are blue

The P element contains the document title: 'ADDRESS DELIVERED AT THE DEDICATION OF THE CEMETERY AT GETTYSBURG'. Its style attribute centers the text in the available space and renders it in bold, red characters.

The main part of the text is enclosed in three DIV elements, which cause line breaks between these parts of the text. Having no attributes of their own, these elements inherit the inheritable properties of the enclosing DIV element, in particular the indentation. You could use BR elements instead to cause the line breaks, but then only the very first line in the block would be indented; a line that follows a BR element is not.
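
The difference between the two ways of breaking lines can be tried with a short fragment such as the following. This is only a sketch: the texts are invented for the illustration, and the indentation is written with an explicit unit (20px).

```html
<!-- Each inner DIV is a block of its own: every first line is indented -->
<DIV style="text-indent:20px">
  <DIV>First part: its first line is indented.</DIV>
  <DIV>Second part: its first line is indented too.</DIV>
</DIV>

<!-- A single block broken with BR: only the very first line is indented -->
<DIV style="text-indent:20px">
  Only this first line is indented.<BR>
  This line follows a BR element and is not indented.
</DIV>
```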

2. USAGE

2.0 Terminology

HTML uses markup to specify how a document is to be processed.

The markup comes in the form of tags, which most often come in pairs:
- a starting tag -- example: <DIV>
- an ending tag -- example: </DIV>
The ending tag is sometimes omitted.

The starting and the ending tags, and the data between them, constitute an element.

The data between the starting and the ending tags are the contents of the element.

A starting tag has the following syntax, with the tag contents enclosed between angle brackets:

<tagname attributes>
The tag has a name that comes right after the < sign, without intervening spaces. This name is part of the HTML language.
Example: <P style="font-weight:bold"> -- P is the tag name.

This name is also called the type of the element.

The attributes, which can be mandatory or optional, convey additional information about the element. They appear in the starting tag and have the form:

name="value"
where:
- name is the name of the attribute
- value is the value of the attribute
Example: style="font-weight:bold" -- 'style' is the attribute name -- 'font-weight:bold' is the attribute value.

Each element type has a set of attributes defined by the HTML language. Each attribute has a set of allowable values.
The quotes (single or double) around the value are not always required, but are recommended.

An ending tag has the following syntax:

</tagname>
The same tag name as in the starting tag is repeated after the </ symbol, without intervening spaces.
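
The vocabulary above can be illustrated with a few lines of markup. The texts are made up for the example; the elements are those already used in the samples.

```html
<!-- Starting tag with one attribute, contents, ending tag -->
<P style="font-weight:bold">This text is the contents of the P element.</P>

<!-- Ending tag omitted: the next starting tag implicitly ends the first P -->
<P>First paragraph
<P>Second paragraph

<!-- BR has no contents and no ending tag -->
First line<BR>Second line
```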

2.1 The HTML page

The whole of an HTML document is contained between an <HTML> and a </HTML> tag.

It is composed of 2 parts:

- a document head that contains general information on the document; this information controls the rendering of the document, but is not displayed.
- a document body displayed for the user to see.

2.2 Document head

The following is the syntax of the HEAD element with the elements that it can contain. These elements are presented with some (but not all) of their attributes. A complete description of each element is to be found elsewhere.
   <HEAD profile="uri" 
         lang="languageCode" dir="LTR|RTL">
      <TITLE>contents</TITLE>
      <BASE href="uri" target="frameTarget">
      <META http-equiv="name" name="name"
            content="cdata" scheme="cdata">
      <LINK rel="linkType"  type="contentType"
            href="uri" hreflang="languageCode"
            charset="charset" media="media">
      <STYLE attributes>style</STYLE>
      <SCRIPT attributes>script</SCRIPT>
      <OBJECT attributes>contents</OBJECT>
   </HEAD>
General information on the HTML document is contained in the document head, i.e. the contents of the HEAD element. This element can contain, in any order:
- one mandatory TITLE element
- one optional BASE element
- the following elements, in any number (including 0)
. META element
. LINK element
. STYLE element
. SCRIPT element
. OBJECT element
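
As an illustration, a document head that follows these rules might look like the sketch below. The URIs, the file name guide.css and the script variable are hypothetical and serve only to fill in the syntax.

```html
<HEAD>
  <!-- The one mandatory element -->
  <TITLE>HTML global structure</TITLE>
  <!-- At most one BASE element -->
  <BASE href="http://www.example.org/guide/">
  <!-- Any number of META, LINK, STYLE, SCRIPT and OBJECT elements -->
  <META name="keywords" content="HTML, user's guide, tutorial">
  <LINK rel="stylesheet" type="text/css" href="guide.css">
  <STYLE type="text/css">H2 {color:#00A000}</STYLE>
  <SCRIPT type="text/javascript">var visits = 0;</SCRIPT>
</HEAD>
```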

2.2.1 The HEAD element

The HEAD element has in its contents the elements that compose the document head. The information in these contents is not displayed. Scripts placed there are executed while the document is loading, but the functions they define run only when they are called.

The HEAD element has one specific attribute, profile. It references an external document containing a profile, which sets rules for the meta data of the document. Meta data and profiles are discussed in the following sections.

The format of the external document where a profile is defined is not part of the HTML specification. Supposedly, browsers know some of the formats. When working with a browser, the referenced profile must obviously be among those known to the browser.

2.2.2 Meta information

Some of the information in the document head can describe the document, without being rendered on the displayed page. Such information, to keep things simple, can be e.g. the author of the page, or the date and time when it expires. This information is called meta data.

Practically, this information is a set of properties each of which has a name and a value.

These are intended for use by the user agent, the server, a search engine or other such creatures that roam the Internet.

Such properties are defined by means of the META element, or, possibly, for an important class of them, the LINK element. In the META element, the property name is defined by the name, or alternatively, by the http-equiv attribute; the value is defined by the content attribute. The LINK element is defined elsewhere.

One use of the meta data is to state the content type of the document. This information is specified in the http-equiv and content attributes:

<META http-equiv="Content-Type" content="text/html">
This notation defines a property with the name 'Content-Type' and the value 'text/html'. It means that the document contains data of the 'text/html' type.

Meta data defined by the http-equiv attribute are meant as equivalents of the HTTP headers of the same name: a server may use them to build the headers of its HTTP response, and a user agent may use them when the corresponding header is absent. They do not reliably override a header that is actually sent with the document; for the character encoding in particular, HTML 4 gives the HTTP 'Content-Type' header priority over the META declaration.

Some other uses of the meta data are:

- Create a list of keywords to help search engines find your document:
<META name="keywords" content="HTML, meta, tutorial, header">
- Define the default scripting language for the document
<META http-equiv="Content-Script-Type" content="text/javascript">
- Define the default style sheet language for the document
<META http-equiv="Content-Style-Type" content="text/css">
text/css is in fact the default style sheet language, but it is advisable to state the property explicitly.
- Define the character encoding of the document. This is not to be confused with the character set, although the terminology in use is confusing enough. The document character set of HTML is always UCS (Universal Character Set, defined in ISO 10646), which is character-by-character equivalent to Unicode. The encoding, specified by a parameter or attribute named charset (hence the confusion), is the method used to convert the characters of the document into bytes for storage or for transfer over a communication network. Example:
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
This encoding is usable for most Western European languages.

2.2.3 Profiles

A profile sets the rules for defining and using the meta data in a document. It is defined in a separate document referred to by the profile attribute of the HEAD element.

The syntax and processing rules of profiles are not defined in the W3C HTML specification. They are supposed to be known to the user agent in use.

A profile can define, for example, a list of property names that can be used in a document.
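
A head section that refers to a profile could be written as in the following sketch. The profile URI is hypothetical, and the example simply assumes that this profile defines the property names 'author' and 'keywords'.

```html
<HEAD profile="http://www.example.org/profiles/core">
  <TITLE>Using a profile</TITLE>
  <!-- Property names assumed to be defined by the profile above -->
  <META name="author" content="Jonathan Swift">
  <META name="keywords" content="HTML, meta, tutorial, header">
</HEAD>
```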

2.2.4 The TITLE element

The TITLE element defines the title of the document. Internet Explorer, Netscape Navigator and most other user agents display this title on the title bar of the window.

2.2.5 Other elements in the HEAD contents

The other elements that can be contained in the HEAD contents are:
- LINK
- STYLE
- SCRIPT
- OBJECT

They are described elsewhere.

The LINK element is an alternative to the META element for defining properties that have a URI as a value.
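
For instance, properties whose values are URIs could be declared as follows. The file names are invented for the illustration; 'stylesheet', 'next' and 'prev' are link types defined by HTML 4.

```html
<HEAD>
  <TITLE>Chapter 2</TITLE>
  <!-- Each LINK names a relationship (rel) and the URI of the related document (href) -->
  <LINK rel="stylesheet" type="text/css" href="guide.css">
  <LINK rel="next" href="chapter3.html">
  <LINK rel="prev" href="chapter1.html">
</HEAD>
```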

2.3 Document body

The document body contains all the information to be displayed.

2.3.1 The body contents and the BODY element

The document body consists of the contents of the BODY element. In principle, only one BODY element exists in an HTML source page.

The contents of the BODY element are made up of character text and all the elements that hold the information to be displayed, or otherwise processed with a view to displaying it.

2.3.2 Inline and block elements

With respect to how they are inserted in the surrounding text, elements fall into two categories:
- Inline elements, which are inserted in the same line as the text that precedes and follows them. The height of an element can exceed that of a normal line of characters; in this case the line height is adjusted to accommodate the element (with text only, this happens when the text in the element is given an enlarged size; with images, an image several regular lines high can be inserted in a text).
- Block elements, which form a block separated from the surrounding text. That is, the preceding line is terminated, the element starts on a new line, and the following text starts on a new line below the element.

Both categories are illustrated by the elements presented in the following sections.
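
The two categories can be compared in a small fragment like the one below, a sketch with invented text that uses SPAN as the inline element and DIV as the block element.

```html
<DIV>
  This sentence contains an <SPAN style="font-size:200%">enlarged</SPAN> word that stays
  on the same line; the line is made taller to accommodate it.
  <DIV style="background:#DDFFDD">This DIV block starts on a new line.</DIV>
  The text after the block starts on a new line as well.
</DIV>
```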

2.3.3 Data grouping: the DIV and SPAN elements

A part of a document can be isolated from its surrounding text, e.g. for the purpose of giving it special properties (some of these properties, such as background color, text color, font size and font weight, are explained in the following; more details on properties are to be found in another chapter).

The elements used for this purpose are:
- DIV, a block type element
- SPAN, an inline type element

Here is a sample source code using these elements:

If this be not plain enough in the story of Isaac and Ishmael, he that will look into
<SPAN style="font-size:70%">I</SPAN> Chron. vs. <SPAN style="font-size:70%">I</SPAN>. may
there read these words:<DIV style="font-size:85%; margin:1em">Reuben was the first-born,
but forasmuch as he defiled his father's bed, his birthright was given to the sons of
Joseph, the son of Israel, and the genealogy is not to be reckoned after the birthright;
for Judah prevailed above his brethren, and of him came the chief ruler; but the
birthright was Joseph's.</DIV>What this birthright was, Jacob, blessing Joseph
(Gen. xlviii. 22), telleth us in these words:
<SPAN style="background:#DDFFDD">Moreover, I have given thee one portion above thy
brethren, which I took out of the hand of the Amorite with my sword and with my
bow.</SPAN> Whereby it is not only plain that the birthright was nothing but a double
portion, but the text in Chronicles is express against our author's doctrine and shows
that dominion was not part of the birthright; for it tells us that Joseph had the
birthright but Judah the dominion.
<DIV style="margin-top:2.5em"> <SPAN style="font-weight:bold">John Locke</SPAN> From
<SPAN style="font-style:italic">Two Treatises of Government</SPAN> - The First Treatise -
Chapter XI, 115</DIV>
The uses of the SPAN and DIV elements in this example are:
- <SPAN style="font-size:70%">I</SPAN> - The SPAN element contains just the character I; its style attribute reduces the font size of this character to 70% of the surrounding size, so that it looks like the Roman numeral that it is.
- <DIV style="font-size:85%; margin:1em"> - This sets the font size of the DIV contents to 85% of the surrounding size, and a margin of 1em around the block ('em' is a unit of length equal to the font size of the element).
- <SPAN style="background:#DDFFDD"> - This sets the background of the SPAN contents to a greenish color.
- <DIV style="margin-top:2.5em"> - This element wraps the author name and the reference title, in order to put them in a block separate from the rest of the text. The attribute style="margin-top:2.5em" sets a margin of 2.5em on top of the block, to separate it from the text above; no margin is set on the three other sides.
- <SPAN style="font-weight:bold">John Locke</SPAN> - sets the author name in bold typeface
- <SPAN style="font-style:italic">Two Treatises of Government</SPAN> - sets the book name in italic typeface
There exist shorter methods to set a typeface to bold or italic, or to cause a line break.

2.3.4 Headers: the H1, H2, H3, H4, H5, H6 elements

Headers can be set in the elements H1, H2, ... H6. The browser renders them as enhanced text. H1 is the most important and H6 the least, as shown in the following table:

Source             Rendered   Normal text
<H1>header</H1>    Header     normal for comparison
<H2>header</H2>    Header     normal for comparison
<H3>header</H3>    Header     normal for comparison
<H4>header</H4>    Header     normal for comparison
<H5>header</H5>    Header     normal for comparison
<H6>header</H6>    Header     normal for comparison

2.3.5 Paragraphs and lines: the P and BR elements

These elements complete the minimum usually needed to give structure to an HTML page:
- The P element contains a paragraph, i.e. its contents form a block of text separate from the surrounding text
- The BR element forces a line break
The text contained in the P element is formed into a block separated from the preceding text by an empty line. The margin-top property, set by means of the style attribute, can modify the height of this line.
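
The following sketch, with invented text, shows the two elements together. It assumes that the empty line between two adjacent paragraphs is to be removed entirely, which is why both the bottom margin of the first paragraph and the top margin of the second are set to 0.

```html
<P style="margin-bottom:0">First paragraph, with its bottom margin removed.</P>
<P style="margin-top:0">Second paragraph, with its top margin removed, so that no empty
line separates it from the first.<BR>
This line starts after a forced line break, within the same paragraph.</P>
```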

2.4 Common attributes

These attributes can be used in almost all elements.

2.4.1 The id attribute

The value of an id attribute must be unique in the HTML document, and uniquely identifies an element in the document.
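
One common use of the id attribute, sketched below, is as the target of a link within the page. The id value 'usage' and the texts are chosen for the illustration only.

```html
<H2 id="usage">Usage</H2>
<P>Text of the section.</P>
<!-- The fragment identifier #usage designates the element whose id is "usage" -->
<A href="#usage">Back to the Usage section</A>
```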

2.4.2 The class attribute

The class attribute assigns the element to the class named by the attribute value. Elements of a given type that belong to a specified class can be assigned properties in a STYLE element or a style sheet. There, the following notation is used to designate the elements of type 'typename' and class 'classname':
typename.classname
The properties of such elements are defined by the notation:
typename.classname {propertyname:propertyvalue; ...}
Example:
SPAN.sample {background:#FFFFEE; color:blue; font-family:Courier New, monospace; }
This assigns to the SPAN element of the class 'sample' (<SPAN class="sample" ...>) the properties:
- background:#FFFFEE - background of a cream color
- color:blue - foreground color 'blue'
- font-family:Courier New, monospace - font family "Courier New" or an available monospace font
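
Put together in a page, the rule and an element of the class might appear as in this sketch. The page title and the body text are invented; the font name is quoted here because it contains a space.

```html
<HTML>
<HEAD>
  <TITLE>Using the class attribute</TITLE>
  <STYLE type="text/css">
    SPAN.sample {background:#FFFFEE; color:blue; font-family:"Courier New", monospace}
  </STYLE>
</HEAD>
<BODY>
  <!-- Only SPAN elements of the class 'sample' receive the properties above -->
  The tag <SPAN class="sample">&lt;BR&gt;</SPAN> forces a line break,
  while this <SPAN>plain SPAN</SPAN> is left unchanged.
</BODY>
</HTML>
```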

2.4.3 The title attribute

The title attribute specifies a text that pops up when you point at the element with the mouse.

2.4.4 The style attribute

The style attribute can be used to assign a semicolon-separated list of properties to an element:
style="propertyname:propertyvalue; propertyname:propertyvalue; ..."

2.5 Comments

A comment can be inserted anywhere in an HTML document:
- before the <HTML> tag
- within the HEAD element
- between the HEAD and the BODY elements
- within the BODY element
- etc.

The syntax of a comment is:
<!--comment text -->
where:
<!-- is the 4-character sequence that starts the comment
comment text is any text that does not contain two adjacent hyphens (--)
--> is the 3-character sequence that ends the comment.
A sample page with comments is:
<!-- These comments come before the <HTML> markup -->
<HTML><HEAD><TITLE>USING COMMENTS</TITLE></HEAD>
<!-- These comments come between the HEAD markup 
and the document BODY -->
<BODY>
<!-- comments included in BODY -->
This is an HTML page with comments
</BODY>
</HTML>

3. SYNTAX

3.1 The global structuring elements

The elements used to globally structure an HTML document are:
- HTML: The HTML element contains the HTML document
- HEAD: The HEAD element contains the head information
- BODY: The BODY element contains the document body
In place of the BODY element, the information to be displayed in a document can be contained in a FRAMESET element, which can itself contain embedded FRAMESET elements. Such a document is a multi-part document. This is dealt with in another chapter.
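
As a rough sketch of such a multi-part document, a FRAMESET with an embedded FRAMESET could be written as below. The file names and frame names are hypothetical, and the details of the FRAMESET and FRAME elements are left to the chapter that covers them.

```html
<HTML>
<HEAD><TITLE>A frameset document</TITLE></HEAD>
<!-- FRAMESET takes the place of BODY: two columns, the second split into two rows -->
<FRAMESET cols="30%,70%">
  <FRAME src="contents.html" name="contents">
  <FRAMESET rows="50%,50%">
    <FRAME src="top.html" name="top">
    <FRAME src="bottom.html" name="bottom">
  </FRAMESET>
</FRAMESET>
</HTML>
```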

3.2 The head contents elements

The HEAD element can contain the following elements:
- TITLE one mandatory element
- BASE optional, at most one
- META optional, in any number
- LINK optional, in any number
- STYLE optional, in any number
- SCRIPT optional, in any number
- OBJECT optional, in any number

These are described in the following table, which gives for each element its purpose, its attributes and its contents:

META - Specifies meta information on the document.
attributes:
. name (name="name") - Specifies a property name.
. content (content="cdata") - Specifies the value of the property named by the name attribute.
. scheme (scheme="cdata") - Provides complementary information to help the user agent correctly interpret the value specified by the content attribute. For example, if the content is a date in the form "10-9-03", does it mean the 10th of September or the 9th of October 2003? A scheme value such as "Month-Day-Year" will lift the ambiguity.
. http-equiv (http-equiv="name") - Used in place of the name attribute to define a property that can be equivalently defined by an entity header sent along with the document in an HTTP communication. For example, http-equiv="Content-Type" defines the Content-Type property, which can also be specified in the Content-Type entity header that comes with the HTML document. A server may use the value of the associated content attribute to build the equivalent HTTP header; the value does not reliably override a header that is actually sent.
. lang (lang="languageCode") - Specifies the language used in the HTML document.
. dir (dir="LTR|RTL") - Specifies the reading direction of the language used in the document. The default is LTR (Left To Right); RTL is Right To Left.

HEAD - Contains the overall information on the document.
attributes:
. profile (profile="uri-list") - The attribute value is a space-separated list of URIs that identify the locations of meta data profiles (see 2.2.3 above). Presently, only the first URI in the list is taken into account, but the whole list may be used in the future.
. lang (lang="languageCode") - Specifies the language used in the HTML document.
. dir (dir="LTR|RTL") - Specifies the reading direction of the language used in the document. The default is LTR (Left To Right); RTL is Right To Left.
contents: the elements TITLE, BASE, META, LINK, STYLE, SCRIPT, OBJECT

TITLE - Contains the text to be displayed on the browser title bar.
attributes:
. lang (lang="languageCode") - Specifies the language used in the HTML document.
. dir (dir="LTR|RTL") - Specifies the reading direction of the language used in the document. The default is LTR (Left To Right); RTL is Right To Left.
contents: text to be displayed as title

BASE - Explicitly defines the base URI used to evaluate relative URIs in the page (by default, the base URI is the URI of the document itself).
attributes:
. href (href="uri") - mandatory - Specifies the base URI.
. target (target="targetFrame") - optional - Specifies the name of the frame where linked documents are to be opened by default.
contents: empty

LINK - Defines links to other documents, such as a style sheet or a profile. For more, see the appropriate document.

STYLE - Defines style properties for use in the current page. For more, see the appropriate document.

SCRIPT - Defines functions to be called in the page, or initializes variables. For more, see the appropriate document.

OBJECT - See the appropriate document.
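
The effect of the BASE element on relative URIs can be sketched as follows. The URIs and the frame name 'main' are hypothetical.

```html
<HTML>
<HEAD>
  <TITLE>Using BASE</TITLE>
  <BASE href="http://www.example.org/guide/index.html" target="main">
</HEAD>
<BODY>
  <!-- Resolved against the base URI: http://www.example.org/guide/samples/page1.html -->
  <!-- With target="main" in BASE, the link opens by default in the frame named main -->
  <A href="samples/page1.html">First sample</A>
</BODY>
</HTML>
```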
