HTML 4 - User's Guide
The HTML global structure
1. SAMPLES
1.1 A sample source page
Here is a sample page that contains the most usual elements to give the page a structure.

<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html">
<META name="keywords" content="HTML, user's guide, tutorial">
<TITLE>HTML GLOBAL STRUCTURE</TITLE>
</HEAD>
<BODY style="background:#DDEEFF">
<H2 style="color:#00A000">Gulliver's Travels</H2>
<DIV style="margin:1em; text-align:center"><DIV style="font-style:italic">Articles of Impeachment against Quinbus Flestrin,</DIV>(the Man-Mountain)<BR>ARTICLE I</DIV>
<P style="text-indent:20; margin-top:0.3em; text-align:justify">WHEREAS, by a Statute made in the Reign of his Imperial Majesty <SPAN style="font-style:italic;">Calin Deffar Plune</SPAN>, it is enacted, That whoever shall make water within the Precincts of the Royal Palace, shall be liable to the pains and Penalties of High Treason: Notwithstanding, the said <SPAN style="font-style:italic"> Quinbus Flestrin</SPAN>, in open Breach of the said Law, under cover of extinguishing the Fire kindled in the Apartment of his Majesty's most dear Imperial Consort, did maliciously, traitorously, and devilishly, by discharge of his Urine, put out the said Fire kindled in the said Apartment, lying and being within the Precincts of the said Royal Palace; against the Statute in that Case provided, etc. against the Duty etc. </P>
<DIV style="font-weight:bold; color:red" title="real author: Jonathan Swift (1667-1745)">Lemuel Gulliver</DIV>From <SPAN style="font-style:italic">TRAVELS into Several Remote Nations of the WORLD</SPAN>
</BODY>
</HTML>
1.1.1 The HTML page
The HTML page is enclosed between the <HTML> and </HTML> symbols. These are called the starting (or start) and the ending (or end) tag of the HTML element. What is contained between the starting and the ending tags makes up the contents of the element.
Internally, the page comprises 2 sections:
- the head section, contained between the <HEAD> and </HEAD> symbols; to put it differently, it is the contents of the HEAD element
- the body section, which makes up the contents of the BODY element.
1.1.2 The head section
In the example, the head section contains 3 elements: two META elements and one TITLE element.
The first META element has the attributes:
- http-equiv="Content-Type"
- content="text/html"
'http-equiv' and 'content' are the names of the attributes, 'Content-Type' and 'text/html' their values. This element defines the content type of the document as being text/html.
The second META element has the attributes:
- name="keywords"
- content="HTML, user's guide, tutorial"
The 'name' attribute defines a property of the page that is given the name 'keywords'. The 'content' attribute specifies the value of this property as 'HTML, user's guide, tutorial'.
This element creates the list of 3 keywords HTML, user's guide, and tutorial. This list is intended for the search engines that roam the Internet.
The TITLE element has as its contents the title "HTML GLOBAL STRUCTURE". Internet Explorer and Netscape Navigator, like most user agents, display this title on the title bar of the window.
1.1.3 The body section
The body section constitutes the contents of the BODY element. This is the part of the document that is rendered in the user's window.
It is character text interspersed with HTML elements, contained between the <BODY> and </BODY> tags.
The present BODY element has the style="background:#DDEEFF" attribute. This attribute defines the background color of the body.
Here are the elements included in the present BODY contents:
Element | Description
---|---
H2 | The H2 element contains a header, i.e. a text that the browser renders in enhanced type. Here the element has the attribute style="color:#00A000", which defines the foreground (character) color of the element.
DIV | The DIV element delimits a sequence of text (there are 3 instances of it in the sample). Such a sequence is rendered as a block that starts on a new line and is followed by a line break, so that the subsequent text begins on a new line. The second instance is embedded in the first. Each instance has a style attribute that defines properties of the text contained in the element: margin:1em and text-align:center for the first, font-style:italic for the second, font-weight:bold and color:red for the third.
BR | The BR element causes a line break. It has no contents.
P | The P element delimits a paragraph. It isolates a sequence of text in the same way as the DIV element, except that by default it sets an empty line above and below the block. You can create these lines for a DIV block with the margin property, or more specifically the margin-top and margin-bottom properties. Similarly, you can suppress the empty lines above and/or below a P block by using these properties (e.g. style="margin-bottom:0"). Here the element has a style attribute with the properties text-indent:20, margin-top:0.3em and text-align:justify.
SPAN | The SPAN element delimits a sequence of text that is rendered in-line with the surrounding text. Here the SPAN element is used to assign the italic font style to the enclosed text.
1.2 Blocks, margins and padding

<DIV style="text-align:justify; text-indent:20; background:#DDCC44">
<DIV style="width:30%; float:left; text-align:justify; font-size:85%; border:thin solid blue; margin:1em; padding:0.5em; background:#DDFFDD;color:blue;">This address was delivered by President Abraham Lincoln, at the dedication of the cemetery at Gettysburg. The Gettysburg battle was fought from July 1st to July 3rd near the little village by that name, 35 miles southwest of Harrisburg, Pa. It claimed 23,000 casualties from the Federal forces and 20,000 from the Confederates. The Confederate invasion under Gen. Robert E. Lee was stopped there. This battle is generally considered the turning point of the war.</DIV>
<P style="text-align:center; font-weight:bold; color:#DD0000"> ADDRESS DELIVERED AT THE DEDICATION OF THE CEMETERY AT GETTYSBURG</P>
<DIV>Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.</DIV>
<DIV>Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.</DIV>
<DIV>But, in a larger sense, we can not dedicate - we can not consecrate - we can not hallow - this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us - that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion - that we here highly resolve that these dead shall not have died in vain - that this nation, under God, shall have a new birth of freedom - and that government of the people, by the people, for the people, shall not perish from the earth. </DIV>
<P style="font-weight:bold;">ABRAHAM LINCOLN</P>
November 19, 1863
</DIV>
The entire text is enclosed in a DIV element. This element has a style attribute that defines a number of properties common to all the enclosed text, where not otherwise redefined:

Property | Effect
---|---
text-align:justify | the text is justified (aligned on both the left and the right margins)
padding:1em | a padding 1em thick (1em is the font size of the element) surrounds the text inside the block; note that this property does not appear in the sample source above
text-indent:20 | the first line of the text is indented by 20 pixels
background:#DDCC44 | the background is of a yellowish color
The first embedded DIV element has a style attribute that defines the following properties:
Property | Effect
---|---
width:30% | the element block is allocated 30% of the page width (or more accurately, 30% of the width of the containing DIV block, but the latter extends over the entire page width by default)
float:left | the block is floated to the left and lets the surrounding text flow on its right
font-size:85% | the font size is 85% of that of the surrounding text
border:thin solid blue | the text block is surrounded by a thin, solid blue border
margin:1em | the block is surrounded by a margin 1em thick, 1em being the font size of the element (the margin is on the outer side of the border)
padding:0.5em | the block is padded by a surrounding space 0.5em thick (the padding comes between the text and the border)
background:#DDFFDD | the background is of a greenish color
color:blue | the text characters are blue
The P element contains the document title: 'ADDRESS DELIVERED AT THE DEDICATION OF THE CEMETERY AT GETTYSBURG'. Its style attribute centers the text in the available space and renders it in bold, red characters.
The essential part of the text is enclosed in three DIV elements. The DIV elements cause line breaks between these parts of the text. Having no attributes of their own, these elements inherit the properties of the enclosing DIV element, particularly the indentation. You could use BR elements instead to cause the line breaks, but then only the very first line in the block would be indented: a line that follows a BR element is not.
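The difference between the two ways of breaking lines can be tried with a minimal sketch such as the one below. The placeholder text is hypothetical, and the indent is written here as 20px, with an explicit unit, on the assumption that 20 pixels is the intended value.

```html
<!-- Each inner DIV is a block of its own, so each one gets the inherited first-line indent -->
<DIV style="text-indent:20px">
  <DIV>First part of the text.</DIV>
  <DIV>Second part of the text.</DIV>
</DIV>

<!-- One single block: only its very first line is indented, the line after BR is not -->
<DIV style="text-indent:20px">
  First part of the text.<BR>
  Second part of the text.
</DIV>
```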
2. USAGE
2.0 Terminology
The HTML language uses markup to specify the treatment to be applied to a document.
The markup comes in the form of tags that most often come in pairs:
- a starting tag -- example: <DIV>
- an ending tag -- example: </DIV>
The ending tag is sometimes omitted.
The starting and the ending tags, and the data between them, constitute an element.
The data between the starting and the ending tags are the contents of the element.
A starting tag has the following syntax, with the tag contents enclosed between angle brackets:
<tagname attributes>
The tag name immediately follows the < sign, without intervening spaces. This name is part of the HTML language. It is also called the type of the element.
The attributes, which can be mandatory or optional, convey complementary information on the elements. They are to be found in the starting tag and have the form:
name="value"
- name is the name of the attribute
- value is the value of the attribute
Example: style="font-weight:bold" -- 'style' is the attribute name -- 'font-weight:bold' is the attribute value.
Each element type has a set of attributes defined by the HTML language. Each attribute has a set of allowable values.
The quotes (single or double) around the value are not always required, but are recommended.
An ending tag has the following syntax:
</tagname>
The tag name immediately follows the </ symbol, without intervening spaces.
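The terminology above can be illustrated with a short fragment. This is only a sketch with made-up text; it shows an element with both tags, an element without contents, and elements whose ending tag is omitted (which HTML 4 allows for P).

```html
<!-- Starting tag with one attribute, contents, then ending tag -->
<DIV style="font-weight:bold">Some bold text</DIV>

<!-- BR has no contents and no ending tag -->
First line<BR>Second line

<!-- The ending tag of P is omitted: the next P starting tag ends the paragraph -->
<P>First paragraph
<P>Second paragraph
```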
2.1 The HTML page
The whole of an HTML document is contained between an <HTML> and a </HTML> tag.
It is composed of 2 parts:
- a document head that contains general information on the document; this information controls the rendering of the document, but is not displayed.
- a document body displayed for the user to see.
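As a minimal sketch of this two-part structure (the title and the body text are placeholders), a complete page can be as short as:

```html
<HTML>
<HEAD>
  <!-- Document head: general information, not displayed in the page -->
  <TITLE>Page title</TITLE>
</HEAD>
<BODY>
  <!-- Document body: displayed for the user to see -->
  Text of the page.
</BODY>
</HTML>
```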
2.2 Document head
The following is the syntax of the HEAD element with the elements that it can contain. These elements are presented with some (but not all) of their attributes. A complete description of each element is to be found elsewhere.

<HEAD profile="uri" lang="languageCode" dir="LTR|RTL">
<TITLE>contents</TITLE>
<BASE href="uri" target="frameTarget">
<META http-equiv="name" name="name" content="cdata" scheme="cdata">
<LINK rel="linkType" type="contentType" href="uri" hreflang="languageCode" charset="charset" media="media">
<STYLE attributes>style</STYLE>
<SCRIPT attributes>script</SCRIPT>
<OBJECT attributes>contents</OBJECT>
</HEAD>

General information on the HTML document is contained in the document head, i.e. in the contents of the HEAD element. This element can contain, in any order:
- one mandatory TITLE element
- one optional BASE element
- the following elements, in any number (including 0): META, LINK, STYLE, SCRIPT, OBJECT
2.2.1 The HEAD element
The HEAD element has in its contents the elements that compose the document head. The information in these contents is not displayed. Scripts placed there typically define functions or initialize variables for later use in the page.
The HEAD element has one specific attribute, profile. It references an external document that contains a profile, which sets rules for the meta data of the document. Meta data are discussed in the next section.
The format of the external document where a profile is defined is not part of the HTML specification. Supposedly, browsers know some of the formats. When working with a browser, the referenced profile must obviously be among those known to the browser.
2.2.2 Meta information
Some of the information in the document head describes the document without being rendered on the displayed page. Such information can be, for example, the author of the page, or the date and time when it expires. This information is called meta data.
Practically, this information is a set of properties, each of which has a name and a value.
These are intended for use by the user agent, the server, a search engine or other such creatures that roam the Internet.
Such properties are defined by means of the META element, or, possibly, for an important class of them, the LINK element. In the META element, the property name is defined by the name, or alternatively, by the http-equiv attribute; the value is defined by the content attribute. The LINK element is defined elsewhere.
One use of the meta data is to declare the content type of the document. This information is specified in the http-equiv and content attributes:
<META http-equiv="Content-Type" content="text/html">
Meta data defined by http-equiv are equivalent to the HTTP headers of the same name. For example, the above META element carries the same information as the 'Content-Type' HTTP header sent along with the document in an HTTP message. HTTP servers may use such elements to generate the response headers. Note that when a server does send a real header of the same name, user agents generally give it precedence over the META element; the HTML 4 specification states this explicitly for the character encoding.
Some other uses of the meta data are:
- Create a list of keywords to help search engines find your document.
- Define the default scripting language for the document.
- Define the default style sheet language for the document.
- Define the default character encoding. This is not to be confused with the character set, although the terminology in use is confusing enough. The document character set for HTML is always UCS (Universal Character Set, defined in ISO 10646), which is equivalent to Unicode. The character encoding is named in the charset attribute (hence the confusion) of some elements; it is the method used to convert the characters of a document into bytes for storage or for transfer over a communication network.
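The uses listed above could be written along the following lines. This is a sketch rather than a reproduction of the original examples: the header names Content-Script-Type and Content-Style-Type come from the HTML 4 specification, while the particular values (JavaScript, CSS, ISO-8859-1) are assumptions chosen for illustration.

```html
<HEAD>
<TITLE>Meta data examples</TITLE>
<!-- List of keywords for search engines -->
<META name="keywords" content="HTML, user's guide, tutorial">
<!-- Default scripting language for the document (assumed: JavaScript) -->
<META http-equiv="Content-Script-Type" content="text/javascript">
<!-- Default style sheet language for the document (assumed: CSS) -->
<META http-equiv="Content-Style-Type" content="text/css">
<!-- Content type with the character encoding given in the charset parameter (assumed: ISO-8859-1) -->
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</HEAD>
```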
2.2.3 Profiles
A profile sets the rules for defining and using the meta data in a document. It is defined in a separate document referred to by the profile attribute of the HEAD element.
The syntax and processing rules of profiles are not defined in the W3C HTML specification. They are supposed to be known to the user agent in use.
A profile can define, for example, a list of property names that can be used in a document.
2.2.4 The TITLE element
The TITLE element defines the title of the document. The Internet Explorer and Netscape Navigator browsers, like most user agents, display this title on the title bar of the window.
2.2.5 Other elements in the HEAD contents
Besides TITLE and META, the HEAD contents can include the BASE, LINK, STYLE, SCRIPT and OBJECT elements, which are summarized in section 3.2.
The LINK element is an alternative to the META element for defining properties that have a URI as a value.
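For instance, properties whose value is a URI might be declared as in the sketch below. The file names and the base URI are hypothetical; the link types stylesheet and next are among those listed in the HTML 4 specification.

```html
<HEAD>
<TITLE>LINK examples</TITLE>
<!-- Base URI used to resolve the relative URIs below (hypothetical address) -->
<BASE href="http://www.example.com/guide/">
<!-- External style sheet (hypothetical file name) -->
<LINK rel="stylesheet" type="text/css" href="styles.css">
<!-- Next document in a series (hypothetical file name) -->
<LINK rel="next" href="chapter2.html">
</HEAD>
```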
2.3 Document Body
2.3.1 The document contents and the BODY element
The contents of the BODY element are made up of character text and all the elements that hold the information to be displayed, or otherwise processed for display.
2.3.2 Inline and block elements
The elements in the document body are of two kinds:
- Inline elements, which are inserted in the same line as the text that precedes and follows them. The height of the element can exceed that of a normal line of characters; in this case the line height is adjusted to accommodate the element. With text only, this happens when the text in the element is given an enlarged size; with images, an image as tall as several regular lines can be inserted in a text.
- Block elements, which are in a block separated from the surrounding text. That is, the preceding line is terminated, the element is inserted starting on a new line, and the following text is on a new line below the element.
Both kinds are illustrated by the elements presented in the following sections.
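A short fragment may help to see the two behaviors side by side. It is only a sketch with placeholder text, using SPAN as the inline element and DIV as the block element.

```html
<!-- SPAN is inline: it stays in the same line as the text around it,
     and the line height grows to fit the enlarged characters -->
Text before <SPAN style="font-size:200%">enlarged inline text</SPAN> text after.

<!-- DIV is a block: it starts on a new line, and the text after it starts on a new line too -->
Text before <DIV>text in a block</DIV> text after.
```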
2.3.3 Data grouping: the DIV and SPAN elements
A part of a document can be isolated from its surrounding text, e.g. for the purpose of giving it special properties. Some of these properties are the background color, the text color, and the font size and weight; they are explained in the following, and more details on properties are to be found in another chapter.
The elements used for this purpose are:
- DIV, a block type element
- SPAN, an inline type element
Here are the DIV and SPAN tags used in a sample page (an extract from John Locke's Two Treatises of Government), with their effects:
- <SPAN style="font-size:70%">I</SPAN> - The SPAN element contains just the I character; its attribute reduces the font size of this character to 70% of the general size, so that it looks like the roman numeral which it is.
- <DIV style="font-size:85%; margin:1em"> - This sets the font size of the DIV contents to 85% of the general size, and a margin of 1em around the block (1em is the font size of the element, roughly one line height).
- <SPAN style="background:#DDFFDD"> - This sets the background color of the SPAN contents to a greenish color.
- <DIV style="margin-top:2.5em"> - This element wraps the author name and the reference title, in order to put them in a block separate from the rest of the text. The attribute style="margin-top:2.5em" sets a margin of 2.5em on top of the block, to separate it from the text above; no margin is set on the three other sides.
- <SPAN style="font-weight:bold">John Locke</SPAN> - sets the author name in bold face type.
- <SPAN style="font-style:italic">Two Treatises of Government</SPAN> - sets the book name in italic typeface.
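The tags described above could be combined as in the following sketch. The nesting and the placeholder text are assumptions made for illustration; only the tags and their style attributes are taken from the list.

```html
<!-- Heading numeral in a reduced font size -->
CHAPTER <SPAN style="font-size:70%">I</SPAN>

<!-- Quoted extract: smaller font, 1em margin on all four sides -->
<DIV style="font-size:85%; margin:1em">
  Placeholder for the quoted text, with
  <SPAN style="background:#DDFFDD">a highlighted passage</SPAN>
  in the middle of it.
</DIV>

<!-- Author and reference in a separate block, 2.5em below the text above -->
<DIV style="margin-top:2.5em">
  <SPAN style="font-weight:bold">John Locke</SPAN>,
  <SPAN style="font-style:italic">Two Treatises of Government</SPAN>
</DIV>
```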
2.3.4 Headers: the H1, H2, H3, H4, H5, H6 elements
Headers can be set in the elements H1, H2, ... H6. The browser will render them in enhanced text. H1 is the most important, H6 the least, as shown in the following table:

Source | Rendered | Normal text
---|---|---
<H1>header</H1> | Header |
<H2>header</H2> | Header | normal for comparison
<H3>header</H3> | Header | normal for comparison
<H4>header</H4> | Header | normal for comparison
<H5>header</H5> | Header | normal for comparison
<H6>header</H6> | Header | normal for comparison
2.3.5 Paragraphs and lines: the P and BR elements
These elements complete the minimum usually needed to give structure to an HTML page:
- The P element contains a paragraph, i.e. its contents are a block of text separate from the surrounding text.
- The BR element forces a line break.
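The following sketch, with placeholder text, contrasts a default paragraph, a paragraph whose surrounding empty lines are suppressed, a DIV block given paragraph-like spacing, and a forced line break. The 1em spacing is an assumption about what browsers typically apply to P by default.

```html
<!-- Default paragraph: browsers typically leave an empty line above and below -->
<P>A paragraph with the default spacing.</P>

<!-- Same element with the empty lines suppressed -->
<P style="margin-top:0; margin-bottom:0">A paragraph without empty lines around it.</P>

<!-- A DIV block given spacing similar to that of a default paragraph -->
<DIV style="margin-top:1em; margin-bottom:1em">A DIV block spaced like a paragraph.</DIV>

<!-- BR breaks the line inside a block without starting a new block -->
First line<BR>second line
```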
2.4 Common attributes
These attributes can be used in almost all elements.
2.4.1 The id attribute
The value of an id attribute must be unique in the HTML document; it uniquely identifies an element in the document.
2.4.2 The class attribute
The class attribute assigns the element to the class named by the attribute value. Elements of a given type and pertaining to a specified class can be assigned properties in a STYLE element or a style sheet. Therein, the following notation is used to designate the elements of type 'typename' and class 'classname':
typename.classname
Properties are assigned to these elements with the syntax:
typename.classname {propertyname:propertyvalue; ...}
For example, the rule:
SPAN.sample {background:#FFFFEE; color:blue; font-family:Courier New, monospace; }
assigns to the SPAN elements of class 'sample' (i.e. those with a starting tag of the form <SPAN class="sample" ...>) the properties:
- background:#FFFFEE - background of a cream color
- color:blue - foreground color 'blue'
- font-family:Courier New, monospace - font family "Courier New" or an available monospace font
2.4.3 The title attribute
The title attribute specifies a text that pops up when you point at the element with the mouse.
2.4.4 The style attribute
The style attribute can be used to assign a semicolon-separated list of properties to an element:
style="propertyname:propertyvalue; propertyname:propertyvalue; ..."
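The four common attributes can be seen together in a small page such as the sketch below. The id value, the title text and the sample text are hypothetical; the class rule is the SPAN.sample rule given above.

```html
<HTML>
<HEAD>
<TITLE>Common attributes</TITLE>
<STYLE type="text/css">
  /* Properties for all SPAN elements of class 'sample' */
  SPAN.sample {background:#FFFFEE; color:blue; font-family:Courier New, monospace;}
</STYLE>
</HEAD>
<BODY>
<!-- id: unique in the document; class: picks up the rule above;
     title: pop-up text; style: properties for this element only -->
<SPAN id="first-sample" class="sample" title="This text pops up"
      style="font-weight:bold">sample text</SPAN>
</BODY>
</HTML>
```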
2.5 Comments
A comment can be inserted anywhere in an HTML document, with the syntax:
<!-- comment text -->
- <!-- is the 4 character sequence that starts the comment markup
- comment text is any desired reasonable comment text
- --> is the character sequence ending the comment markup.
Example:
<!-- These comments come before the <HTML> markup -->
<HTML><HEAD><TITLE>USING COMMENTS</TITLE></HEAD>
<!-- These comments come between the HEAD markup and the document BODY -->
<BODY>
<!-- comments included in BODY -->
This is an HTML page with comments
</BODY>
</HTML>
3. SYNTAX
3.1 The global structuring elements
The elements used to globally structure an HTML document are:

Element | Description
---|---
HTML | The HTML element contains the HTML document
HEAD | The HEAD element contains the head information
BODY | The BODY element contains the document body
3.2 The head contents elements
The HEAD element can contain the following elements:

Element | Attribute | Comments
---|---|---
META | | Specifies meta information on the document
META | name | name="name" Specifies a property name
META | content | content="cdata" Specifies the value of the property named by the name attribute
META | scheme | scheme="cdata" Provides complementary information to help the user agent correctly interpret the value specified by the content attribute. For example, if the content is a date in the form "10-9-03", does it mean the 10th of September or the 9th of October 2003? A scheme value such as "Month-Day-Year" removes the ambiguity.
META | http-equiv | http-equiv="name" Used in place of the name attribute to define a property that can be equivalently defined by an entity header sent along with the document in an HTTP communication. For example, http-equiv="Content-Type" defines the Content-Type property, which can also be specified in the Content-Type entity header that comes with the HTML document. The value specified in the associated content attribute is equivalent to the information in the HTTP header of the same name.
META | lang | lang="languageCode" Specifies the language used in the HTML document.
META | dir | dir="RTL" Specifies the reading direction of the language used in the document. The default is LTR (Left To Right), RTL is Right To Left.
HEAD | | Contains the overall information on the document
HEAD | profile | profile="uri-list" The attribute value is a space-separated list of URIs that identify the locations of meta data profiles (see 2.2.3 above). Presently, only the first URI in a list is taken into account, but the whole list may be used in the future.
HEAD | lang | lang="languageCode" Specifies the language used in the HTML document.
HEAD | dir | dir="RTL" Specifies the reading direction of the language used in the document. The default is LTR (Left To Right), RTL is Right To Left.
HEAD | (contents) | Elements: TITLE, BASE, META, LINK, STYLE, SCRIPT, OBJECT
TITLE | | Contains the text to be displayed on the browser title bar.
TITLE | lang | lang="languageCode" Specifies the language used in the HTML document.
TITLE | dir | dir="RTL" Specifies the reading direction of the language used in the document. The default is LTR (Left To Right), RTL is Right To Left.
TITLE | (contents) | Text to be displayed as title
BASE | | Explicitly defines the base URI used to evaluate relative URIs in the page (by default, the base URI is the URI that identifies the directory containing the page).
BASE | href | href="uri" - mandatory. Specifies the base URI
BASE | target | target="targetFrame" - optional. Specifies the name of the frame where the document is to be opened.
BASE | (contents) | empty
LINK | | Defines links to other documents, such as a style sheet or a profile. For more, see the appropriate document.
STYLE | | Defines style properties for use in the current page. For more, see the appropriate document.
SCRIPT | | Defines functions to be called in the page, or initializes variables. For more, see the appropriate document.
OBJECT | | See the appropriate document.