
HTML 4 - User's Guide
Character and text presentation

Contents:
1. SAMPLES
1.1 Character highlighting
1.1.1 Character escaping
1.1.2 Character highlighting elements
1.1.3 The style attribute
1.2 Text presentation
2. USAGE
2.1 Character reference
2.1.1 Alphabetic reference
2.1.2 Numeric reference
2.2 Special character processing
2.2.1 Space
2.2.2 Hyphen
2.3 Character and text highlighting
3. SYNTAX
3.1 Font style elements
3.2 Text presentation element
3.3 Horizontal rule
3.4 Common attributes
This chapter presents the elements that can be classified as character enhancement or text presentation elements.

1. SAMPLES

1.1 Character highlighting

A sample source page using text highlighting is this:
Here is a sample JavaScript code:
<PRE><CODE>writeln("<B style="COLOR:red">Hello World!</B>");</CODE></PRE>
For this code <EM style="color:blue;">to run</EM>, it <STRONG>must</STRONG> be 
enclosed in a <SMALL style="font-weight:bold">SCRIPT</SMALL> element. It then 
writes the following to the <ABBR title="HyperText Markup Language">HTML</ABBR> 
page:<BR>
<SAMP style="color:red; font-weight:bold;">Hello World!</SAMP>
This page is rendered as:
Here is a sample JavaScript code:
writeln("Hello World!");
For this code to run, it must be enclosed in a SCRIPT element. It then writes the following to the HTML page:
Hello World!

1.1.1 Character escaping
The first listing above is the page source; the second shows how the browser renders it. The listing itself cannot be stored in the page exactly as you see it: every < sign had to be replaced by &lt;. So where you see <PRE>, the source code really is &lt;PRE>. This is called character escaping. It is needed to display the 'less than' sign (<), because the browser interprets everything from a < sign up to the next > sign as an HTML tag and does not display it.
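
The replacement described above is easy to automate. The following is a minimal sketch, not part of the original page: escapeHtml is a hypothetical helper name, and the code assumes that only the & and < characters need escaping, as stated in section 2.1.

```javascript
// Hypothetical helper: escape text so that a browser displays it literally
// instead of interpreting it as markup.
function escapeHtml(text) {
  // Replace & first; otherwise the & of a freshly inserted &lt; would be
  // escaped a second time.
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;");
}

const source = '<PRE><CODE>writeln("Hello World!");</CODE></PRE>';
console.log(escapeHtml(source));
// prints: &lt;PRE>&lt;CODE>writeln("Hello World!");&lt;/CODE>&lt;/PRE>
```

The order of the two replacements matters: escaping < before & would turn &lt; into &amp;lt;.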

1.1.2 Character highlighting elements
The elements used in this page are:

PRE: Preformatted text - space and line feed characters are rendered as in the source text
CODE: Intended for program code - characters are rendered in monospace font
B: bold typeface
EM: Emphasis - rendering is browser dependent (Internet Explorer and Netscape: italics)
STRONG: Strong font characters - rendering is browser dependent (Internet Explorer and Netscape: bold)
SMALL: Small font - used here to reduce the size of the 'SCRIPT' word
ABBR: Abbreviation - the title attribute defines a text (one would expect the full length description of the abbreviation) that pops up when you point at the abbreviation with the mouse - this works with Netscape 7.1, not with Internet Explorer 6.0.
SAMP: Sample computer output, usually rendered in monospace font

1.1.3 The style attribute
All of the elements can have the 'style' attribute which assigns further properties to the text enclosed in the element contents. The exemplified properties are:

color:blue - The 'color' property (foreground, i.e. character, color) has the 'blue' value.
font-weight:bold - The 'font-weight' property (typeface) has the 'bold' value.
Note that no quote sign surrounds the value.
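
To make the form of the attribute value concrete, here is a small sketch that assembles a style attribute from property/value pairs. The function name styleAttribute is hypothetical and the code is only an illustration of the syntax: each property is written as name:value, the declarations are separated by semicolons, and only the attribute value as a whole is quoted.

```javascript
// Hypothetical helper: build a style attribute from property/value pairs.
function styleAttribute(properties) {
  const declarations = Object.entries(properties)
    .map(([name, value]) => name + ":" + value) // no quotes around the value
    .join("; ");
  return 'style="' + declarations + '"';
}

console.log(styleAttribute({ color: "blue", "font-weight": "bold" }));
// prints: style="color:blue; font-weight:bold"
```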

1.2 Text presentation

<HTML><HEAD><TITLE>TOM JONES</TITLE></HEAD><BODY style="background:#BBEEDD;">
<STRONG style="background:#FFDDDD"><FONT color="RED">A CONSPIRACY</FONT></STRONG>
<P style="background:#DDDDFF;">
But affairs were not in so quiet a situation in the bosom of the other conspirator; 
his mind was tossed in all the distracting anxiety so nobly described by 
Shakespeare<SUP style="font-size:70%">1</SUP>:
<BLOCKQUOTE cite="ILLJuliusCaesar.html" 
            style="color:blue; background:#DDFFDD;">
Between the acting of a dreadful thing<BR>
And the first motion, all the interim is<BR>
Like a phantasma or a hideous dream.<BR>
The genius and the mortal instruments<BR>
Are then in council; and the state of man,<BR>
Like to a little kingdom, suffers then<BR>
The nature of an insurrection.</BLOCKQUOTE>
<P style="background:#DDDDFF;">Though the violence of his passion had made him 
eagerly embrace the first hint of this design, especially as it came from a relation 
of the lady, yet when that friend to reflection, a pillow, had placed the action itself 
in all its natural black colours before his eyes, with all the consequences which must, 
and those which might probably, attend it, his resolution began to abate, or rather, 
indeed, to go over to the other side; and after a long conflict, which lasted a whole 
night, between honour and appetite, the former at length prevailed, and he determined 
to wait on Lady Bellaston and to relinquish the design.
</P>
<B style="color:red;">Henry Fielding</B><BR>
From <I>Tom Jones</I>
<HR size="2">
<CITE style="background:lime; font-size:75%;">1. Julius Caesar II, I, 63</CITE>
</BODY></HTML>

The elements used in this page are:

STRONG: Strong font characters - rendering is browser dependent (Internet Explorer and Netscape: bold)
FONT: deprecated - Font properties
P: Its content is a paragraph. The first paragraph ends at the <BLOCKQUOTE> element. The second paragraph ends with an ending </P> element.
SUP: Superscript
B: bold typeface
BLOCKQUOTE: A block of quotation. The browser renders this by indenting the text. The BLOCKQUOTE element is sometimes used just for that.
I: Italic typeface
CITE: Intended for citations
HR: draws a horizontal rule across the page

The attributes

style:
- <STRONG style="background:#FFDDDD"> - defines the background color for the contents of the STRONG element
- <P style="background:#DDDDFF"> - defines the background color for the contents of the <P> element (which ends at the <BLOCKQUOTE> block)
- <SUP style="font-size:70%"> - defines the font size for the superscript, set at 70% of the general text size
- <BLOCKQUOTE style="color:blue;background:#DDFFDD"> - defines the foreground and background colors of the BLOCKQUOTE contents
- <CITE style="background:lime; font-size:75%"> - defines the background and font size of the CITE element contents

cite:
- <BLOCKQUOTE cite="ILLJuliusCaesar.html"> - refers to the source document of the quoted block

2. USAGE

2.1 Character reference

The < and & characters cannot be included "as is" in the HTML page, because they have special functions:
- the < sign is the start of an HTML tag, so the browser interprets all of the characters to its right, up to the next > sign, as tag information not to be displayed; it therefore has to be represented by a so-called character reference, which starts with an ampersand (&).
- the & character can stand alone, but if it is immediately followed by non-white-space characters, it is interpreted as the start of a character reference; so this character also has to be represented by a character reference.

All the < and & signs you see on this page are represented by their character references in the source code.

2.1.1 Alphabetic reference

The most popular character references are:
&lt; to represent < (less than)
&gt; to represent > (greater than)
&amp; to represent & (ampersand)
&quot; to represent " (double quote)
&apos; to represent ' (single quote; this is recognized by Netscape, not by Internet Explorer, and is not among the entities defined by HTML 4)
&shy; to represent the soft hyphen (you do not see it unless there is a line break)
&nbsp; to represent a non-breaking space
Only the &lt; and &amp; representations are always necessary, since the other characters can usually be included directly in an HTML page. The exception is a double quote inside an attribute value that is itself delimited by double quotes, which must be written &quot;. The &nbsp; representation has a special use, described in section 2.2.1.

2.1.2 Numeric reference

Another representation of a character is by its numeric decimal or hexadecimal code.

The decimal code representation is &#n; where n is the numeric code value in ordinary decimal notation. The terminating semicolon is part of the reference.

The hexadecimal representation is &#xhhhh; where:
- the &#x prefix and the terminating ; are to be written as is
- hhhh is the hexadecimal value of the code; leading zeros are optional, so &#xAD; and &#x00AD; are equivalent
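
As an illustration, the sketch below builds both forms of numeric reference for a given character. The function names are hypothetical. Each reference is closed with a semicolon, as in the &shy; and &#xAD; forms used elsewhere on this page, and the hexadecimal form is padded to four digits with leading zeros.

```javascript
// Hypothetical helpers: build the decimal and hexadecimal numeric character
// references for the first character of a string.
function decimalReference(ch) {
  return "&#" + ch.codePointAt(0) + ";";
}

function hexReference(ch) {
  const hex = ch.codePointAt(0).toString(16).toUpperCase();
  return "&#x" + hex.padStart(4, "0") + ";"; // pad with leading zeros
}

console.log(decimalReference("<"));  // prints: &#60;
console.log(hexReference("<"));      // prints: &#x003C;
console.log(hexReference("\u00AD")); // soft hyphen, prints: &#x00AD;
```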

2.2 Special character processing

2.2.1 Space

The HTML language admits the use of white space characters for the convenience of human reading of the source code.

The white space characters, with their hexadecimal codes, are:

- space #x0020
- tab #x0009
- form feed #x000C
- zero-width space #x200B
- carriage return #x000D
- line feed #x000A
The browser ordinarily renders a sequence of white space characters as a single space.

To force multiple spaces into the rendered page, use the &nbsp; (non-breaking space) reference.
A &nbsp; is treated as a regular character, so two words joined by a &nbsp; are treated as one word and are never separated by a line break on the rendered page.
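
The collapsing rule can be modelled roughly as follows. This is a simplified sketch, not a description of any browser's actual algorithm: the function name is hypothetical, and the zero-width space is deliberately left out of the set of collapsed characters.

```javascript
// Rough model of how a browser collapses white space in ordinary text.
// Assumption: only space, tab, form feed, carriage return and line feed are
// collapsed; the zero-width space is ignored in this simplified sketch.
function collapseWhiteSpace(text) {
  return text.replace(/[ \t\f\r\n]+/g, " ");
}

const NBSP = "\u00A0"; // the character that &nbsp; stands for

console.log(JSON.stringify(collapseWhiteSpace("one   two\n\tthree")));
// prints: "one two three"

// Non-breaking spaces are ordinary characters, so a run of them survives.
const kept = collapseWhiteSpace("one" + NBSP + NBSP + NBSP + "two");
console.log(kept.length); // prints: 9
```

The character class is spelled out instead of using \s because, in JavaScript, \s also matches the non-breaking space.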

2.2.2 Hyphen

The browser will not break a word at the end of a line unless you instruct it to do so. You do this by inserting a soft hyphen at each location where a break is allowed. The soft hyphen is represented by:
&shy; or &#xAD;
This is a sample source:
<TABLE style="border:thin solid red"><TR><TD width="140">
hyphen&shy;ated<br>
this word is hyphen&shy;ated<br>
is this word hyphenated ?</TABLE>
This will be rendered as:
hyphen­ated
this word is hyphen­ated
is this word hyphenated ?
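
When the permitted break positions of a word are known, the &shy; references can be inserted mechanically. The sketch below assumes the positions are supplied by the author as character offsets; addSoftHyphens is a hypothetical helper name, and the code does not attempt to find hyphenation points itself.

```javascript
// Hypothetical helper: insert &shy; at the given character offsets, i.e. the
// positions where the browser is allowed to break the word.
function addSoftHyphens(word, offsets) {
  // Work from right to left so that earlier offsets stay valid.
  const sorted = [...offsets].sort((a, b) => b - a);
  let result = word;
  for (const offset of sorted) {
    result = result.slice(0, offset) + "&shy;" + result.slice(offset);
  }
  return result;
}

console.log(addSoftHyphens("hyphenated", [6])); // prints: hyphen&shy;ated
```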

2.3 Character and text highlighting

Character highlighting and text presentation elements can be used anywhere in a page. The characters placed in the contents of such elements are presented in the requested form. A few elements are not supported by all browsers: for example, blinking is supported by Netscape Navigator 7.1 but not by Internet Explorer 6.0.

These elements can be classified in 6 categories:

- elements that act on character formats:
. B: bold typeface
. BIG: big characters
. BLINK: blinking characters
. EM: emphasized text
. FONT: font properties
. I: italic typeface
. SMALL: small characters
. STRIKE: struck characters
. STRONG: strong typeface
. TT: monospace font
- elements that mark the text as being of a certain type; the browsers are expected to render them in some specific fashion, or authors can assign them special properties using a style sheet or the STYLE element:
. ABBR: the text is an abbreviation
. ACRONYM: the text is an acronym
. DFN: the text is a definition
. CITE: the text is a citation
. CODE: the text is program code
. KBD: the text is a keyboard entered sequence
. SAMP: the text is a sample program output
. VAR: the text is an instance of a variable
- elements that act on text blocks:
. PRE: preformatted block
. BLOCKQUOTE: quoted text block - indented
. SUB: subscript
. SUP: superscript
- elements that define paragraphs and line breaks:
. P: paragraph
. BR: line break
- elements to note insertion and deletion:
. INS: inserted text
. DEL: deleted text
- horizontal rule:
. HR: draws a horizontal rule across the page

3. SYNTAX

The elements are classified into font style elements and text presentation elements, in accordance with the W3C documentation.

3.1 Font style elements

This table shows the character highlighting elements and their effect. More flexibility is gained by using the style sheet properties.

All the elements in this table have the common attributes.

Near equivalent style properties (the numbers are referenced in the table):
1. font-weight:bold
2. font-size:larger
3. color:blue
4. font-size:110%
5. font-style:italic
6. font-size:small
7. font-family:monospace - program code as specified by the CODE element is also in a fixed-pitch font, but characters in CODE and TT are rendered differently

Tag | Source | Comment
B | <B>characters</B> | Bold font (note 1)
BIG | <BIG>characters</BIG> | Big font (note 2)
BLINK | <BLINK>characters</BLINK> | Blinking characters
FONT color | <FONT color="blue">characters</FONT> | Character color (note 3)
FONT size | <FONT size="+1">characters</FONT> | Size control (note 4)
I | <I>characters</I> | Italic font (note 5)
SMALL | <SMALL>characters</SMALL> | Small font (note 6)
STRIKE | <STRIKE>characters</STRIKE> | Struck out characters
TT | <TT>characters</TT> | Fixed pitch (note 7)

3.2 Text presentation element

The BR element has the core attributes. All the other elements have the common attributes.

Notes (the numbers are referenced in the table):
1. The title attribute of the element can be used to give the full length description of the abbreviation - when the mouse pointer hovers above the element, this title will pop up (this does not work with Internet Explorer 6.0 - it does with Netscape Navigator 7.1).
2. The cite attribute can refer to the source document of the quoted text.
3. The clear attribute is deprecated - use the 'float' and 'clear' style properties to control how text flows to the left or the right of an object.

Tag | Attributes | Source | Comment
ABBR | common attributes | <ABBR>text</ABBR> | Abbreviation (note 1)
ACRONYM | common attributes | <ACRONYM>text</ACRONYM> | Acronym
BLOCKQUOTE | common attributes, cite="uri" | <BLOCKQUOTE>text</BLOCKQUOTE> | Quotation block - the text is indented (note 2)
BR | core attributes, clear="left|right|all" | text<BR>text | Line break - BR is an empty element and takes no end tag (note 3)
CITE | common attributes | <CITE>text</CITE> | Citation
CODE | common attributes | <CODE>text</CODE> | Program code
DEL | common attributes, cite="uri", datetime="datetime" | <DEL>text</DEL> | Deleted text - the URI points to a document where the deletion reason is explained - the datetime attribute specifies the deletion date and time
DFN | common attributes | <DFN>text</DFN> | The text is a definition
EM | common attributes | <EM>text</EM> | The text is emphasized - rendering is browser dependent
INS | common attributes, cite="uri", datetime="datetime" | <INS>text</INS> | Inserted text - the URI points to an explanation document - the datetime attribute specifies the insertion date and time
KBD | common attributes | <KBD>text</KBD> | Data entered by the user from the keyboard
P | common attributes | xxx<P>text</P>yyy | Paragraph - the text is preceded and followed by a blank line
PRE | common attributes | <PRE>text xxxx</PRE> | Preformatted text - rendered as is, i.e. space and line feed characters are rendered exactly as in the source
Q | common attributes, cite="uri" | <Q>text</Q> | Short inline quotation (note 2)
SAMP | common attributes | <SAMP>text</SAMP> | Sample computer output
STRONG | common attributes | <STRONG>text</STRONG> | The text is in strong typeface
SUB | common attributes | text<SUB>sub</SUB> | Subscript
SUP | common attributes | text<SUP>sup</SUP> | Superscript
VAR | common attributes | <VAR>text</VAR> | The text is an instance of a variable or program argument

3.3 Horizontal rule

A horizontal rule can be drawn across the page using the HR element. This element has all the common attributes.

The deprecated size attribute of the element defined the rule height. You can now use the 'height' style property instead:

<HR style="height:20px; background:#FFDDDD">
This yields a rule 20 pixels high with a pale pink background.

3.4 Common attributes

The core attributes include the following: id, class, style, title

The common attributes include the above, plus the language specification attributes: lang, dir and the intrinsic event handling attributes


