Most of us are accustomed to using the classic building blocks of HTML and XHTML such as headings, paragraphs, lists, tables and forms, but I would venture to say that precious few of us really use the phrase elements—the semantic tags that make up what I call “the King’s (X)HTML.”

Let me briefly review for you (or introduce to you?) the various phrase elements, what they mean, and how they are used. Later I will show you how a few “forgotten” elements can be made to work on popular browsers, including Internet Explorer 4, 5 and 6.

A Brief Review (or Introduction)

The following comes straight from the W3C HTML 4 specification (I have added a few footnotes). Another element which could be added to this list is Q, which is used to indicate short quotations. More on Q later.

Element names appear here in uppercase letters, but if you write XHTML, be sure to use lowercase (you probably know that already).

EM
Indicates emphasis.
STRONG
Indicates stronger emphasis.
CITE
Contains a citation or a reference to other sources.[1]
DFN
Indicates that this is the defining instance of the enclosed term.
CODE
Designates a fragment of computer code.
SAMP
Designates sample output from programs, scripts, etc.[2]
KBD
Indicates text to be entered by the user.[3]
VAR
Indicates an instance of a variable or program argument.
ABBR
Indicates an abbreviated form (e.g., WWW, HTTP, URI, Mass., etc.).
ACRONYM
Indicates an acronym (e.g., WAC, radar, etc.).[4]

[1] CITE is often used to enclose titles of books, movies or songs. A dictionary defines the word as “to mention something or somebody as an example to support an argument or help explain what is being said.”

[2] The meaning of SAMP has changed since Tim Berners-Lee defined it in 1993. At that time it was used to denote a sequence of literal characters.

Some people have said that names of elements and attributes in (X)HTML should be marked up as SAMP, but I disagree with that based on the current definition. To complicate matters, this is how the element is used on the very page that gives the specification. So it’s either a matter of “Do as I say, not as I do” or “Take your choice.”

[3] I use KBD also to indicate the use of certain keys such as Ctrl or Alt since this denotes a keyboard function, although it does not exactly refer to text entered by the user. Again, the usage is not clearly defined; it’s your call.

[4] MSN Learning & Research Dictionary defines four types of abbreviations: shortenings, contractions, initialisms and acronyms. The first two types probably don’t need to be marked up in most cases, since anyone who knows the language will be familiar with them (examples include Dr. = doctor, St. = saint). Initialisms (pronounced as separate letters) should be marked up as ABBR, and acronyms (pronounced as words rather than as a series of letters) should be marked up as ACRONYM.

A title attribute should be used with ABBR or ACRONYM at the first appearance of the enclosed term on a given page (or section of a page) to provide the expanded meaning of the term (for example, <abbr title="World Wide Web">WWW</abbr>). Most browsers will show this information in a small pop-up tooltip when the user hovers the mouse pointer over the item.

Fixing What Breaks in IE

A few standard elements of HTML are unsupported or poorly supported by Internet Explorer 4, 5 and 6. Remedies for this problem are presented here along with some styling for better browsers.

The q element

NOTE: This site no longer uses the innerHTML JavaScript mentioned below because it makes the screen flicker during page load, which can become quite annoying. To satisfy Internet Explorer (which is used by 90% of our visitors), all Q elements on the site have been replaced by curly quotation marks, and all ABBR elements have been replaced by ACRONYM, effective with Version 10.6 of this site.

The q element, used to mark up short quotations, is recognized as an element by Internet Explorer and can be styled, but Internet Explorer will not automatically enclose it in quotation marks, as better browsers do. Some webmasters will style it as italic text, but I have come up with a more effective solution: JavaScript using the proprietary innerHTML property. The script, part of which appears below, actually replaces the <q> tags with curly quotation marks; even JAWS for Windows, the popular screen reader, will read them as quotes. Replacing the <q> tags is better than including them with the quotation marks, according to the W3C spec:

Authors should not put quotation marks at the beginning and end of the content of a Q element.

The sample script follows. Do not use this; it will not work! Much of it has been omitted to make it easier to follow (Omissions are denoted by three dots ...). Please use the full version of the script instead.

function fixIE() {
var oldBodyText, reg, second... reg3, fourth... newBodyText

oldBodyText = document.body.innerHTML;
// IE6 won't supply quotation marks with Q,
// so replace the tags with quotation marks.

// Find Q tags with attributes
// (use class="nested" for nested quotes)
reg = /<Q([^>]+)>([^<]+)<\/Q>/g;
second = oldBodyText.replace(reg, '&#8216;$2&#8217;');

...

// Find normal Q elements with only text inside
reg3 = /<Q([^>]*)>([^<]+)<\/Q>/g;
fourth = third.replace(reg3, '&#8220;$2<SPAN>&#8221;</SPAN>');

...

// now replace original HTML with amended HTML
document.body.innerHTML = newBodyText;
}

window.onload = function(){fixIE()};

First, the script replaces any <q> tags that contain attributes with &#8216; and &#8217;, the entity codes for single curly quotation marks. I add the attribute class="nested" to the <q> tag whenever I nest quotations, so Internet Explorer will find it. This would otherwise be unnecessary since a child selector (q>q) can be used with CSS (which is exactly what I do for better browsers).

Next, the script replaces all remaining <q> tags with &#8220; and &#8221;—the entity codes for normal (double) curly quotation marks. Each end quote is wrapped in a <SPAN> tag (see styling below).

If this were the whole script, and any other tag occurred within a q element, the script would fail, because the patterns above match only plain text between the <q> tags, and the quotation marks would not appear. To cover this, I have added several extra steps in the full version of the script, which runs on The Oo Kingdom. Also included in the script is an extra step (based on a script by Jacques Distler) which makes <abbr> and <acronym> work on Internet Explorer. A second script is provided which hides the main script from other browsers that can choke on it.
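To make the nested-tag problem concrete, here is a minimal sketch of one alternative approach. It is not the full script referred to above, and the function name replaceQTags is hypothetical. Instead of matching a whole q element at once, it replaces each opening and closing <q> tag separately and keeps a depth counter, so other inline tags inside a quotation are left untouched. It assumes the <q> tags are balanced, and it picks single or double marks from the nesting depth rather than from class="nested".

```javascript
// Hypothetical helper, not the author's full script.
// Replaces <q> tags with curly-quote entities by tracking nesting depth,
// so tags such as <em> inside a quotation do not break the match.
function replaceQTags(html) {
  var depth = 0; // number of q elements currently open
  return html.replace(/<(\/?)q\b[^>]*>/gi, function (match, slash) {
    if (slash) {
      // Closing tag: step back out one level first.
      depth = Math.max(depth - 1, 0);
      // Outermost end quote is wrapped in a SPAN so it can be styled.
      return depth === 0 ? '<SPAN>&#8221;</SPAN>' : '&#8217;';
    }
    // Opening tag: double mark at the top level, single mark when nested.
    var mark = depth === 0 ? '&#8220;' : '&#8216;';
    depth += 1;
    return mark;
  });
}

// Example: the <EM> inside the quotation is preserved.
var sample = replaceQTags('<P>He said <Q>go <EM>now</EM></Q>.</P>');
// sample is '<P>He said &#8220;go <EM>now</EM><SPAN>&#8221;</SPAN>.</P>'
```

The result would still be assigned to document.body.innerHTML, as in the sample above, so the screen-flicker caveat mentioned in the note applies to this sketch as well.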

I tried to provide special styling for Internet Explorer in case the script fails, but this messes up IE 5 terribly. I do provide styling for better browsers such as Netscape 7, Opera 7 and Mozilla Firebird, which normally render all q elements with straight quotes, even if they are nested:

/* use curly quotes around Q in good browsers */
q:before {content: "\201C";}
q:after {content: "\201D";}
q>q:before {content: "\2018";}
q>q:after {content: "\2019";}

/* nix end quote if it flows into the next paragraph */
.continue q:after {content: "";}
.continue q>q:after {content: "\2019";}
.continue span {display: none;} /* for IE */

The quotation marks are styled for better browsers as curly quotes; single curlies are provided for nested quotations (hence the q>q).

Finally, all browsers are forced to observe an often forgotten grammatical rule: if a quote continues into the next paragraph, the end quote should be omitted. The .continue q>q:after line ensures that nested quotes do not also receive this treatment. The last line is for Internet Explorer; the script wraps each end quote in a <span> tag for styling.

By the way, the Lynx text browser handles the q element perfectly, right down to nested q elements and single curly quotes, with no help from style sheets at all.

It does not, however, observe the grammatical rule stated above; no browser can know whether or not a quotation is continued into the next paragraph without being told so. I use class="continue" to denote this. Internet Explorer will be fouled up if I add the class attribute to the <q> tag, so I wrap the quote in a <span> tag or else add the class attribute to the p element containing the quotation.

I know this is a hair-splitting issue, but it does make for clearer reading in the rare instances where this occurs. The W3C stipulates that q “is intended for short quotations (inline content) that don’t require paragraph breaks.” But the alternative would be to use blockquote, which is not always appropriate in such instances.

Making <abbr> and <acronym> work right

Good browsers, such as Netscape 7, Opera 7 and Mozilla Firebird, support the following CSS snippet. It finds any abbr or acronym elements containing title attributes, and gives them a dotted underline and a question mark cursor on hover.

abbr[title], acronym[title] {
    border-bottom: 1px dotted; cursor: help;}

Internet Explorer supports acronym but not abbr. It also doesn’t support the CSS2 attribute selectors used in the style sheet. So I use innerHTML JavaScript (included with the full version cited above) to wrap the content of the abbr or acronym with <span class="abbr"> which can then be styled with the CSS below. IE will miss the styling if this rule is grouped with the attribute-selector rule above, most likely because it discards an entire rule when it meets a selector it does not understand, so keep it as a separate rule and don’t try to combine the two!

span.abbr {border-bottom: 1px dotted; cursor: help;}
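The wrapping step described above can be sketched roughly as follows. This is an illustration only, not the Distler-based step from the full script; the function name wrapAbbreviations is hypothetical. It assumes attribute values are double-quoted and that the element contains only text. It skips elements without a title, mirroring the [title] selector used for better browsers, and it copies the title onto the span on the assumption that IE will not show a tooltip for an element it does not recognize.

```javascript
// Hypothetical helper, not the author's script.
// Wraps the text of titled <abbr>/<acronym> elements in <span class="abbr">
// so that a plain class selector can style them.
function wrapAbbreviations(html) {
  // \1 makes the closing tag match the opening tag name.
  var re = /<(abbr|acronym)\b([^>]*)>([^<]+)<\/\1>/gi;
  return html.replace(re, function (match, tag, attrs, text) {
    // Assumption: the title value is written with double quotes.
    var title = /\btitle\s*=\s*"([^"]*)"/i.exec(attrs);
    if (!title) {
      return match; // no title: leave the element as it is
    }
    return '<' + tag + attrs + '>' +
      '<span class="abbr" title="' + title[1] + '">' + text + '</span>' +
      '</' + tag + '>';
  });
}

var wrapped = wrapAbbreviations('<abbr title="World Wide Web">WWW</abbr>');
// wrapped is '<abbr title="World Wide Web"><span class="abbr" title="World Wide Web">WWW</span></abbr>'
```

Because the markup IE returns from innerHTML may differ from what was written (uppercase tag names, for instance, as in the sample script earlier), the patterns are case-insensitive; unquoted attribute values would need an extra pattern.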

It’s also a good idea to provide aural styling for these elements, for the benefit of screen readers:

abbr {speak: spell-out;}
acronym {speak: normal;}

Conclusion

Using phrase elements in (X)HTML will ensure that the proper meaning is conveyed in each case. This aids not only users of visual browsers but also users of alternative Internet devices. In addition, using the q element keeps the markup less cluttered than using the longer entity codes for curly quotation marks, which many of us do. In short, it pays to use “the King’s (X)HTML”!