Most of us are accustomed to using the classic building blocks of HTML and XHTML such as headings, paragraphs, lists, tables and forms, but I would venture to say that precious few of us really use the phrase elements—the semantic tags that make up what I call “the King’s (X)HTML.”
Let me briefly review for you (or introduce to you?) the various phrase elements, what they mean, and how they are used. Later I will show you how a few “forgotten” elements can be made to work on popular browsers, including Internet Explorer 4, 5 and 6.
A Brief Review (or Introduction)
The following comes straight from the W3C HTML 4
specification (I have added a few footnotes). Another element which
could be added to this list is Q, which is used to
indicate short quotations. More on Q
later.
Element names appear here in uppercase letters, but if you write XHTML, be sure to use lowercase (you probably know that already).
EM: Indicates emphasis.
STRONG: Indicates stronger emphasis.
CITE: Contains a citation or a reference to other sources.[1]
DFN: Indicates that this is the defining instance of the enclosed term.
CODE: Designates a fragment of computer code.
SAMP: Designates sample output from programs, scripts, etc.[2]
KBD: Indicates text to be entered by the user.[3]
VAR: Indicates an instance of a variable or program argument.
ABBR: Indicates an abbreviated form (e.g., WWW, HTTP, URI, Mass., etc.).
ACRONYM: Indicates an acronym (e.g., WAC, radar, etc.).[4]
[1] CITE
is often used to enclose titles of books, movies or songs.
A dictionary defines the word as “to mention something or
somebody as an example to support an argument or help explain
what is being said.”
[2] The meaning of
SAMP has changed since Tim
Berners-Lee defined it in 1993. At that time it was used
to denote a sequence of literal characters.
Some people have said that names of elements and attributes
in (X)HTML should be marked up as SAMP,
but I disagree with that based on the current definition. To
complicate matters, this is how the element is used on the very page
that gives the specification. So it’s either a matter of
“Do as I say, not as I do” or “Take your choice.”
[3] I use KBD
also to indicate the use of certain keys such as Ctrl
or Alt since this denotes a keyboard function, although
it does not exactly refer to text entered by the user.
Again, the usage is not clearly defined; it’s your call.
[4] MSN Learning & Research
Dictionary defines four
types of abbreviations: shortenings, contractions, initialisms
and acronyms. The first two types probably don’t need to be
marked up in most cases, since anyone who knows the language will
be familiar with them (examples include Dr. = doctor, St. = saint).
Initialisms (pronounced as separate letters) should be marked up as
ABBR, and acronyms (pronounced as words rather than
as a series of letters) should be marked up as ACRONYM.
A title attribute should be used with ABBR
or ACRONYM at the first appearance of the enclosed term
on a given page (or section of a page) to provide the expanded meaning
of the term (for example,
<abbr title="World Wide Web">WWW</abbr>).
Most browsers will show this information in a small pop-up tooltip
when the user hovers the mouse over the item.
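The first-appearance rule described in this footnote can be applied mechanically. The following is only a minimal JavaScript sketch, not part of the W3C specification or of any script used on this site. The function name markAbbreviations and the expansions table are hypothetical, and the sketch assumes the terms are plain words containing no regular-expression special characters. It wraps every listed term in abbr but adds the title attribute only the first time the term occurs.

```javascript
// Hypothetical helper (not from the original article): wraps each known
// term in <abbr>, adding a title attribute only at its first appearance.
function markAbbreviations(text, expansions) {
  var seen = Object.create(null);          // terms already given a title
  var terms = Object.keys(expansions);
  if (terms.length === 0) {
    return text;
  }
  // Assumption: terms are plain words, so they can be joined into a pattern.
  var pattern = new RegExp('\\b(' + terms.join('|') + ')\\b', 'g');
  return text.replace(pattern, function (term) {
    if (seen[term]) {
      return '<abbr>' + term + '</abbr>';
    }
    seen[term] = true;
    return '<abbr title="' + expansions[term] + '">' + term + '</abbr>';
  });
}

var marked = markAbbreviations(
  'The WWW runs on HTTP. Without HTTP there is no WWW.',
  { WWW: 'World Wide Web', HTTP: 'Hypertext Transfer Protocol' }
);
```

Whether later occurrences should still be wrapped in a bare abbr is a judgment call; the sketch keeps them marked up so they can be styled consistently.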
Fixing What Breaks in IE
A few standard elements of HTML are unsupported or poorly supported by Internet Explorer 4, 5 and 6. Remedies for this problem are presented here along with some styling for better browsers.
The q element
NOTE: This site no longer uses the innerHTML
JavaScript mentioned below because it makes the screen flicker during page
load, which can become quite annoying. To satisfy Internet Explorer (which
is used by 90% of our visitors), all Q elements on the site
have been replaced by curly quotation marks, and all ABBR
elements have been replaced by ACRONYM, effective with
Version 10.6 of this site.
The q element, used to mark up short quotations,
is recognized as an element by Internet Explorer and can be styled,
but Internet Explorer will not automatically enclose it in
quotation marks, as better browsers do. Some webmasters will style
it as italic text, but I have come up with a more effective solution:
JavaScript using the proprietary innerHTML property.
The script, part of which appears below, actually replaces
the <q> tags with curly quotation marks; even JAWS
for Windows, the popular screen reader, will read them as quotes.
Replacing the <q> tags is better than
including them with the quotation marks, according to the W3C spec:
Authors should not put quotation marks at the beginning and end of the content of a
Q element.
The sample script follows. Do not use it as is; it will not work! Much of it has been omitted to make it easier to follow (omissions are denoted by three dots ...). Please use the full version of the script instead.
function fixIE() {
var oldBodyText, reg, second... reg3, fourth... newBodyText
oldBodyText = document.body.innerHTML;
// IE6 won't supply quotation marks with Q,
// so replace the tags with quotation marks.
// Find Q tags with attributes
// (use class="nested" for nested quotes)
reg = /<Q([^>]+)>([^<]+)<\/Q>/g;
second = oldBodyText.replace(reg, '&lsquo;$2&rsquo;');
...
// Find normal Q elements with only text inside
reg3 = /<Q([^>]*)>([^<]+)<\/Q>/g;
fourth = third.replace(reg3, '&ldquo;$2<SPAN>&rdquo;</SPAN>');
...
// now replace original HTML with amended HTML
document.body.innerHTML = newBodyText;
}
window.onload = function(){fixIE()};
First, the script replaces any <q> tags
containing attributes with &lsquo; and
&rsquo;, the entity codes for single
curly quotation marks. I add the attribute class="nested"
to the <q> tag whenever I nest quotations, so
Internet Explorer will find it. This would otherwise be unnecessary,
since a child selector (q>q) can be used with
CSS (which is exactly
what I do for better browsers).
Next, the script replaces all remaining <q>
tags with &ldquo; and
&rdquo;, the entity codes for
normal (double) curly quotation marks. Each end quote is
wrapped in a <SPAN> tag (see styling below).
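The two passes just described can be tried outside the browser. What follows is a simplified sketch, not the full version of the script: it works on an HTML string rather than document.body.innerHTML, the function name replaceQuoteTags is hypothetical, and the named entities in the replacement strings are assumed. It also shows a q element that contains another tag, which these two patterns leave untouched.

```javascript
// Simplified sketch of the two replacement passes (not the site's full script).
// It takes an HTML string so it can run without a browser.
function replaceQuoteTags(html) {
  // Pass 1: Q tags that carry attributes (for example class="nested")
  // become single curly quotation marks.
  var withAttributes = /<q\s[^>]+>([^<]+)<\/q>/gi;
  var afterNested = html.replace(withAttributes, '&lsquo;$1&rsquo;');

  // Pass 2: remaining plain Q tags become double curly quotation marks,
  // with the end quote wrapped in a span so it can be styled.
  var plain = /<q>([^<]+)<\/q>/gi;
  return afterNested.replace(plain, '&ldquo;$1<span>&rdquo;</span>');
}

var nestedResult = replaceQuoteTags(
  '<p><q>He said <q class="nested">no</q> twice</q></p>'
);

// A Q element containing another tag matches neither pattern,
// because [^<]+ stops at the first "<" inside the quotation.
var withInnerTag = '<q>a <em>b</em> c</q>';
var innerTagResult = replaceQuoteTags(withInnerTag);
```

Running the nested pass first matters: once the inner quotation has been reduced to plain text and entities, the outer q contains no tags and the second pattern can match it.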
If this were the whole script, and any other tag occurred within a
q element, the script would fail, and the quotation marks
would not appear. To cover this, I have added several extra steps in
the full version of the script, which runs on The Oo Kingdom. Also
included in the script is an extra step (based on a script by
Jacques
Distler) which makes <abbr> and
<acronym> work on Internet Explorer. A
second script is provided which hides the main script from other
browsers that can choke on it.
I tried to provide special styling for Internet Explorer in case
the script fails, but this messes up IE 5 terribly. I do
provide styling for better browsers such as Netscape 7, Opera 7
and Mozilla Firebird, which normally render all q
elements with straight quotes, even if they are nested:
/* use curly quotes around Q in good browsers */
q:before {content: "\201C";}
q:after {content: "\201D";}
q>q:before {content: "\2018";}
q>q:after {content: "\2019";}
/* nix end quote if it flows into the next paragraph */
.continue q:after {content: "";}
.continue q>q:after {content: "\2019";}
.continue span {display: none;} /* for IE */
The quotation marks are styled for better browsers as curly
quotes; single curlies are provided for nested quotations
(hence the q>q).
Finally, all browsers are forced to observe an often forgotten
grammatical rule: if a quote continues into the next paragraph,
the end quote should be omitted. The .continue q>q:after
line ensures that nested quotes do not also receive this treatment.
The last line is for Internet Explorer; the script wraps each
end quote in a <span> tag for styling.
By the way, the Lynx
text browser handles the q element perfectly,
right down to nested q elements and single curly quotes,
with no help from style sheets at all.
It does not, however, observe the grammatical rule stated above;
no browser can know whether or not a quotation is continued into
the next paragraph without being told so. I use
class="continue" to denote this. Internet Explorer will
be fouled up if I add the class attribute to the
<q> tag, so I wrap the quote in a
<span> tag or else add the class
attribute to the p element containing the quotation.
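As an illustration of the second option, here is a small JavaScript sketch that generates the markup for a quotation running over several paragraphs. It is an assumption about how such markup might be produced, not something this site does; the function name markupContinuedQuote is hypothetical. Every paragraph except the last receives class="continue" on the p element, so the style sheet can suppress its end quote.

```javascript
// Hypothetical helper: builds markup for a quotation that spans
// several paragraphs. All paragraphs but the last get class="continue"
// on the p element, never on the q tag itself.
function markupContinuedQuote(paragraphs) {
  return paragraphs.map(function (text, index) {
    var isLast = index === paragraphs.length - 1;
    var open = isLast ? '<p>' : '<p class="continue">';
    return open + '<q>' + text + '</q></p>';
  }).join('\n');
}

var quoteMarkup = markupContinuedQuote(['First part.', 'Second part.']);
```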
I know this is a hair-splitting issue, but it does make for
clearer reading in the rare instances where this occurs. The W3C stipulates that
q “is intended for short quotations (inline content)
that don’t require paragraph breaks.” But the alternative
would be to use blockquote, which is not always
appropriate in such instances.
Making <abbr>
and <acronym> work right
Good browsers, such as Netscape 7, Opera 7 and Mozilla Firebird,
support the following CSS
snippet. It finds any abbr or acronym
elements containing title attributes, and gives them
a dotted underline and a question mark cursor on hover.
abbr[title], acronym[title] {
border-bottom: 1px dotted; cursor: help;}
Internet Explorer supports acronym but not
abbr. It also doesn’t support the CSS2 attribute
selectors used in the style sheet. So I use innerHTML
JavaScript (included with the full version cited above)
to wrap the content of the abbr or acronym
with <span class="abbr"> which can then be styled
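with the CSS below. IE will miss the styling if this selector is grouped with the
attribute selectors above, most likely because a browser ignores an entire rule when
it cannot parse one selector in the group, so keep it as a separate rule!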
span.abbr {border-bottom: 1px dotted; cursor: help;}
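The wrapping step might look roughly like the sketch below. This is not the script included in the full version, nor Jacques Distler's original; it is a simplified illustration that works on an HTML string, and the function name wrapAbbreviations is hypothetical. As an assumption that mirrors the [title] selectors used for better browsers, it only wraps elements that carry a title attribute, and only when their content is plain text.

```javascript
// Simplified sketch (not the site's script): wraps the text content of
// abbr and acronym elements in <span class="abbr"> so that a browser
// without attribute-selector support can still style it.
function wrapAbbreviations(html) {
  // \1 makes the closing tag match the opening tag name.
  var pattern = /<(abbr|acronym)\b([^>]*)>([^<]+)<\/\1>/gi;
  return html.replace(pattern, function (match, tag, attrs, content) {
    // Assumption: leave elements without a title attribute untouched.
    if (!/\btitle\s*=/i.test(attrs)) {
      return match;
    }
    return '<' + tag + attrs + '><span class="abbr">' + content +
      '</span></' + tag + '>';
  });
}

var wrapped = wrapAbbreviations(
  '<abbr title="World Wide Web">WWW</abbr> and <acronym>radar</acronym>'
);
```

In a page, the returned string would be assigned back to document.body.innerHTML, in the same way as the q replacement.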
It’s also a good idea to provide aural styling for these elements, for the benefit of screen readers:
abbr {speak: spell-out;}
acronym {speak: normal;}
Conclusion
Using phrase elements in (X)HTML
will ensure that the proper meaning is provided in each case.
This aids not only users of visual browsers but alternative
Internet devices as well. In addition, using the q
element keeps the markup less cluttered than using the longer
entity codes for curly quotation marks, which many of us do. In
short, it pays to use “the King’s (X)HTML”!