In May of 2001, I converted this site from HTML 4.01 to XHTML 1.0. The pages will look identical to most viewers, but they are now blessed with better future compatibility as well as continued backwards compatibility (with a few tweaks; see below). In other words, older browsers will still render the pages correctly, and browsers not yet developed should render them correctly too.
XHTML (Extensible Hypertext Markup Language) is the next step in the evolution of the Internet. It conforms to the new standard of markup languages (XML) but is designed to work as well (at least for now) in existing user agents (browsers).
The biggest differences I have seen (in practice) between writing HTML and XHTML are these:
- Elements and attributes may be written in uppercase letters in HTML but not in XHTML, because XHTML is case-sensitive. In other words, <UL><LI> is acceptable markup in HTML, but in XHTML it must be written as <ul><li>.
- Unlike the examples on my HTML Elements page, where end tags are not necessary for “empty” elements (such as img, br and hr), in XHTML all elements require end tags. To avoid messing up older browsers with those end tags (<br></br> may not render properly in some of them), the ending slash can be combined with the start tag like this: <br />. The slash should be separated from the br by a space, since <br/> will still break some older browsers. For more detailed information on making XHTML backwards compatible, see http://www.w3.org/TR/xhtml1/#guidelines.
- All attribute values have to appear in quotes now, not just those containing special characters.
- In converting this site from HTML 4.01 to XHTML 1.0, I found that the document length increased by an average of about 1%, not enough to affect load time.
- The W3C’s Validator is better for XHTML. The HTML Validator will occasionally let an error squeak by unchecked. This happened on our OK… So What’s an Ummamum? page, where a wrongly defined character (& instead of &amp;), left uncorrected for months, was suddenly caught when I converted the site to XHTML! The W3C is constantly upgrading their service, so this may no longer be true.
- The doctype declaration looks different than it does for HTML:
  - The first line is a prologue (the XML declaration) that defines the version number and character encoding for XML. It is not yet mandatory but is recommended by the W3C, and it will be used in years to come. (As I am writing this, we do not use the XML prologue on this site, because Internet Explorer 6 thinks it is the doctype and renders the page in quirks mode instead of standards-compliant mode. In addition, Internet Explorer 4 for Macintosh may render a blank page when it encounters the XML prologue. Some versions are said to render the source code as text.)
  - The second line is the public doctype declaration, which tells the agent (browser) how it should parse the document.
  - The third line is the system doctype, which points to a document type definition online.
  - The xmlns attribute on the last line is new; it defines the namespace for this version of XHTML.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
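The syntax rules in the list above (matching case, closed empty elements, quoted attribute values) are all consequences of XHTML having to be well-formed XML. As a rough illustration, here is a small Python sketch that feeds a few made-up fragments to the standard library's XML parser. The fragments and the helper name `is_well_formed` are only examples, not markup from this site. Note that the parser checks well-formedness only, not validity against the XHTML DTD, so consistently uppercase tags still pass here even though a validator would reject them.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (a requirement for XHTML)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample fragments, one per rule discussed in the list.
samples = {
    "lowercase, closed, quoted": '<ul><li><img src="a.png" alt="" /><br /></li></ul>',
    "mismatched case": "<UL><li>item</li></ul>",
    "unclosed empty element": "<p>line one<br>line two</p>",
    "unquoted attribute value": "<p class=note>text</p>",
    # Well-formed XML, but still not valid XHTML: only a DTD-aware
    # validator would catch the uppercase element names.
    "consistent uppercase": "<UL><LI>item</LI></UL>",
}

results = {label: is_well_formed(markup) for label, markup in samples.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

A check like this catches the mechanical mistakes quickly; the W3C Validator is still needed to confirm that a page is valid XHTML 1.0.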
There are other differences too. A few old elements and attributes
have been deprecated in favor of new ones. One example is
the name attribute for the a element;
you have to use id now instead of name.
(Some browsers don’t recognize the id attribute,
so it’s a good idea to use both.) But much of it is the same
as before.
Using the correct MIME type
Every computer file has a MIME type,
whether it is a text file, an image file, an audio file, a Web
page document or whatever. MIME types originated
in e-mail in the early 1990s and were later adopted by the Web. Normal HTML takes on the
MIME type text/html.
According to the W3C specification,
XHTML
1.1 should not be delivered as the MIME
type text/html but as application/xhtml+xml.
XHTML 1.0 may be served as text/html
to allow for backwards compatibility during this transitional period.
This site uses mod_rewrite rules in the
.htaccess file on the server to send
XHTML 1.0 Strict as application/xhtml+xml
to the best browsers (such as Netscape 7 and Mozilla) and
text/html to everyone else.
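The decision behind that kind of rule can be sketched in a few lines of Python. This is not the mod_rewrite rule this site uses; it only illustrates one common approach, which is to look at the Accept header the browser sends and serve application/xhtml+xml only when the browser lists it explicitly. The function names and the sample headers are illustrative assumptions.

```python
def accepts_xhtml(accept_header: str) -> bool:
    """Return True if the Accept header explicitly lists
    application/xhtml+xml with a non-zero q value.

    Wildcards such as */* are deliberately ignored: a browser can send
    */* and still be unable to display application/xhtml+xml.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        return q > 0
    return False


def choose_mime_type(accept_header: str) -> str:
    """Pick the MIME type to send for an XHTML 1.0 page."""
    if accepts_xhtml(accept_header):
        return "application/xhtml+xml"
    return "text/html"


# Illustrative Accept headers (assumed for the example).
xhtml_capable = "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"
wildcard_only = "image/gif, image/jpeg, */*"

print(choose_mime_type(xhtml_capable))
print(choose_mime_type(wildcard_only))
```

Falling back to text/html whenever the header is missing or unclear keeps the page viewable in browsers that would otherwise offer to download it.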
Another way to send XHTML under the preferred
MIME type is to change the file extension
to .xhtml. Only Netscape
6/7, Mozilla, and Opera 6/7 will be able to open the files!
Internet Explorer 6 will ask permission to download the file
instead of displaying the page, because it doesn’t
know what application/xhtml+xml is.
If my Web host supported
content negotiation,
I could offer two separate pages with the same file name: one in
HTML 4.01
Strict with an HTML file extension, and the other in
XHTML 1.1 with a .xhtml extension. The links to
the page would not specify the file extension, so browsers would
get whichever page was best. I tried it on the Web space my
ISP gives
me (oddly, they do content negotiation) and it worked
perfectly! So why not have my domain point there? They only
give me 10 megabytes of Web space, which is not enough for this
site.
External links on XHTML
- XHTML 1.0: The Extensible HyperText Markup Language at the W3C website
- XHTML Overview (December 6, 2000) at Webmonkey
- Better Living Through XHTML by Jeffrey Zeldman (February 15, 2002) at A List Apart
- The Road to XHTML 2.0: MIME Types by Mark Pilgrim (March 19, 2003)