HTML-5, not XHTML, is the future of the web.
To X, or Not to X…? Just Say No to XHTML.
What a shock: All this time I thought the web was migrating toward XHTML. I updated several of my sites to XHTML 1.0 Strict, and was real proud of it—yessiree, I done lernt me all that new-fangled XHTML stuff…
Then I woke up and read this and this expert material, and found out that:
- XHTML is not the future of the web.
- Major Consortium members are moving toward HTML5, not XHTML2.
- XHTML is served by nearly all public web servers as text/html, not application/xhtml+xml. Thus, the browser ignores the xml wrapper and treats the contents just like regular HTML.
- Gurus were writing about the problems of serving XHTML as text/html more than eight years ago, and nothing has changed, except that millions of developers have unknowingly embraced it.
- Few browsers will render XHTML served as xml, even though XHTML was defined 10 years ago.
- XHTML 1.0 Transitional, so frequently used now, is no better than HTML4 Transitional, which in some browsers is no better than no doctype at all.
- XHTML 1.1 is virtually useless: it’s supposed to be served as XML, and most browsers won’t eat it, so they silently drop back to HTML rendering, or just spit out a DOM tree.
- XHTML offers no advantages or capabilities over proper HTML4 Strict.
Hmmm… that makes sense. I always wondered just what I was missing. Surely there’s gotta be more to XHTML than just that extra space and slash… But as it turns out, nothing at all.
“Many people prefer to use XHTML because of the advantages XML brings for editing or processing of documents. However, there is still a lack of support for XML files in mainstream browsers, so many XHTML 1.0 files are actually served using the text/html MIME type. In this case, the user agent will treat the file as HTML.” w3.org
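To make that concrete, here is a rough Python sketch of the decision a user agent makes. It is only an illustration under my own assumptions, not how any real browser is written: the names parser_for, TagCollector, and parse_as are made up for this example, and the standard-library parsers stand in for a browser's forgiving HTML parser and strict XML parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


def parser_for(content_type: str) -> str:
    """Pick a parser from the Content-Type header value (simplified assumption)."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in ("application/xhtml+xml", "application/xml", "text/xml"):
        return "xml"
    # text/html and, in this simplified sketch, anything else
    return "html"


class TagCollector(HTMLParser):
    """Forgiving HTML parse: records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_as(content_type: str, markup: str) -> list:
    """Parse markup the way the declared MIME type dictates."""
    if parser_for(content_type) == "xml":
        # The XML parser raises ET.ParseError on markup that is not well-formed.
        root = ET.fromstring(markup)
        return [element.tag for element in root.iter()]
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


sloppy = "<p>Hello<br>world</p>"  # an unclosed <br> is fine in HTML, fatal in XML

print(parse_as("text/html", sloppy))
try:
    parse_as("application/xhtml+xml", sloppy)
except ET.ParseError as error:
    print("XML parser refused the page:", error)
```

The same bytes get two very different treatments, and the only thing that changed is the MIME type. That is why an XHTML doctype on a page served as text/html buys you nothing.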
About the XML Prolog Problem
XHTML documents are supposed to start with this on the first line:
<?xml version="1.0" encoding="utf-8"?>
Here’s the low-down on the xml prolog (derived from w3.org):
- XHTML may be served as HTML or XML
- To serve as HTML, remove the xml prolog and add <meta http-equiv="Content-type" content="text/html;charset=UTF-8" /> to the head.
- To serve as XML, add the xml prolog <?xml version="1.0" encoding="utf-8"?> and use the MIME type application/xhtml+xml. This is still poorly supported by User Agents. Only do this when the document is actually XML, and can be served as XML.
- Regardless of what you do, the web server will probably serve the document in text/html, via the server’s content header, which takes precedence in the browser.
- Using the xml prolog messes up several browsers, particularly IE6. Here’s a browser list from webstandards.org.
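If you really wanted to serve the same file both ways, the rules above could be sketched roughly like this in Python. Treat it as a sketch under assumptions: strip_xml_prolog, choose_content_type, and prepare_response are hypothetical helper names, the Accept header check ignores q-values, and the charset is declared in the Content-Type header here rather than by adding the meta element.

```python
import re

# Matches an XML declaration at the very start of the document.
# The \s after "xml" keeps it from matching <?xml-stylesheet ...?>.
XML_PROLOG = re.compile(r"^\s*<\?xml\s[^>]*\?>\s*")


def strip_xml_prolog(document: str) -> str:
    """Remove a leading <?xml ...?> declaration, if there is one."""
    return XML_PROLOG.sub("", document, count=1)


def choose_content_type(accept_header: str) -> str:
    """Serve as XML only when the client explicitly lists application/xhtml+xml.

    Simplification: q-values are ignored, so a type listed with q=0 would
    still count as accepted.
    """
    accepted = [
        part.split(";", 1)[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if "application/xhtml+xml" in accepted:
        return "application/xhtml+xml"
    return "text/html; charset=UTF-8"


def prepare_response(document: str, accept_header: str) -> tuple:
    """Return (Content-Type value, body) for one request."""
    content_type = choose_content_type(accept_header)
    if content_type.startswith("text/html"):
        # Served as HTML: the prolog is useless and upsets some browsers.
        document = strip_xml_prolog(document)
    return content_type, document


page = '<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE html>\n<html></html>'
print(prepare_response(page, "text/html, */*"))
print(prepare_response(page, "text/html,application/xhtml+xml,application/xml;q=0.9"))
```

A client that never mentions application/xhtml+xml gets plain text/html with the prolog removed, and everyone else gets the untouched XML version. It works, but it is a lot of machinery for no visible benefit.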
My web sites started like this, which (hopefully) made browsers treat each page as HTML with ISO-8859-1 encoding:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Language" content="en-us" />
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
That worked fine, but now that I know it’s unnecessary, I’ve converted everything back to HTML4 Strict, for the reasons mentioned above. This is the one to use until HTML5 is both finalized and then well-supported:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta name="language" content="en">
<meta http-equiv="Content-Language" content="en">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Yes, the three language directives are redundant, but using them keeps all servers and browsers happy.
- Forget the XML prolog. It’s optional, and problematic.
- Forget about putting HTML in an XML wrapper for web pages. XHTML adds absolutely nothing to well-formed HTML.
- Using XHTML 1.0 Strict for a couple of years taught me to abandon all deprecated tags and attributes, moving all formatting to CSS. XHTML pushes one to produce pages/files that are much cleaner and much easier to maintain. But since it is served as HTML, there is actually no advantage to using it for web pages, and HTML 4.01 Strict (and HTML-5) has the same capacity to be well-formed, clean, and fully CSS-controlled.
- Putting <!DOCTYPE html> on the first line of the file makes all modern browsers render in standards mode (what I had been calling Strict mode). And that’s all you need to worry about. It is also the future of the web.
- Having to learn about doctypes and charsets and encodings is a royal pain… Why can’t they all get on the same page and adopt the final draft of something and then make browsers that fully support it? It’s a conspiracy to drive web developers crazy!
- Has the web now become a religion, where traditions without verity or merit must be followed as dogma? Yes, it has—and I refuse to join.