
The Road to XHTML 2.0: MIME Types
XHTML's Dirty Little Secret
Let's pretend that you've migrated to XHTML -- probably XHTML 1.0
Transitional, unless you're one of those weird geek alpha designers who
insist on doing everything with Strict DOCTYPEs. It wasn't that hard,
right? Lowercase all your tags; add some end tags to match your
<p> and <li> tags; add some slashes
to <br /> and <img />; update your
DOCTYPE; get on with your life!
Let's also pretend, for the sake of argument, that you're validating your spiffy new XHTML markup on a regular basis. You might even have one of those sporty "valid XHTML" badges lurking at the bottom of pages. Good for you.
Now here's a dirty little secret: browsers aren't actually treating
your XHTML as XML. Your validated, correctly DOCTYPE'd, completely
standards-compliant XHTML markup is being treated as if it were still HTML
with a few weird slashes in places they don't belong (like <br />
and <img />).
Why? The answer is MIME types.
MIME types are as old as the Web; in fact, they're older. Every page
you browse, every image you download, every stylesheet and JavaScript and
PDF and silly little Flash movie you view through your browser, all have a
MIME type associated with them. For HTML pages, the MIME type is
text/html. For XHTML, the MIME type is supposed to be
application/xhtml+xml.
(Tip: If you use the advanced page of the W3C validator with the "verbose output" option checked, it will validate your page and show you what MIME type your server is sending for that page.)
The current MIME type situation is a bit of a mess. According to the W3C's Note on XHTML Media Types:
- HTML 4 should be served as text/html. This is what everybody does, so no problem there.
- "HTML compatible" XHTML (as defined in appendix C of the XHTML 1.0 specification) may be served as text/html, but it should be served as application/xhtml+xml. This is probably the sort of XHTML you're writing now, so you could go either way.
- XHTML 1.1 should not be served as text/html.
- Although the spec is not finalized yet, all indications are that XHTML 2.0 must not be served as text/html.
So the first step on the road to XHTML 2.0 is conquering the XHTML MIME
type, application/xhtml+xml.
A Messy Transition
You can start using the application/xhtml+xml MIME type
immediately for your existing XHTML pages, but there are a few serious
caveats you need to consider first:
- All of your pages must be well-formed XML. Technically, they don't need to be valid XHTML (you could have a <div> element inside a <span> element and be well-formed but invalid). But all your end tags must match all your start tags, no overlaps, none missing.
  When I say must, I mean must. Mozilla and its derivatives are the only major browsers that can handle the XHTML MIME type (more on that in a minute), and they are ultra-strict about it. If a single end tag is missing, Mozilla users won't see your page at all; they'll see an XML debugging message instead.
- Most current browsers don't handle the application/xhtml+xml MIME type correctly, so you'll need to make provisions for serving up your XHTML the old-fashioned way (as text/html) to these browsers. (The list of non-XHTML-aware browsers includes Internet Explorer 6 for Windows, so it's not as if you can skip this step.) If your pages are dynamically generated, you can alter the Content-type programmatically. If you're serving up static files, you'll need to resort to mod_rewrite or a similar solution. More on this in a minute, too.
- Cascading stylesheets are parsed slightly differently in the XML world. When attached to HTML pages, CSS selectors are case-insensitive. But when attached to XML pages (including XHTML pages served with the proper XHTML MIME type), CSS selectors are case-sensitive. This shouldn't come as too much of a surprise; everything in XML is case-sensitive. Keep all your CSS selectors lowercase and you'll be okay.
- Also on the subject of CSS, the <body> element is somewhat magical in HTML, but not in XML. The technical background is not worth delving into; the upshot is that if you define CSS styles on body, you should define them on html as well. For example, if you define a background color on body, it will apply to the entire page in HTML, but it may not in XML. You'll need to define the background on html as well.
- Your JavaScript may need some tweaking for case-sensitivity as well. Whereas the HTML DOM is case-insensitive (and tag names are returned from functions like getElementsByTagName() in uppercase), the XML DOM is case-sensitive and tag names are returned in lowercase. To quote the W3C on XHTML and the HTML DOM:
  Developers need to take two things into account when writing code that works on both HTML and XHTML documents. When comparing element or attribute names to strings, the string compare needs to be case insensitive, or the element or attribute name needs to be converted into lowercase before comparing against a lowercase string. Second, when calling methods that are case insensitive when used on a HTML document (such as getElementsByTagName() and namedItem()), the string that is passed in should be lowercase.
- Also on the JavaScript front, methods like document.write do not work; you will need to use document.createElementNS and friends instead. For example, if your XHTML-as-HTML document currently uses this script to insert a linked stylesheet:
  if (document.getElementById) {
    document.write("<link rel=\"stylesheet\" type=\"text/css\" href=\"/css/js.css\" media=\"screen\" />");
  }
  Your XHTML-as-XML document would need to use something like this instead (thanks to Experts Exchange for this code):
  if (document.getElementById) {
    var l = document.createElementNS("http://www.w3.org/1999/xhtml", "link");
    l.setAttribute("rel", "stylesheet");
    l.setAttribute("type", "text/css");
    l.setAttribute("href", "/css/js.css");
    l.setAttribute("media", "screen");
    document.getElementsByTagName("head")[0].appendChild(l);
  }
- Still on the JavaScript front, collections like document.images, document.applets, document.links, document.forms, and document.anchors do not exist when serving XHTML as XML. You'll need to use the more generic document.getElementsByTagName() method and weed out the elements you're actually interested in. Mozilla bug 111514 has a long discussion on this issue.
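If you'd rather catch well-formedness problems before Mozilla users do, any XML parser will tell you about them. Here is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser; the function name well_formedness_error and both sample strings are hypothetical, made up for illustration. It checks well-formedness only, not validity, and because this parser doesn't read the XHTML DTD, named entities like &nbsp; may be flagged as errors too.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup parses as XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Well-formed but invalid XHTML: a <div> inside a <span> still parses.
nested = "<html><body><span><div>hello</div></span></body></html>"

# Not well-formed: the <p> element is never closed.
unclosed = "<html><body><p>hello</body></html>"

print(well_formedness_error(nested))
print(well_formedness_error(unclosed))
```

The first call prints None, since nesting a <div> in a <span> is invalid but still well-formed; the second prints the parser's error message for the unclosed <p>.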
It can be difficult to get JavaScript to work properly in both HTML and XML modes. This is a short-term problem (XHTML 2.0 only has one mode: XML), but it's a serious one. If you use any JavaScript on your pages now, you may be better off waiting to make the jump to XHTML 2.0 all at once, rather than migrating slowly.
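The case-sensitivity caveat isn't a browser quirk; XML DOMs in general behave this way. As a rough illustration outside the browser, here is a sketch that assumes Python's standard xml.dom.minidom module as a stand-in for the browser's XML DOM (the sample document is invented). An uppercase lookup finds nothing, and tag names come back exactly as written:

```python
from xml.dom import minidom

# A tiny XHTML-style document, parsed as XML rather than HTML.
doc = minidom.parseString(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>one</p><p>two</p></body></html>"
)

lower = doc.getElementsByTagName("p")  # matches both paragraphs
upper = doc.getElementsByTagName("P")  # matches nothing in an XML DOM

print(len(lower), len(upper))  # prints: 2 0
print(lower[0].tagName)        # prints: p
```

This is why the W3C advice quoted above boils down to "lowercase everything": a lowercase string is the only spelling that works in both modes.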
Accommodating legacy browsers
As I mentioned, Mozilla is the only major browser that currently
handles application/xhtml+xml correctly; for all other
browsers, you'll need to serve your XHTML as HTML, using the old
text/html MIME type. (Here's
a complete list of browser conformance tests.) But you can't
browser-sniff for Mozilla (for instance, by searching for "Gecko" in the
User-Agent), because some browsers (OmniWeb, Opera) have options to lie
about who they are, and other browsers (Safari) include the magic word
"Gecko" in their User-Agent string by default. Luckily for us, HTTP has a
specific solution for this problem, one which is so elegant (compared to
the rest of this mess) that I didn't believe it would actually work until
I tried it.
Mozilla, in its infinite wisdom, will tell a server that it accepts
application/xhtml+xml in the HTTP Accept header (which most server-side
environments expose as HTTP_ACCEPT) that it sends with every request.
That's it. All scripting environments provide access to these HTTP
headers; so, armed with this nugget of information, we can devise a
variety of ways to serve up the same page as
application/xhtml+xml to browsers that claim to support it
and as text/html to everyone else.
PHP
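<?php
// Read the Accept header; fall back to an empty string if the client sent none.
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
// Serve XHTML as XML only to browsers that claim to support it.
if ( stristr($accept, "application/xhtml+xml") ) {
    header("Content-type: application/xhtml+xml");
}
else {
    header("Content-type: text/html");
}
?>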
Python (CGI script)
import os
# Read the Accept header from the CGI environment (empty if the client sent none).
accept = os.environ.get('HTTP_ACCEPT', '')
if 'application/xhtml+xml' in accept:
    print('Content-type: application/xhtml+xml')
else:
    print('Content-type: text/html')
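One caveat about the substring tests above: they will also match a browser that sends application/xhtml+xml;q=0, which in HTTP means "I explicitly don't want this." A slightly more careful sketch, in Python, parses the quality value first. The function names accepts_xhtml and choose_content_type are hypothetical, not part of any library, and wildcards like */* are deliberately ignored, on the assumption that a browser can send them without being able to handle XHTML as XML.

```python
def accepts_xhtml(accept_header):
    """Return True if the Accept header lists application/xhtml+xml with q > 0."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # HTTP default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value.strip())
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not accepted"
        return quality > 0
    return False


def choose_content_type(accept_header):
    """Pick the MIME type to send, falling back to text/html."""
    if accepts_xhtml(accept_header):
        return "application/xhtml+xml"
    return "text/html"


print(choose_content_type("application/xhtml+xml,text/html;q=0.9"))
print(choose_content_type("application/xhtml+xml;q=0,text/html"))
```

In a CGI script you would pass os.environ.get('HTTP_ACCEPT', '') to choose_content_type and print the result after 'Content-type: '.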
And, finally, if you're just serving up static HTML files, you can use
Apache's mod_rewrite module to dynamically change the MIME type for
conforming browsers by putting these rules in your .htaccess
file:
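RewriteEngine on
RewriteBase /
# The browser must list the XHTML MIME type in its Accept header...
RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
# ...and must not have rejected it with a quality value of exactly zero
RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml\s*;\s*q=0(\.0*)?\s*(,|;|$)
# Only touch .html files requested over HTTP/1.1
RewriteCond %{REQUEST_URI} \.html$
RewriteCond %{THE_REQUEST} HTTP/1\.1
RewriteRule .* - [T=application/xhtml+xml]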
Next month: The Road to XHTML 2.0, part 2. "What happened to my IMG tag?"