2006-09-21

HTML vs. XHTML: it's all in the Content-Type

Well, hell if I knew that the DOCTYPE of a web page doesn't determine how the page gets processed. I was reading Surfin' Safari (the blog by the Apple Safari developers) and was promptly schooled on how to serve XHTML. Turns out that XHTML served with a ``text/html`` content-type gets parsed as a bastardized version of HTML 4. One must use ``application/xhtml+xml`` for a browser to properly parse it as XML.

But this being the web, it isn't quite that simple. Turns out IE 6 treats any file with the ``application/xhtml+xml`` content-type as something to download. Mozilla also varies how it parses a file based on the content-type. So that people can continue to serve IE users, and so that web developers only have to care about one way their page will be parsed and rendered, the Safari developers suggest using ``text/html``. Mark Pilgrim has a good article on this whole topic.
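For the curious, here is a rough sketch of the other way people deal with the IE problem: look at the request's ``Accept`` header and only send ``application/xhtml+xml`` to clients that explicitly ask for it. This is not what the Safari entry recommends (they say to just use ``text/html``), and the function names ``parse_accept`` and ``pick_content_type`` are made up for illustration. It also ignores finer points of content negotiation such as comparing q-values between types.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header):
    """Return a dict mapping each media type in an Accept header to its q-value."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the client gives no explicit value
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def pick_content_type(accept_header):
    """Pick XHTML only if the client lists it explicitly with q > 0.

    A bare wildcard (*/*) is deliberately not treated as XHTML support,
    so clients that never mention the type fall back to text/html.
    """
    prefs = parse_accept(accept_header or "")
    if prefs.get(XHTML, 0.0) > 0.0:
        return XHTML
    return HTML
```

A server would call ``pick_content_type`` with the incoming ``Accept`` header and use the result as the response's ``Content-Type``. The catch is exactly the one described above: the same markup then gets parsed two different ways depending on who is asking.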

But the thing that really sparked this post was the comment in the Surfin' Safari entry about using HTML 4 instead of XHTML 1.0. That feels like staying in the past, as if transitioning to the future had to be all or nothing. I would much rather serve up XHTML as ``text/html`` to get used to the style, and when the dominant browser out there properly recognizes ``application/xhtml+xml``, switch over to serving XHTML properly. That seems like a smoother transition, and it keeps me out of the bad habits that HTML 4 can promote. I plan to continue to use XHTML 1.0 and serve it as ``text/html`` until the day when users catch up with newer browsers and we can all transition over to XHTML 2.0 with the ``application/xhtml+xml`` content-type.