The poor brain boggles. Let’s see if I can make sense of this…

First Don Box switches his RSS feed to support <content:encoded> (which is what I did from early on, BTW). Then Sam Ruby gently chides him for that, proposing the use of <xhtml:body> instead, a form which I hadn’t read about previously.

Then, Don immediately agrees and posts examples – curiously enough, they mess up his <content:encoded> RSS feed to illegibility (at least in NetNewsWire). Looking at the source, I see nested <content:encoded>s and CDATAs, and many unescaped tag delimiters; all this may possibly be syntactically valid, but it’s extremely confusing; I tried to hand-parse it and didn’t get very far. FWIW NetNewsWire seems to agree with me :-) My own feed never nests these things. Don promised to fix this by Monday.

Comments at Sam Ruby’s post soon discuss details and a few samples appear. Sam himself updates his own feed to <xhtml:body>, saying it’s “more bandwidth friendly” than <content:encoded>, which probably won’t be true if all internal tags must also contain the xhtml: prefix, as some argue.
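To put rough numbers on the bandwidth question, here is a minimal sketch that builds the same post fragment both ways and compares byte counts. The sample markup and the helper names (`encoded_form`, `prefixed_form`) are my own inventions for illustration, not anything taken from Sam's or Don's feeds, and the regex that adds prefixes is deliberately naive.

```python
import re
from xml.sax.saxutils import escape


def encoded_form(html):
    # <content:encoded>: the markup is entity-escaped, so an XML parser
    # sees a single opaque string rather than a tree of elements.
    return "<content:encoded>" + escape(html) + "</content:encoded>"


def prefixed_form(html):
    # <xhtml:body> with an "xhtml:" prefix on every inner tag, as some
    # commenters argue is required. Naive regex: it assumes the input
    # contains no literal "<" outside of tags.
    inner = re.sub(r"<(/?)(\w+)", r"<\1xhtml:\2", html)
    return "<xhtml:body>" + inner + "</xhtml:body>"


# Hypothetical sample post, made up for this comparison.
sample = '<p>Hello, <a href="http://example.org/">world</a>!</p>'

for name, form in [("content:encoded", encoded_form(sample)),
                   ("xhtml:body (prefixed)", prefixed_form(sample))]:
    print(name, len(form.encode("utf-8")), "bytes")
```

Each escaped tag costs six extra characters (`&lt;` and `&gt;` instead of `<` and `>`), and each prefixed tag costs six as well (`xhtml:`), so with this naive approach the two forms come out close; which one wins will depend on the actual markup.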

Meanwhile, Jorgen Thelin asks for more stability, arguing that such fast changes in the interpretation of RSS make compliance impossible. The comments to that by Sam and Don are very thought-provoking, and I’ll read them again carefully tomorrow, before I make any changes to my feed.

Sjoerd Visscher, in the meantime, proposes using <xhtml:div> instead of <xhtml:body>; Don disagrees, saying this would not convey the meaning that this tag brackets the real content. Sam argues that the purpose of the whole exercise is to avoid making the structure of the content opaque; he also changes his own feed to the new scheme. NetNewsWire apparently doesn’t understand it, and falls back to using the <description>.

It’ll be interesting to see what newsreader authors say about this. Greg Reinacker says he’s already made the necessary changes in NewsGator. Purely from a newsreader’s perspective, I’m not sure if Sam’s comments about opaqueness apply; newsreader software always has to try to show something, even if the feed is malformed. I suppose that NetNewsWire, for instance, whenever it sees a <content:encoded> tag, just shoves the contents into the lower-right pane, trusting the built-in HTML parser to do the right thing. And once Brent switches over to Apple’s upcoming WebCore, he’ll have even less to worry about. Meanwhile, I’m not sure supporting xhtml: prefixes in NetNewsWire will be trivial; he’ll probably have to prescan and take them out…
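For what it's worth, here is one way the "take the prefixes out" step might look. This is only a sketch and says nothing about how NetNewsWire actually works: instead of a textual prescan it swaps in a namespace-aware XML parser (Python's standard `xml.etree.ElementTree`), and the function names and the sample item are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def strip_xhtml_namespace(elem):
    # ElementTree stores namespaced tags as "{uri}local"; keep only "local"
    # for every element that lives in the XHTML namespace.
    marker = "{" + XHTML_NS + "}"
    for node in elem.iter():
        if isinstance(node.tag, str) and node.tag.startswith(marker):
            node.tag = node.tag[len(marker):]
    return elem


def body_to_html(item_xml):
    # Return the contents of <xhtml:body> as plain, unprefixed markup,
    # or None if the item has no such element (the caller could then
    # fall back to <description>).
    item = ET.fromstring(item_xml)
    body = item.find("{%s}body" % XHTML_NS)
    if body is None:
        return None
    strip_xhtml_namespace(body)
    parts = [body.text or ""]
    parts += [ET.tostring(child, encoding="unicode") for child in body]
    return "".join(parts)


# Hypothetical item, made up for illustration.
item = ('<item xmlns:xhtml="http://www.w3.org/1999/xhtml">'
        '<xhtml:body><xhtml:p>Hi <xhtml:em>there</xhtml:em></xhtml:p>'
        '</xhtml:body></item>')
print(body_to_html(item))
```

The resulting string can then be handed to whatever HTML renderer the newsreader already uses, much as with <content:encoded>.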

On the other hand, I agree that making the contents more structured may help Feedster and similar efforts.

Regarding the <xhtml:div> vs. <xhtml:body> question, my (probably naïve) first reaction is that <xhtml:div> will make things easier for browser-based aggregators, as the contents will be easily insertable into another page; whereas <xhtml:body> tags will have to be removed or converted, and also must contain block elements… isn’t it easier to treat the contents as a div and add an implicit body around it whenever necessary? For my weblog at least, a post is never displayed separately on a page, so my feed reflects exactly my top page, as a list of over a dozen posts. Perhaps both options should be allowed?
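The "treat it as a div, add a body when needed" idea can be sketched in a few lines. Again, this is only an illustration under my own assumptions: the helpers `as_div` and `as_page` are hypothetical names, and I'm using Python's `xml.etree.ElementTree` rather than anything an actual aggregator is known to use.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with the "xhtml:" prefix used in the discussion.
ET.register_namespace("xhtml", XHTML_NS)


def q(name):
    # Qualified tag name in ElementTree's "{uri}local" notation.
    return "{%s}%s" % (XHTML_NS, name)


def as_div(content):
    # For embedding in an existing page: an <xhtml:body> is renamed to
    # <xhtml:div>; an <xhtml:div> is returned unchanged.
    if content.tag == q("body"):
        content.tag = q("div")
    return content


def as_page(content):
    # For standalone display: wrap the div in an implicit html/body.
    html = ET.Element(q("html"))
    body = ET.SubElement(html, q("body"))
    body.append(as_div(content))
    return html


# Hypothetical feed content, made up for illustration.
fragment = ET.fromstring(
    '<xhtml:body xmlns:xhtml="http://www.w3.org/1999/xhtml">'
    '<xhtml:p>A post.</xhtml:p></xhtml:body>')
print(ET.tostring(as_page(fragment), encoding="unicode"))
```

Either direction is a one-line tag rename here, which suggests that accepting both forms would not cost an aggregator much.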

I’m looking forward to learning more about XML, XHTML and RSS from this discussion. Thanks, everybody!