11 January 2013. Double-encoded markup in BBC Atom/RSS feeds.
BBC Atom feeds are double-encoding XHTML markup. The content of a <content type="xhtml">...</content>
element must be actual XHTML markup (a single XHTML div element), not escaped markup.
As a result of this bug, feed readers display the raw markup instead of the formatted content.
See Atom processing model: http://tools.ietf.org/html/rfc4287#section-4.1.3.3
E.g. http://www.bbc.co.uk/blogs/radio4/atom/ contains an entry like this (edited for clarity). Note that the <p>
tag is encoded as &lt;p&gt;.
<entry xmlns:xhtml="http://www.w3.org/1999/xhtml">
<title type="html"><![CDATA[Generations Apart]]></title>
<summary type="html"><![CDATA[<p>Putting Generations Apart together is like a massive jigsaw puzzle. Not one of those ones with huge amounts of blue sky and an annoying lake that’s almost the same colour, but definitely one that has something that appears to be an ever changing vista in the middle. </p>
]]></summary>
<published>2013-01-10T17:18:50+00:00</published>
<updated>2013-01-10T17:18:50+00:00</updated>
<link rel="alternate" type="text/html" href="http://www.bbc.co.uk/blogs/radio4/posts/Generations-Apart"/>
<id>http://www.bbc.co.uk/blogs/radio4/posts/Generations-Apart</id>
<author>
<name>Fi Glover</name>
</author>
<content xmlns:xhtml="http://www.w3.org/1999/xhtml" type="xhtml">
<xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml">&lt;p&gt;&lt;em&gt;Fi Glover presents Generations Apart, ...</xhtml:div>
</content>
</entry>
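
The symptom in the entry above can be checked for mechanically. The following is a minimal Python sketch, not a tool the BBC or any feed reader is known to use: it parses a feed and flags entries whose type="xhtml" content contains tag-like strings as plain text. The function name double_encoded_entries, the TAG_LIKE pattern and the SAMPLE feed (with made-up urn:example ids) are all illustrative assumptions, and the check is only a heuristic, since a post that legitimately discusses tags in its text would also be flagged.

```python
import re
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Heuristic: something that looks like an HTML/XHTML tag inside a text node.
TAG_LIKE = re.compile(r"</?[A-Za-z][A-Za-z0-9:]*(\s[^<>]*)?/?>")

# Hypothetical two-entry feed: one double-encoded entry, one correct entry.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>urn:example:bad</id>
    <content type="xhtml">
      <xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml">&lt;p&gt;&lt;em&gt;Escaped&lt;/em&gt;&lt;/p&gt;</xhtml:div>
    </content>
  </entry>
  <entry>
    <id>urn:example:good</id>
    <content type="xhtml">
      <xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml"><xhtml:p>Real markup</xhtml:p></xhtml:div>
    </content>
  </entry>
</feed>"""


def double_encoded_entries(feed_xml):
    """Return ids of entries whose xhtml content holds escaped markup as text."""
    root = ET.fromstring(feed_xml)
    suspect = []
    for entry in root.iter(f"{{{ATOM_NS}}}entry"):
        for content in entry.findall(f"{{{ATOM_NS}}}content"):
            if content.get("type") != "xhtml":
                continue
            div = content.find(f"{{{XHTML_NS}}}div")
            if div is None:
                continue
            # After XML parsing, escaped markup such as &lt;p&gt; shows up as
            # the literal characters "<p>" inside the div's text nodes.
            text = "".join(div.itertext())
            if TAG_LIKE.search(text):
                suspect.append(entry.findtext(f"{{{ATOM_NS}}}id"))
    return suspect


if __name__ == "__main__":
    print(double_encoded_entries(SAMPLE))
```

In a correctly encoded entry the paragraph and emphasis tags arrive as child elements of the div, so the text nodes contain no tag-like strings and the entry is not flagged.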