
Description of bug in BBC's feeds.


11 January 2013. Double-encoded markup in BBC Atom/RSS feeds.

BBC Atom feeds double-encode XHTML markup. The content of a <content type="xhtml">...</content> element must be a single XHTML div containing the actual markup, not escaped markup.

This bug results in reader software showing the markup rather than the formatted content.

See the Atom processing model for the content element (RFC 4287, section 4.1.3.3): http://tools.ietf.org/html/rfc4287#section-4.1.3.3
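One way to check an entry for this problem is sketched below in Python. It is a rough heuristic, not a validator: after XML parsing, escaped markup such as &lt;p&gt; shows up as literal tag-like text inside the div, whereas correct markup shows up as child elements. The function name `has_escaped_xhtml` and the two sample entries are made up for illustration, and the sketch assumes the entry declares the Atom namespace as the real feed is expected to.

```python
import re
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Matches text that looks like a tag, e.g. "<p>" or "</em>".
TAG_LIKE = re.compile(r"</?[A-Za-z][^<>]*>")


def has_escaped_xhtml(entry_xml):
    """Return True if the entry's xhtml content holds tag-like plain text.

    Hypothetical helper: a heuristic for spotting double-encoded markup.
    """
    entry = ET.fromstring(entry_xml)
    content = entry.find(f"{{{ATOM_NS}}}content")
    if content is None or content.get("type") != "xhtml":
        return False
    div = content.find(f"{{{XHTML_NS}}}div")
    if div is None:
        return False
    # Real markup is parsed into elements and never appears in the text.
    # Escaped markup survives parsing as literal "<p>...</p>" text.
    text = "".join(div.itertext())
    return bool(TAG_LIKE.search(text))


# Illustrative entries (not taken from the BBC feed).
GOOD = (
    '<entry xmlns="http://www.w3.org/2005/Atom"><content type="xhtml">'
    '<div xmlns="http://www.w3.org/1999/xhtml"><p>Hello</p></div>'
    "</content></entry>"
)
BAD = (
    '<entry xmlns="http://www.w3.org/2005/Atom"><content type="xhtml">'
    '<div xmlns="http://www.w3.org/1999/xhtml">&lt;p&gt;Hello&lt;/p&gt;</div>'
    "</content></entry>"
)

if __name__ == "__main__":
    print(has_escaped_xhtml(GOOD))
    print(has_escaped_xhtml(BAD))
```

The check can give false positives for posts that legitimately discuss markup in their text, so it is only a first-pass filter.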

For example, http://www.bbc.co.uk/blogs/radio4/atom/ contains an entry like this (edited for clarity). Note that inside the content element the <p> tag is encoded as &lt;p&gt;.

<entry xmlns:xhtml="http://www.w3.org/1999/xhtml">
    <title type="html"><![CDATA[Generations Apart]]></title>
    <summary type="html"><![CDATA[<p>Putting Generations Apart together is like a massive jigsaw puzzle. Not one of those ones with huge amounts of blue sky and an annoying lake that’s almost the same colour, but definitely one that has something that appears to be an ever changing vista in the middle.  </p>
    ]]></summary>
    <published>2013-01-10T17:18:50+00:00</published>
    <updated>2013-01-10T17:18:50+00:00</updated>
    <link rel="alternate" type="text/html" href="http://www.bbc.co.uk/blogs/radio4/posts/Generations-Apart"/>
    <id>http://www.bbc.co.uk/blogs/radio4/posts/Generations-Apart</id>
    <author>
      <name>Fi Glover</name>
    </author>
    <content xmlns:xhtml="http://www.w3.org/1999/xhtml" type="xhtml">
      <xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml">&lt;p&gt;&lt;em&gt;Fi Glover presents Generations Apart, ...</xhtml:div>
    </content>
</entry>
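Until the feed is fixed, a reader could work around an entry like the one above by re-parsing the div's text as markup. The Python sketch below shows the idea. The function name `reparse_div_text` is hypothetical, and the sketch assumes the escaped markup is well-formed XML with no HTML-only named entities. If that assumption fails, it returns None rather than guessing.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def reparse_div_text(div_text):
    """Parse already-unescaped text such as '<p>...</p>' into an XHTML div.

    Hypothetical workaround: div_text is the text of the xhtml:div as
    returned by an XML parser, i.e. the first level of escaping has
    already been undone by parsing the feed.
    """
    wrapped = f'<div xmlns="{XHTML_NS}">{div_text}</div>'
    try:
        return ET.fromstring(wrapped)
    except ET.ParseError:
        # The text is not well-formed XML; leave the entry untouched.
        return None


if __name__ == "__main__":
    div = reparse_div_text("<p><em>Hi</em> there</p>")
    if div is not None:
        print(ET.tostring(div, encoding="unicode"))
```

This only hides the symptom on the reader side; the feed itself still needs to emit real markup inside the div.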
