@tommorris
Created November 1, 2009 18:22
comments on tweepml
I've been meaning to say this for a while: TweepML <http://tweepml.org/> is
stupid. No, really, it is.
It's based on OPML. OPML is good because it's widely implemented, but as a
format it leaves a lot to be desired. It serves its initial use case pretty
well though.
TweepML is bad for a number of reasons.
1. The name. TweepML is a really bad name imo.
2. It uses XML Namespaces for extensibility, but does not keep the elements it
defines within a namespace.
3. No machine-readable schema has been published. It would take half an hour
for someone who knows what they are doing to write one.
4. It stores textual information inside an attribute. If you are looking at
XML and there's nothing that isn't in an attribute, the person who designed
the format you are using doesn't understand the purpose of XML.
5. Despite being, syntactically, a derivative of OPML, it's not OPML, even
though it gives you no more expressive power than OPML.
So:
<?xml version="1.0" encoding="utf-8"?>
<tweepml version="1.0">
  <head>
    <title>Tech Bloggers</title>
    <date_created>Mon, 29 Jun 2009 02:49:42 GMT</date_created>
    <date_modified>Mon, 29 Jun 2009 02:49:42 GMT</date_modified>
    <generator>TweepML C# Generator v. 1.0</generator>
    <generator_link>http://tweepml.org/generator/csharp</generator_link>
  </head>
  <tweep_list>
    <tweep_list title="TechCrunch">
      <tweep screen_name="arrington" title="Michael Arrington (Founder)" />
      <tweep screen_name="jasonkincaid" title="Jason Kincaid (Writer)" />
      <tweep screen_name="robinwauters" title="Robin Wauters (writer)" />
      <tweep screen_name="parislemon" title="MG Sigler (writer)" />
    </tweep_list>
    <tweep_list title="ReadWriteWeb">
      <tweep screen_name="rww" title="Richard MacManus" />
      <tweep screen_name="marshallk" title="Marshall Kirkpatrick" />
      <tweep screen_name="alexiskold" title="Alex Iskold" />
    </tweep_list>
    <tweep_list title="TechFlash">
      <tweep screen_name="johnhcook" title="John Cook" />
      <tweep screen_name="toddbishop" title="Todd Bishop" />
    </tweep_list>
  </tweep_list>
</tweepml>
could be rewritten as OPML:
<?xml version="1.0" encoding="utf-8"?>
<opml version="2.0">
  <head>
    <title>Tech Bloggers</title>
    <dateCreated>Mon, 29 Jun 2009 02:49:42 GMT</dateCreated>
    <dateModified>Mon, 29 Jun 2009 02:49:42 GMT</dateModified>
  </head>
  <body>
    <outline text="TechCrunch">
      <outline type="link" url="http://twitter.com/arrington" text="Michael Arrington (Founder)" />
      <outline type="link" url="http://twitter.com/jasonkincaid" text="Jason Kincaid (Writer)" />
      <outline type="link" url="http://twitter.com/robinwauters" text="Robin Wauters (writer)" />
      <outline type="link" url="http://twitter.com/parislemon" text="MG Sigler (writer)" />
    </outline>
    <outline text="ReadWriteWeb">
      <outline type="link" url="http://twitter.com/rww" text="Richard MacManus" />
      <outline type="link" url="http://twitter.com/marshallk" text="Marshall Kirkpatrick" />
      <outline type="link" url="http://twitter.com/alexiskold" text="Alex Iskold" />
    </outline>
    <outline text="TechFlash">
      <outline type="link" url="http://twitter.com/johnhcook" text="John Cook" />
      <outline type="link" url="http://twitter.com/toddbishop" text="Todd Bishop" />
    </outline>
  </body>
</opml>
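To show how mechanical that translation is, here is a rough sketch in Python
using the standard library's xml.etree.ElementTree. The function names
(tweepml_to_opml, convert_list) are made up for illustration, the input is a
cut-down copy of the sample above, and dates and generator elements are
ignored. Treat it as a sketch under those assumptions, not a finished converter.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the TweepML sample, without the XML declaration so it can
# be parsed from a Python str.
TWEEPML = """<tweepml version="1.0">
  <head><title>Tech Bloggers</title></head>
  <tweep_list>
    <tweep_list title="TechFlash">
      <tweep screen_name="johnhcook" title="John Cook" />
      <tweep screen_name="toddbishop" title="Todd Bishop" />
    </tweep_list>
  </tweep_list>
</tweepml>"""


def convert_list(tweep_list, parent):
    """Copy the children of a <tweep_list> under an OPML parent element."""
    for child in tweep_list:
        if child.tag == "tweep_list":
            # A nested list becomes a plain grouping outline.
            group = ET.SubElement(parent, "outline", text=child.get("title", ""))
            convert_list(child, group)
        elif child.tag == "tweep":
            # A tweep becomes a type="link" outline pointing at the profile.
            ET.SubElement(
                parent,
                "outline",
                type="link",
                url="http://twitter.com/" + child.get("screen_name"),
                text=child.get("title", ""),
            )


def tweepml_to_opml(source):
    """Hypothetical helper: build an OPML 2.0 tree from a TweepML string."""
    root = ET.fromstring(source)
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = root.findtext("head/title", default="")
    body = ET.SubElement(opml, "body")
    convert_list(root.find("tweep_list"), body)
    return opml


opml = tweepml_to_opml(TWEEPML)
print(ET.tostring(opml, encoding="unicode"))
```

Thirty-odd lines, no information lost, and the output opens in any outliner
that can read OPML.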
If you specifically want to mark it out as an outline of Twitterers, you could
do this:
<?xml version="1.0" encoding="utf-8"?>
<opml version="2.0" xmlns:xhtml="http://www.w3.org/1999/xhtml" xhtml:class="twitter">
<!-- etc... -->
</opml>
See what I did there? Used the XHTML class attribute to say that it's a list
of Twitterers. Really very simple. You could do something fancy with a custom
XML namespace. Effect is the same.
Anyway, the TweepML guys realise that OPML provides the same functionality,
since they describe how to translate ('export') it to OPML 1.0 on their
examples page. Sadly, they get it wrong: it's done as type="rss" with URLs
that point to Twitter API XML rather than RSS. You don't need type="rss"
anyway: you can just use type="link".
This comment by a guy called otto42 -
http://blog.tweepml.org/2009/09/announcing-tweepml-open-standard-format.html?showComment=1253203707981#c1539791266240741015
nails it:
"So, what value does having this new spec add, besides incompatibility with
all existing OPML applications?"
If you really like OPML, you do know you can extend OPML. To do that, you read
the OPML 2.0 specification. It's not very hard.
Better yet, you could skip the OPML-derived syntax altogether and just go
straight for (X)HTML with XOXO:
<ol class="xoxo twitter">
  <li>TechCrunch
    <ol>
      <li><a href="http://twitter.com/arrington">Michael Arrington</a></li>
      <li><a href="http://twitter.com/jasonkincaid">Jason Kincaid</a></li>
      <li><a href="http://twitter.com/robinwauters">Robin Wauters</a></li>
      <li><a href="http://twitter.com/parislemon">MG Sigler</a></li>
    </ol>
  </li>
  <li>ReadWriteWeb
    <ol>
      <li><a href="http://twitter.com/rww">Richard MacManus</a></li>
      <li><a href="http://twitter.com/marshallk">Marshall Kirkpatrick</a></li>
      <li><a href="http://twitter.com/alexiskold">Alex Iskold</a></li>
    </ol>
  </li>
  <li>TechFlash
    <ol>
      <li><a href="http://twitter.com/johnhcook">John Cook</a></li>
      <li><a href="http://twitter.com/toddbishop">Todd Bishop</a></li>
    </ol>
  </li>
</ol>
With the XHTML version, you could also use microformats, so you'd get:
<li class="vcard"><a href="http://twitter.com/marshallk" class="fn url">Marshall Kirkpatrick</a></li>
If you want to express richer semantics, you can also use RDFa or combine them
with other microformats (maybe XFN).
This has the advantage of being machine readable *and* human readable - it's
just HTML.
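As a rough illustration of the machine-readable half, here is a Python sketch
that pulls the groups and links back out of a XOXO list. It assumes the markup
is well-formed XHTML, so the standard xml.etree.ElementTree parser can read it
(tag-soup HTML would need a real HTML parser), and read_groups is a
hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the XOXO list above, with the hCard classes added.
XOXO = """<ol class="xoxo twitter">
  <li>TechFlash
    <ol>
      <li class="vcard"><a href="http://twitter.com/johnhcook" class="fn url">John Cook</a></li>
      <li class="vcard"><a href="http://twitter.com/toddbishop" class="fn url">Todd Bishop</a></li>
    </ol>
  </li>
</ol>"""


def read_groups(source):
    """Return {group name: [(display name, profile URL), ...]}."""
    root = ET.fromstring(source)
    groups = {}
    for item in root.findall("li"):
        # The group name is the text that precedes the nested <ol>.
        name = (item.text or "").strip()
        groups[name] = [(a.text, a.get("href")) for a in item.findall("ol/li/a")]
    return groups


groups = read_groups(XOXO)
print(groups)
```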
6. Dates in TweepML use RFC 1123 format. Dude, it's XML. Do the designers of
the format know absolutely nothing about XML? If you are representing dates,
especially in XML, use ISO 8601 date format. Pretty much every XML schema
validation tool understands xsd:date and xsd:dateTime, which are based on ISO
8601. So if you want to make sure that the <date_created> isn't
"0000-13-38 28:78:95" with the timezone set as Atlantis, use 8601. It makes
the lives of XSLT people easier too, because if you are doing XSLT, most of
the time you want to just pass the ISO 8601 datetime strings through rather
than having to convert between 1123 format and 8601 format.
The spec says:
"<date_created> / <date_modified> should use RFC 1123 format and be converted
to UTC timezone."
WTF?
RFC 1123 date format sucks in OPML. If you aren't maintaining backward
compatibility with OPML, there's no reason to use RFC 1123 date formats over
ISO 8601 datetimes.
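For anyone stuck consuming the RFC 1123 dates, the conversion is at least
short. This is a sketch in Python using only the standard library; the helper
names rfc1123_to_iso8601 and is_valid_iso8601 are made up for illustration,
and datetime.fromisoformat only accepts a subset of ISO 8601, so it stands in
here for a proper xsd:dateTime validator.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def rfc1123_to_iso8601(value):
    """Parse an RFC 1123 / RFC 822 style date and re-emit it as ISO 8601 UTC."""
    parsed = parsedate_to_datetime(value)
    # Note: a date with no usable zone would be treated as local time here.
    return parsed.astimezone(timezone.utc).isoformat()


def is_valid_iso8601(value):
    """Cheap validity check: does the standard library accept the string?"""
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


iso = rfc1123_to_iso8601("Mon, 29 Jun 2009 02:49:42 GMT")
print(iso)                                      # 2009-06-29T02:49:42+00:00
print(is_valid_iso8601(iso))                    # True
print(is_valid_iso8601("0000-13-38T28:78:95"))  # False
```

The point is the last line: with an ISO 8601 string, rejecting nonsense is a
one-liner in most environments, and a schema can do it for you.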
7. It's now redundant. Twitter has Lists. And Lists is going to be part of the
API. TweepML is DOA for business reasons.
8. It's TOOL SPECIFIC. Hmm. Let's say you want to represent a list of Twitter
users, identi.ca users, Yammer users, FriendFeed users and Jaiku users. And
maybe the odd Facebook status update user. Sorry. TweepML only lets you
represent Twitter users. The point of a standard is to standardise, not to
come up with a new one for every single new service or platform.
9. Many of the elements have arbitrary maximum string lengths. contact_name is
arbitrarily limited to 50 characters. title is limited to 80. description to
250. generator_link - a URL - is 100! tags is limited to only ten tags.
There's no explanation for any of these limits, and the spec doesn't say
whether they count characters, code points or bytes. That's a glorious recipe
for ridiculous levels of fail involving Unicode.
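To make the Unicode problem concrete, here is a small Python sketch that
measures the same name three ways. The lengths helper is hypothetical; it
only illustrates that a limit stated without a unit is ambiguous, since I
don't know which unit the TweepML authors had in mind.

```python
import unicodedata


def lengths(text):
    """Measure a string in the three units a length limit might mean."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


# The same visible name, precomposed (NFC) and decomposed (NFD).
composed = unicodedata.normalize("NFC", "Jos\u00e9")
decomposed = unicodedata.normalize("NFD", composed)
bird = "\U0001F426"  # a single emoji outside the Basic Multilingual Plane

print(lengths(composed))    # 4 code points, 5 UTF-8 bytes, 4 UTF-16 units
print(lengths(decomposed))  # 5 code points, 6 UTF-8 bytes, 5 UTF-16 units
print(lengths(bird))        # 1 code point, 4 UTF-8 bytes, 2 UTF-16 units
```

Two implementations that both enforce "50 characters" can disagree about
whether a given contact_name is valid.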
10. The tags are comma separated. Yeah, get that. That so reminds me of the
'csvdata' element featured on The Daily WTF back in 2006:
http://thedailywtf.com/Articles/XML_vs_CSV__0x3a__The_Choice_is_Obvious.aspx
No mention is made of how you escape those commas - CSV style?
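Here is the ambiguity in a few lines of Python. The tag string is invented,
and CSV-style quoting is purely my assumption about what an escaping rule
might look like, since the spec defines none.

```python
import csv


def split_naive(tags):
    """Split on every comma, which is all the spec appears to describe."""
    return [tag.strip() for tag in tags.split(",")]


def split_csv(tags):
    """Split using CSV quoting rules (an assumption, not part of TweepML)."""
    return [tag.strip() for tag in next(csv.reader([tags]))]


tags = 'web 2.0,"startups, vc",news'
naive = split_naive(tags)
quoted = split_csv(tags)

print(naive)   # ['web 2.0', '"startups', 'vc"', 'news']
print(quoted)  # ['web 2.0', 'startups, vc', 'news']
```

Same input, two different tag lists, and nothing in the spec says which one
is right.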
11. error_code and error_description. Quadruple WTF? "A machine code
indicating the type of error occurred during the request. This property should
be omitted if TweepML is being used as file or if the request was successful".
Someone hasn't understood HTTP! If there is an error, you just put it in the
HTTP header. HTTP has a way of representing almost any possible resource
state that you might need.
"TweepML can be used both, as a file storage to transfer information, or a
response protocol for a web service."
-> Like all formats.
"The only basic difference is the use of error_code and error_description in
case of protocol mode to indicate the presence of an error with the request."
-> Where are these error_codes defined? This is really dumb. A status code
would imply something reusable.
"When serving a TweepML file over HTTP, remember to set the Content-Type HTTP
Header to "text/xml". For parsers this should not make any difference, but if
you don't do that the browser might not display the file to the user
properly."
-> Err, yeah, you should follow this advice if you've decided to publish in
this WTFML. There are other reasons why you should serve content with the
correct Content-Type - namely, because you might want to do content
negotiation.
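A sketch of what "just put it in the HTTP header" looks like, in Python with
the standard library's http.HTTPStatus. The failure names in FAILURES are
hypothetical (TweepML doesn't publish its error codes), and error_response is
an invented helper that returns the pieces of a response rather than sending
anything.

```python
from http import HTTPStatus

# Hypothetical failure names mapped onto statuses every HTTP client already
# understands. These names are illustrative, not taken from the TweepML spec.
FAILURES = {
    "unknown_list": HTTPStatus.NOT_FOUND,
    "not_logged_in": HTTPStatus.UNAUTHORIZED,
    "rate_limited": HTTPStatus.TOO_MANY_REQUESTS,
}


def error_response(failure):
    """Return (status line, headers, body) for a failed request."""
    status = FAILURES.get(failure, HTTPStatus.INTERNAL_SERVER_ERROR)
    status_line = f"HTTP/1.1 {status.value} {status.phrase}"
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    return status_line, headers, status.phrase


print(error_response("unknown_list")[0])  # HTTP/1.1 404 Not Found
print(error_response("rate_limited")[0])  # HTTP/1.1 429 Too Many Requests
```

The client finds out about the failure from the status line, before it parses
a single byte of XML.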
12. Language handling is not defined. Being able to say "this account we're
linking to is in English" (etc.) is useful. xml:lang exists. Use it.
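A quick Python sketch of how little work xml:lang is for a consumer. The
accounts and their languages are invented placeholders, putting xml:lang on
OPML outline elements is my assumption about how you'd apply it, and
languages is a hypothetical helper. The inheritance from parent to child is
standard xml:lang behaviour.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Placeholder accounts; the languages are made up for the example.
OPML = """<opml version="2.0">
  <body>
    <outline text="Example group" xml:lang="en">
      <outline type="link" url="http://twitter.com/example_en" text="Example One" />
      <outline type="link" url="http://twitter.com/example_fr" text="Example Two" xml:lang="fr" />
    </outline>
  </body>
</opml>"""


def languages(element, inherited=None):
    """Yield (url, language) pairs, inheriting xml:lang from ancestors."""
    lang = element.get(XML_LANG, inherited)
    if element.get("type") == "link":
        yield element.get("url"), lang
    for child in element:
        yield from languages(child, lang)


pairs = list(languages(ET.fromstring(OPML)))
print(pairs)
```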
I'll finish with a plea: if you are thinking of designing a "markup language"
- that is, not an actual markup language but a format which has the capital
letters ML at the end - stop. You are going to suck at it. Find someone to
tell you how you are doing it wrong.