@marktyers
Last active August 29, 2015 13:56
# Python 3 code showing how to parse an RSS feed using the DOM #321COM
from urllib.request import urlopen
from xml.dom.minidom import parseString

# http://www.highways.gov.uk/news/connect/rss-feeds/traffic-information-rss-feeds/
url = 'http://hatrafficinfo.dft.gov.uk/feeds/rss/CurrentAndFutureEvents/West%20Midlands.xml'

# Download the feed and parse the whole document into a single DOM tree.
with urlopen(url) as f:
    xml = f.read()
xmldoc = parseString(xml)
#print(xmldoc.toprettyxml())

# First <description> inside <channel>; this is the feed's own description
# as long as it appears before any <item>.
title = xmldoc.getElementsByTagName('channel')[0].getElementsByTagName('description')[0]
print('====================================================')
print(title.firstChild.nodeValue)
print('====================================================')

# Each <item> is one entry in the feed; print its category and description.
items = xmldoc.getElementsByTagName('item')
print('----------------------------------------------------')
for item in items:
    category = item.getElementsByTagName('category')[0]
    description = item.getElementsByTagName('description')[0]
    #print(description.toprettyxml())
    print(category.firstChild.nodeValue)
    print(description.firstChild.nodeValue)
    print('----------------------------------------------------')
@marktyers
Author

This loads the entire XML feed into a single in-memory DOM object, so it is not suitable for huge XML datasets. If the dataset is huge, use a streaming SAX approach instead; otherwise you may run out of memory.
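
For comparison, here is a minimal sketch of what a SAX version might look like, written for Python 3 with the standard `xml.sax` module. The `ItemHandler` class, the `parse_items` helper, and the `SAMPLE` feed are all hypothetical names and data made up for illustration; they are not part of the gist, and the sample is not real traffic data. It assumes each `<item>` has plain-text `<category>` and `<description>` children, as the DOM code does.

```python
import io
import xml.sax


class ItemHandler(xml.sax.ContentHandler):
    """Collects the <category> and <description> text of each <item>."""

    FIELDS = ('category', 'description')

    def __init__(self):
        super().__init__()
        self.items = []      # one dict per completed <item>
        self._item = None    # dict being filled, or None outside an <item>
        self._field = None   # name of the field currently being read
        self._text = []      # text chunks for the current field

    def startElement(self, name, attrs):
        if name == 'item':
            self._item = {}
        elif self._item is not None and name in self.FIELDS:
            self._field = name
            self._text = []

    def characters(self, content):
        # SAX may deliver a single text node in several chunks.
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == self._field:
            self._item[name] = ''.join(self._text).strip()
            self._field = None
        elif name == 'item' and self._item is not None:
            self.items.append(self._item)
            self._item = None


def parse_items(source):
    """Parse a file-like object (or filename) and return the list of items."""
    handler = ItemHandler()
    xml.sax.parse(source, handler)
    return handler.items


# Made-up sample feed so the sketch runs without network access.
SAMPLE = b"""<?xml version="1.0"?>
<rss version="2.0"><channel>
<description>Sample feed</description>
<item><category>Road Works</category><description>M6 lane closed</description></item>
<item><category>Accident</category><description>A45 delays</description></item>
</channel></rss>"""

for entry in parse_items(io.BytesIO(SAMPLE)):
    print(entry['category'])
    print(entry['description'])
    print('-' * 52)
```

The parser never builds a tree; it only keeps the two fields of each item. This sketch still appends every item to a list, so for a truly huge feed you would print or store each item inside `endElement` instead of collecting them.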
