@billday
Created February 16, 2011 15:45
import csv
import re

# NOTE: devzonedir, csvinput and itemreader are not defined in this excerpt;
# they come from earlier in the full script (Python 2).
csvtopics = open("devzone.topics.csv", "rb")
topicreader = csv.reader(csvtopics, dialect='excel')
# Summary file: one row per topic with its matching item count.
csvnumitems = open(devzonedir+"devzone.topics.items.csv", "wb")
numitemswriter = csv.writer(csvnumitems, dialect='excel')
for topic in topicreader:
    currenttopic = topic[0]
    # Build a filename-safe topic name: drop spaces, spell out dots.
    topicfile = (currenttopic.replace(' ', '')).replace('.', 'dot')
    csvcurrenttopic = open(devzonedir+"devzone.analysis.topic."+topicfile+".csv", "wb")
    topicwriter = csv.DictWriter(csvcurrenttopic, fieldnames=['pubDate', 'articleOrBlog', 'title', 'link', 'hitCount'], restval='', extrasaction='ignore', dialect='excel')
    # Rewind the input so every topic scans the full item list.
    csvinput.seek(0)
    items = 0
    for row in itemreader:
        # The topic is used as a regular expression against title and description.
        if re.search(currenttopic,row['title']) or re.search(currenttopic,row['description']):
            topicwriter.writerow(row)
            items += 1
    numitemswriter.writerow([currenttopic, items])
    print topicfile, "topic contains", items, "items"
    csvcurrenttopic.close()
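
The excerpt above depends on files and variables defined elsewhere in the full script, so it cannot be run on its own. As a rough illustration of the same filtering idea, here is a minimal, self-contained Python 3 sketch that works on in-memory data. The helper names (`topic_filename`, `filter_by_topic`, `write_topic_csv`) and the sample topics and rows are hypothetical and are not taken from the original script; only the field names, the filename rule, and the `DictWriter` settings mirror the excerpt.

```python
import csv
import io
import re

# Field names copied from the excerpt's DictWriter call.
FIELDS = ['pubDate', 'articleOrBlog', 'title', 'link', 'hitCount']


def topic_filename(topic):
    """Mirror the excerpt's naming rule: drop spaces, spell out dots."""
    return topic.replace(' ', '').replace('.', 'dot')


def filter_by_topic(topics, rows):
    """Return {topic: [matching rows]}.

    As in the excerpt, each topic string is used as a regular expression
    and searched for in both the title and the description.
    """
    matches = {}
    for topic in topics:
        pattern = re.compile(topic)
        matches[topic] = [
            row for row in rows
            if pattern.search(row['title']) or pattern.search(row['description'])
        ]
    return matches


def write_topic_csv(rows):
    """Write rows to an in-memory CSV string using the excerpt's writer settings."""
    out = io.StringIO(newline='')
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval='',
                            extrasaction='ignore', dialect='excel')
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample data, for illustration only.
sample_rows = [
    {'pubDate': '2011-02-01', 'articleOrBlog': 'article',
     'title': 'Intro to Python', 'description': 'A first look',
     'link': 'http://example.com/1', 'hitCount': '10'},
    {'pubDate': '2011-02-02', 'articleOrBlog': 'blog',
     'title': 'Weekly roundup', 'description': 'News on Node.js and Python',
     'link': 'http://example.com/2', 'hitCount': '5'},
    {'pubDate': '2011-02-03', 'articleOrBlog': 'blog',
     'title': 'Unrelated post', 'description': 'Nothing here',
     'link': 'http://example.com/3', 'hitCount': '1'},
]
sample_topics = ['Python', 'Node.js']

result = filter_by_topic(sample_topics, sample_rows)
counts = {topic: len(rows) for topic, rows in result.items()}
for topic in sample_topics:
    print(topic_filename(topic), "topic contains", counts[topic], "items")
```

Because each topic is treated as a regular expression, a topic such as `Node.js` would also match text like `Nodexjs`. If literal matching is what is wanted, wrapping the topic in `re.escape()` is one option, though that is a change from what the excerpt does.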
billday commented Feb 16, 2011

This code excerpt is discussed in the article "Divining DevZone Insight from Filtered Feeds and Deep Pages: Part 2, The Analysis" at http://bit.ly/gXiAny
