@Ayrx
Created June 16, 2013 08:23
This is a simple Python script that prints the HTML comments found in a web page, each prefixed with the line number where it starts.
#!/usr/bin/env python
# Python 2 script: prints every HTML comment found at the given URL.
import argparse
import urllib
from HTMLParser import HTMLParser

# The page to scan is the single positional command-line argument.
arg_parser = argparse.ArgumentParser()
arg_parser.add_argument('url')
args = arg_parser.parse_args()


class MyParser(HTMLParser):
    # Called by HTMLParser for each <!-- ... --> comment it encounters.
    def handle_comment(self, data):
        print 'Line ' + str(self.getpos()[0]) + ' ' + data


html_parser = MyParser()
try:
    content = urllib.urlopen(args.url).read()
    html_parser.feed(content)
except IOError:
    print 'Please ensure that the URL is formatted properly'
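
The script above targets Python 2 (`urllib.urlopen`, the `HTMLParser` module, and `print` statements). As a rough sketch only, the same comment-extraction idea might look like this in Python 3, assuming the page content is already available as a string. The names `CommentCollector` and `extract_comments` are hypothetical and not part of the original gist; the sketch collects the comments into a list rather than printing them directly, which makes the logic easier to check without a network connection.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Hypothetical helper: stores (line_number, comment_text) pairs."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # getpos() returns (line, offset); the line is 1-based.
        self.comments.append((self.getpos()[0], data))


def extract_comments(html):
    """Return every HTML comment in `html` with the line it starts on."""
    parser = CommentCollector()
    parser.feed(html)
    parser.close()
    return parser.comments


if __name__ == "__main__":
    sample = "<html>\n<!-- todo: remove -->\n<body><!--debug--></body>\n</html>"
    for line, text in extract_comments(sample):
        # Same output shape as the original script: "Line <n> <comment>"
        print("Line {} {}".format(line, text))
```

To scan a live page, the string passed to `extract_comments` could come from `urllib.request.urlopen(url).read()` after decoding the bytes to text; the right encoding depends on the page, so that step is left out here.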