@jizhang
Created August 9, 2017 22:21
Crawl CSDN Blog's Page Views
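#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
import sys
import urllib2
import datetime

# Fetch the blog home page (Python 2: urllib2 returns the body as a byte string).
html = urllib2.urlopen('http://blog.csdn.net/zjerryj').read()
# The counter is rendered as "访问:<span>N次</span>" ("Visits: N times").
mo = re.search(r'访问:<span>(\d+)次</span>', html)
if mo is None:
    sys.exit('page-view counter not found; the page layout may have changed')
# Append one "timestamp,count" line per run.
with open('/home/jizhang/csdn.log', 'a') as f:
    f.write('%s,%s\n' % (datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'), mo.group(1)))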
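The script above mixes fetching, parsing, and logging in one pass, which makes the parsing step hard to check without hitting the network. A minimal sketch of how the parsing and line formatting could be pulled into small functions follows. It assumes Python 3 and already-decoded text rather than the Python 2 byte string the script works with; the names `VIEWS_PATTERN`, `parse_views`, and `format_log_line` are hypothetical, and the sample HTML fragment is made up for illustration, not copied from the real page.

```python
import re
from datetime import datetime

# Same pattern as the script above: "访问:<span>N次</span>" ("Visits: N times").
VIEWS_PATTERN = re.compile(r'访问:<span>(\d+)次</span>')


def parse_views(html):
    """Return the page-view count found in html, or None if the marker is absent."""
    mo = VIEWS_PATTERN.search(html)
    return int(mo.group(1)) if mo else None


def format_log_line(when, views):
    """Build one line in the same 'timestamp,count' layout the script writes."""
    return '%s,%s\n' % (when.strftime('%Y-%m-%d %H:%M:%S'), views)


# Hypothetical fragment used only to exercise the functions.
sample = '<li>访问:<span>12345次</span></li>'
print(format_log_line(datetime(2017, 8, 9, 22, 21, 0), parse_views(sample)), end='')
```

Returning `None` instead of raising lets the caller decide whether a missing counter should abort the run or just skip one log entry.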