@kenneth
kenneth / gist:7539784
Created November 19, 2013 03:22
Crawler debugging, connection reset
# python gistfile1.py
got no cookie
263667 added to thread 2013-11-19 11:17:46
thread_Thread-1 doing task 263667
263667 data collection started
fail op. task = {'url': '/qqsign/ss.php?', 'method': 'GET', 'headers': {'Referer': 'http://www.qqxoo.com/main.html?qqid=263667', 'Host': 'www.qqxoo.com', 'X-Requested-With': 'XMLHttpRequest', 'Accept': 'text/html, */*', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.92 Safari/537.1 LBBROWSER'}, 'params': {'qqid': '263667'}, 'conn_args': {'host': 'www.qqxoo.com'}, 'id': '263667'}, status = 503
263667 applying regex to data.
263667 added to thread 2013-11-19 11:18:46
thread_Thread-2 doing task 263667
Exception in thread Thread-2:
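The log above shows a request failing with status 503 and a later thread dying with an uncaught exception. One common way to handle this kind of transient failure is to retry with exponential backoff. The following is only a minimal sketch under assumptions: it targets Python 3 (`ConnectionResetError` does not exist in Python 2), and `fetch_with_retry`, `do_request`, and `RETRYABLE_STATUSES` are hypothetical names that do not come from the gist.

```python
import time

# Assumption: only 503 is treated as a transient, retryable status.
RETRYABLE_STATUSES = {503}


def fetch_with_retry(do_request, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call do_request() until it succeeds or attempts run out.

    do_request is a hypothetical zero-argument callable that returns
    a (status, body) tuple or raises ConnectionResetError.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            status, body = do_request()
            if status not in RETRYABLE_STATUSES:
                return status, body
            last_error = RuntimeError("server returned status %d" % status)
        except ConnectionResetError as exc:
            # Catching the error here keeps the worker thread alive.
            last_error = exc
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing `sleep` as a parameter is a design choice that lets the backoff schedule be checked without actually waiting.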
@kenneth
kenneth / gist:5633351
Created May 23, 2013 07:46
a newbie regex problem?
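The sample text in this gist contains BBCode-style URL tags in several forms: upper- or lowercase tag names, a quoted or unquoted target, and no target at all. The gist preview does not state what the regex should do, so the following is only a sketch that assumes the goal is to extract each link's target and text. `URL_TAG` and `extract_links` are hypothetical names, not part of the original script.

```python
import re

# Optional ="target" or =target after the tag name, then the link text.
# re.IGNORECASE covers both [URL] and [url].
URL_TAG = re.compile(
    r'\[url(?:="?(?P<href>[^"\]]+)"?)?\](?P<text>.*?)\[/url\]',
    re.IGNORECASE,
)


def extract_links(text):
    """Return (target, link_text) pairs for every URL tag in text.

    When a tag has no explicit target, the link text is used as the target.
    """
    return [
        (m.group("href") or m.group("text"), m.group("text"))
        for m in URL_TAG.finditer(text)
    ]
```

For example, `extract_links('[url]http://www.boston.com/bigpictu....html[/url]')` yields a single pair whose target and text are both the tag's inner text.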
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import re
content="""
[URL="http://www.boston.com/bigpicture/2008/10/the_sun.html"]http://www.boston.com/bigpictu....html[/URL]
[url="http://www.boston.com/bigpicture/2008/10/the_sun.html"]http://www.boston.com/bigpictu....html[/url]
[URL=http://www.boston.com/bigpicture/2008/10/the_sun.html]http://www.boston.com/bigpictu....html[/URL]
[url=http://www.boston.com/bigpicture/2008/10/the_sun.html]http://www.boston.com/bigpictu....html[/url]
[url]http://www.boston.com/bigpictu....html[/url]