@recall704
recall704 / gist:d80a298e2af679429dd4
Created May 1, 2015 06:27
Decode a gzip-compressed web page as GBK
import gzip
import cStringIO
import urllib2

url = 'http://www.xxoo.com/'
req = urllib2.Request(url)
req.add_header('Accept-Encoding', 'gzip, deflate')
f = urllib2.urlopen(req, timeout=30)
html = f.read()
# a gzip stream starts with the magic bytes \x1f\x8b
if html[:2] == '\x1f\x8b':
    html = gzip.GzipFile(fileobj=cStringIO.StringIO(html)).read()
a = html.decode('gbk')
print a
@recall704
recall704 / pachong.py
Created May 7, 2015 13:55
Scrape proxy IPs and ports from http://pachong.org/
#coding:utf-8
import re
import requests
url = 'http://pachong.org/high.html'
req = requests.get(url)
if req.status_code == 200:
    html = req.text
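The preview cuts off after the page is fetched. Below is a minimal sketch of how the extraction step might look, repeating the fetch for completeness; it assumes each proxy appears in the markup as an IPv4 address followed within a few characters by a port number, which is a guess, since the real gist may parse the HTML table differently.
#coding:utf-8
# hypothetical continuation of pachong.py: pull ip/port pairs out of the page text
import re
import requests

req = requests.get('http://pachong.org/high.html')
if req.status_code == 200:
    # assumption: an IPv4 address followed within a few characters by a 2-5 digit port
    for ip, port in re.findall(r'(\d{1,3}(?:\.\d{1,3}){3})\D{1,20}?(\d{2,5})', req.text):
        print ip, port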
@recall704
recall704 / result
Last active August 29, 2015 14:21
Getting the caller function name inside another function in Python?
i am function g
i am function f
('my parent function is', 'g')
i am function k
i am function f
('my parent function is', 'k')
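Only the output survives in the preview, not the code. Here is a minimal sketch that reproduces it with inspect.stack(), assuming that is roughly the approach the gist took; the function names f, g and k are taken from the output above.
#coding:utf-8
import inspect

def f():
    print 'i am function f'
    # one frame up the call stack is whoever called f()
    caller = inspect.stack()[1][3]
    print('my parent function is', caller)

def g():
    print 'i am function g'
    f()

def k():
    print 'i am function k'
    f()

g()
k()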
#coding:utf-8
import os
import urllib
from pyquery import PyQuery
# download all images on a given url into the target directory
def get_img_and_save(url, target_dir):
    pyobj = PyQuery(url)
    img_objs = pyobj('img')
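The preview ends after the img elements are selected. A hypothetical full version of the function follows, repeating the lines shown above for completeness; it assumes each image's src is resolved against the page url with urllib.basejoin and saved under its basename with urllib.urlretrieve, which is one plausible way to finish the gist.
#coding:utf-8
# hypothetical full version of get_img_and_save: save every <img> src into target_dir
import os
import urllib
from pyquery import PyQuery

def get_img_and_save(url, target_dir):
    pyobj = PyQuery(url)
    img_objs = pyobj('img')
    if not os.path.exists(target_dir):
        os.makedirs(target_dir)
    for img in img_objs.items():
        src = img.attr('src')
        if not src:
            continue
        # resolve relative srcs against the page url (basejoin is urljoin in python 2)
        img_url = urllib.basejoin(url, src)
        urllib.urlretrieve(img_url, os.path.join(target_dir, os.path.basename(img_url)))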
#!/usr/bin/env python
#-------------------------------------------------------------------------------
# Name: raw_http.py
# Purpose: construct a raw http get packet
#
# Author: Yangjun
#
# Created: 08/02/2014
# Copyright: (c) Yangjun 2014
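The body of raw_http.py is not shown. Below is a minimal sketch of the idea named in the header, building an HTTP GET request by hand and sending it over a plain socket; www.example.com is a placeholder host, not anything from the original script.
#!/usr/bin/env python
# minimal sketch: construct a raw HTTP GET request and send it over a socket
import socket

host = 'www.example.com'
request = ('GET / HTTP/1.1\r\n'
           'Host: %s\r\n'
           'Connection: close\r\n'
           '\r\n' % host)

s = socket.create_connection((host, 80), timeout=30)
s.sendall(request)
response = ''
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    response += chunk
s.close()
print response[:200]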
#-*- coding: UTF-8 -*-
# author : recall404
# email : tk657309822@gmail.com
"""
Convert xls column letters to numbers so that xlrd and friends can read the data by index.
The source was copied from XlsxWriter
with two small modifications:
1. ignore case
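The converted routine itself is cut off in the preview. Below is a minimal sketch of the base-26 conversion the docstring describes, case-insensitive as in modification 1; the name col_letter_to_index is made up here, since the gist's own version was copied from XlsxWriter.
#-*- coding: UTF-8 -*-
# read the column letters as a base-26 number, ignoring case
def col_letter_to_index(col_str):
    # 'A' -> 0, 'Z' -> 25, 'AA' -> 26, 'ab' -> 27 (zero-based, as xlrd expects)
    col = 0
    for char in col_str.upper():
        col = col * 26 + (ord(char) - ord('A') + 1)
    return col - 1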
@recall704
recall704 / supervisord.service
Created March 28, 2016 07:06 — forked from tonyseek/supervisord.service
Running supervisord with systemd.
[Unit]
Description=supervisord - Supervisor process control system for UNIX
Documentation=http://supervisord.org
After=network.target
[Service]
Type=forking
ExecStart=/usr/bin/supervisord -c /etc/supervisord.conf
ExecReload=/usr/bin/supervisorctl reload
ExecStop=/usr/bin/supervisorctl shutdown
{
    "default_line_ending": "unix",
    "draw_white_space": "all",
    "font_face": "Source Code Pro",
    "font_size": 18,
    "ignored_packages":
    [
        "Jedi - Python autocompletion",
        "Vintage"
    ],
# -*- coding: utf-8 -*-
import re
import scrapy
from scrapy.utils.response import get_base_url
from pyquery import PyQuery
from biquge.items import BiqugeItem
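Only the imports survive in this preview. A hypothetical skeleton of the spider they suggest is sketched below, parsing responses with PyQuery into a BiqugeItem; the start url and the item field name are assumptions, since the real biquge project is not shown.
# -*- coding: utf-8 -*-
# hypothetical skeleton of the spider behind the imports above
import scrapy
from pyquery import PyQuery
from biquge.items import BiqugeItem

class BiqugeSpider(scrapy.Spider):
    name = 'biquge'
    start_urls = ['http://www.example.com/']  # placeholder, not the real site

    def parse(self, response):
        pq = PyQuery(response.text)
        item = BiqugeItem()
        item['title'] = pq('title').text()  # assumed field name
        yield item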
#!/bin/bash
# reset the local scrapy-docs-cn checkout served by nginx to the remote branch
docs_dir=/home/recall/docker_data/nginx/scrapy-docs-cn
git=/usr/bin/git
branch=master
cd $docs_dir
$git reset --hard origin/$branch
$git clean -f