Install Scrapy in a virtualenv on CentOS 6.5, and install MySQL to store the crawled data. After installation, the `spider` directory contains all the downloaded files, and `spider_env` is the virtualenv directory.
#!/usr/bin/env bash
mkdir spider
cd spider/
yum update -y
# enable the EPEL repo, then install pip on CentOS 6
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install -y python-pip
# install wget
yum install -y wget
# fix for `ImportError: No module named pkg_resources`
yum install -y python-devel
# other packages that may be needed
yum install -y libxml2 libxml2-devel
yum install -y libxslt libxslt-devel
yum install -y openssl openssl-devel
yum install -y libffi libffi-devel
yum install -y gcc-c++.x86_64
yum install -y gcc
yum install -y atlas
yum install -y atlas-devel
yum install -y lapack-devel
yum install -y blas-devel
yum install -y zlib-devel
# install git
yum install -y git
# fix for `error: command 'gcc' failed with exit status 1` when installing mysql-python
yum install -y mysql-devel
# install mysql; the version number may differ, see http://dev.mysql.com/doc/mysql-yum-repo-quick-guide/en/ for details
wget http://repo.mysql.com//mysql57-community-release-el6-8.noarch.rpm
rpm -Uvh mysql57-community-release-el6-8.noarch.rpm
yum install -y mysql-community-server
# fix for `ImportError: No module named setuptools`
wget https://pypi.python.org/packages/d3/16/21cf5dc6974280197e42d57bf7d372380562ec69aef9bb796b5e2dbbed6e/setuptools-20.10.1.tar.gz#md5=cc3f063d05e3bff4d3fa07a5a1017c3b
tar -xvf setuptools-20.10.1.tar.gz
cd setuptools-20.10.1
python setup.py install
cd ..
pip install virtualenv
pip install scrapy
pip install scrapyd
# -p expects a Python interpreter at /usr/local/bin/python; adjust the path if yours differs
virtualenv spider_env -p /usr/local/bin/python --system-site-packages
source spider_env/bin/activate
pip install --upgrade pip
# install twisted in the virtualenv before pip-installing from requirement.txt
wget https://pypi.python.org/packages/source/T/Twisted/Twisted-16.1.1.tar.bz2
tar -jvxf Twisted-16.1.1.tar.bz2
cd Twisted-16.1.1
python setup.py install
cd ..
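
Once the script has finished, it can be worth confirming that the tools it installs are actually on `PATH` inside the activated virtualenv. The following is a minimal sketch, not part of the original script: `check_commands` is a hypothetical helper name, and the list of commands to check is an assumption based on what the script installs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): report whether each of
# the given commands can be found on PATH.
# Prints one "ok: NAME" or "missing: NAME" line per command and returns 0
# only if every command was found.
check_commands() {
    local missing=0
    local cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "ok: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

With `spider_env` activated, a call such as `check_commands python pip scrapy scrapyd git wget` lists anything that did not install. The helper only checks that a command exists, not that it runs correctly or has the expected version.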