Installation instructions for doing data science in a Python environment on Ubuntu. We'll install base packages like numpy, scipy, scikit-learn and pandas. We also install the IPython Notebook interactive environment. This is a best practice recommendation for doing research-type work. We make use of virtualenvwrapper, but don't show how to inst…
mkvirtualenv datascience
sudo apt-get install python-scipy libblas-dev liblapack-dev gfortran
sudo apt-get install libffi-dev # for cryptography from scrapy
sudo apt-get install libxslt-dev # for libxml from scrapy
export BLAS=/usr/lib/libblas.so
export LAPACK=/usr/lib/liblapack.so
pip install numpy
pip install scipy
pip install scikit-learn
pip install pandas
pip install patsy
pip install statsmodels
pip install ipython tornado pyzmq
pip install networkx
pip install gensim
pip install scrapy
pip install numexpr bottleneck
pip install sqlalchemy
pip install nltk
pip install seaborn
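
Once the installs finish, a quick import check can confirm that each package is usable from inside the virtualenv. The sketch below is not part of the original gist: `check_imports` is a hypothetical helper name, and it assumes a `python3` or `python` executable is on the PATH. Note that a few packages import under a different name than they install under (`scikit-learn` as `sklearn`, `pyzmq` as `zmq`, `ipython` as `IPython`).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original gist): report which Python
# modules can be imported by the interpreter currently on PATH.
check_imports() {
    local py mod status=0
    # Prefer python3, fall back to python (assumption: one of them exists).
    py=$(command -v python3 || command -v python) || {
        echo "no python interpreter found" >&2
        return 2
    }
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok      $mod"
        else
            echo "MISSING $mod"
            status=1
        fi
    done
    # Non-zero exit status if at least one module failed to import.
    return "$status"
}
```

With the `datascience` virtualenv active, a call might look like `check_imports numpy scipy sklearn pandas patsy statsmodels IPython tornado zmq networkx gensim scrapy numexpr bottleneck sqlalchemy nltk seaborn`.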
@startakovsky

commented Jun 28, 2015

Do the export statements stick after a reboot, or do they need to be added to a .profile? What are they for?
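
On the question above: `export` only sets a variable for the current shell session and its child processes, so the values do not survive a reboot or a new terminal unless a startup file sets them again. As far as I know, `BLAS` and `LAPACK` were read by older source builds of numpy and scipy to locate those libraries, and they have no effect when pip installs a prebuilt wheel. A minimal sketch of making them persistent, assuming bash and `~/.profile` as the target file; `persist_export` is a hypothetical helper name, not something the gist defines:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "export NAME=VALUE" to a startup file,
# but only if that exact line is not already present.
persist_export() {
    local name=$1 value=$2 file=${3:-$HOME/.profile}
    local line="export $name=$value"
    touch "$file"
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (paths are the ones from the gist; adjust if the libraries live elsewhere):
# persist_export BLAS /usr/lib/libblas.so
# persist_export LAPACK /usr/lib/liblapack.so
```

An alternative that keeps the variables scoped to this one environment is to put the two `export` lines in the virtualenv's `postactivate` hook, which virtualenvwrapper sources each time the environment is activated.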

@startakovsky

commented Jun 28, 2015

Also, it seems like numpy and scipy are installed automatically by sudo apt-get install python-scipy.

@alvations

commented Apr 4, 2016

@startakovsky, both commands install scipy, but they're somewhat different. sudo apt-get install python-scipy gives you the most stable version that the OS supports, while pip install scipy installs the latest version that scipy has published to the PyPI index.
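
To see which of the two copies an interpreter actually picks up, it can help to print the module's version and file location. A small sketch, assuming `python3` or `python` is on the PATH; `module_origin` is a hypothetical helper name. On Ubuntu, a path under `/usr/lib/python*/dist-packages` usually points to the apt package, while a path inside the virtualenv points to the pip install.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<module> <version> <file>" for an importable module.
module_origin() {
    local py
    py=$(command -v python3 || command -v python) || return 2
    # The Python program is read from stdin; the module name arrives as sys.argv[1].
    "$py" - "$1" <<'EOF'
import importlib, sys
name = sys.argv[1]
mod = importlib.import_module(name)
# Not every module defines __version__, so fall back to "unknown".
print(name, getattr(mod, "__version__", "unknown"), getattr(mod, "__file__", "built-in"))
EOF
}
```

Running `module_origin scipy` inside and outside the virtualenv should show the difference. A virtualenv created with default settings normally does not see system site-packages, so inside `datascience` only the pip-installed copy is expected to be importable.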

@MachineTruth

commented Jul 12, 2017

Thank you
