Skip to content

Instantly share code, notes, and snippets.

@walterreade
Last active October 9, 2022 06:41
Show Gist options
  • Save walterreade/605e97ded7d0f81632c2 to your computer and use it in GitHub Desktop.
Save walterreade/605e97ded7d0f81632c2 to your computer and use it in GitHub Desktop.
Setting up standard virtualenv
### for brand-new only
sudo apt-get update
sudo apt-get install htop
sudo apt-get install build-essential
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
rm Miniconda3-latest-Linux-x86_64.sh
source .bashrc
conda update conda
conda update --all
conda create --name <projname>
# create /mnt/ssd data folder
sudo mkdir /mnt/ssd/kaggle-PROJNAME
sudo chown inversion /mnt/ssd/kaggle-PROJNAME
# create GS data bucket
gsutil mb -c regional -l us-central1 gs://kaggle-PROJNAME
# create the github repo
cd KaggleCompetitions
git pull
mkdir PROJNAME
git add PROJNAME
# create alias
vim ~/.bashrc
G, yy, p, <edit>, <save>
source .bashrc
source activate <projname>
conda install cython scikit-learn pandas seaborn jupyter joblib patsy statsmodels h5py scikit-image
conda install -c conda-forge jupyter_contrib_nbextensions
pip install image_match --no-deps
conda install -c conda-forge tqdm
# Elastic Search and Dependencies (for python)
conda install -c conda-forge elasticsearch
# To install Elastic Search driver for the first time . . .
java -version
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
java -version
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.tar.gz
tar -xvf elasticsearch-5.5.1.tar.gz
./elasticsearch-5.5.1/bin/elasticsearch
## Xgboost
git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
git pull --recurse-submodules
git submodule update --recursive
make -j4
cd python-package
python setup.py install
# Facets visualization ~~~~~
# git clone https://github.com/PAIR-code/facets
cd facets
git pull
jupyter nbextension install facets/facets-dist/ --user
conda install -c anaconda protobuf
cp ~/facets/facets_template.ipynb ~/KaggleCompetitions/PROJNAME
# for large amounts of data, start notebook with (actually, use the jn alias!):
jupyter notebook --NotebookApp.iopub_data_rate_limit=10000000
# put in notebook for wide layout
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
# ~~~~~
# How to do a sparse clone
git init KaggleCompetitions
cd KaggleCompetitions
git remote add origin https://github.com/Kaggle/KaggleCompetitions
git config core.sparsecheckout true
echo "HappyWhale/*" >> .git/info/sparse-checkout
git pull --depth=1 origin master
jupyter notebook --generate-config
jupyter notebook password
copy from json to jupyter_notebook_config.py
#jupass=`python -c "from notebook.auth import passwd; print(passwd())"`
#echo "c.NotebookApp.password = u'"$jupass"'" >> $HOME/.jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False" >> $HOME/.jupyter/jupyter_notebook_config.py
https://www.digitalocean.com/community/tutorials/how-to-partition-and-format-storage-devices-in-linux
sudo mkdir -p /mnt/ssd
sudo mount -o defaults /dev/sdb1 /mnt/ssd
sudo vi /etc/fstab
LABEL=ssd /mnt/ssd ext4 defaults 0 2
jupyter notebook
subl .bashrc
alias projname='source activate projname; cd /media/walter/Data/kaggle_projname/code'
conda update conda
conda update --all
conda create --name projname python=3.5
source activate projname
# source deactivate
conda install cython scikit-learn pandas seaborn jupyter joblib patsy statsmodels h5py scikit-image
conda install -c conda-forge jupyter_contrib_nbextensions
conda install sypder
# http://localhost:8888/nbextensions
#conda install -c menpo opencv3=3.1.0
#conda install pillow
# conda install -c sebp scikit-survival
# pip install lifelines
pip install natsort
pip install feather-format
sudo apt-get update
sudo apt-get install build-essential
## XGBoost ##
#git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
git pull --recurse-submodules
git submodule update --recursive
make -j4
cd python-package
python setup.py install
## Theano ##
# pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
cd Theano
git pull
python setup.py install
pip install git+git://github.com/danielhomola/boruta_py@master --no-deps
pip install git+git://github.com/fchollet/keras@master --no-deps
pip install git+git://github.com/pymc-devs/pymc3@master --no-deps
pip install git+git://github.com/mcmcplotlib/mcmcplotlib.git@master --no-deps
pip install git+git://github.com/deap/deap --no-deps
#conda install -c r r-essentials
cd to project folder
pip freeze > requirements.txt
# ~~~~~~~~~~~ SLOTH ~~~~~~~~~~
To install sloth, you'll need pyqt4. If you're using anaconda, you can install using:
conda install -c anaconda pyqt=4.11.4
conda install scikit-image
To run:
sloth --config fish_config.py annotations_train.json
Generating the .json file:
find ../input/train/ -iname "*.jpg" | sort | xargs sloth appendfiles annotations_train.json
find ../input/test_stg1/ -iname "*.jpg" | sort | xargs sloth appendfiles annotations_test_stg1.json
in windows:
python c:\Anaconda3\envs\sloth\Lib\site-packages\sloth\bin\sloth example1_labels.json
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment