@caleb-kaiser
Last active October 25, 2019 15:42
Initialize your environment
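# Start a Git repository for the project.
git init
# Download and unpack the tutorial archive, then delete the zip.
dvc get \
https://github.com/iterative/dataset-registry \
tutorial/nlp/pipeline.zip
unzip pipeline.zip
rm pipeline.zip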
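# Create and activate a Python 3 virtual environment, then install dependencies.
virtualenv -p python3 .env
source .env/bin/activate
pip3 install -r code/requirements.txt
# Keep the virtual environment out of version control.
echo -e "\n.env/" >> .gitignore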
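# Initialize DVC inside the Git repository.
dvc init
# Fetch the dataset archive and put it under DVC control.
mkdir data
dvc get \
https://github.com/iterative/dataset-registry \
tutorial/nlp/Posts.xml.zip \
-o data/Posts.xml.zip
dvc add data/Posts.xml.zip
# Commit the DVC metadata and project files to Git.
git add .
git commit -m 'Add dataset archive'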
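An optional sanity check after the steps above, as a minimal sketch that is not part of the original gist. The function name `check_setup` is hypothetical, and the paths it looks for are assumptions about what the commands leave behind: `.git` from `git init`, `.dvc` from `dvc init`, and `data/Posts.xml.zip.dvc` from `dvc add`.

```shell
# Hypothetical helper (not from the original gist): report any expected
# path that is missing and return non-zero if at least one is absent.
check_setup() {
  missing=0
  for path in .git .dvc data/Posts.xml.zip.dvc; do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}
```

Run `check_setup` from the project root; it prints nothing and returns 0 when all three paths exist.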