
@atgmello
Last active March 22, 2020 14:41
A simple document on how to organize your data for the DS4A LATAM 2020 course.

Organizing and automating the boring stuff

In this document I'll briefly go over the structure I've set up for organizing and automating the unzipping of each week's cases and installing all the required conda environments.

First off, create the main folder. I called it ds4a.

mkdir ds4a

Enter the newly created directory and then create the following three folders:

cd ds4a
mkdir docs
mkdir cases
mkdir zips

The names should be self-explanatory. We'll keep documents (.odt, .pdf, etc.) in the docs folder. The zips directory will hold each week's zip files, and cases will hold the folders extracted from them.

To keep things even tidier, let's create one folder per week inside both zips and cases. We're on Week 1 for now, so let's create just that folder:

mkdir cases/week1
mkdir zips/week1

So far, our structure looks like this:

ds4a/
├── docs
├── cases
│   └── week1
└── zips
    └── week1
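
The same layout can also be created in one go. This is only a convenience sketch equivalent to the mkdir commands above; the week variable is an illustrative name, not something the course material defines.

```shell
# Create the whole ds4a layout in a single command.
# -p creates missing parent directories and does not fail if a folder already exists.
week=week1   # illustrative variable; change it to week2, week3, ... later on
mkdir -p ds4a/docs "ds4a/cases/${week}" "ds4a/zips/${week}"
```

Re-running it with a different week value adds that week's folders without touching the existing ones.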

Organizing is pretty much done. Now let's hop into the automation part.

First, download this week's zips and place all of them inside the zips/week1 folder.

Next, grab the unzip_all.sh script here and place it directly inside the zips folder. Make sure it's executable.

chmod +x unzip_all.sh

Now, to unzip all of the zip files at once and put them in the right place, simply run:

./unzip_all.sh week1

This will do exactly that: unzip every zip file in zips/week1 and place the extracted folders in cases/week1.
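
To give an idea of what a helper like this might do, here is a minimal sketch. It is not the linked unzip_all.sh; the function name unzip_week is hypothetical, and the sketch assumes it runs from inside the zips folder, that cases is a sibling folder, and that each archive already contains its own top-level case folder.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an unzip_all.sh-style helper; NOT the author's script.
# Assumption: run from inside zips/, with cases/ as a sibling directory.

unzip_week() {
    local week="$1"
    local src="./${week}"           # e.g. zips/week1
    local dest="../cases/${week}"   # e.g. cases/week1
    local archive

    mkdir -p "$dest"
    for archive in "$src"/*.zip; do
        [ -e "$archive" ] || continue   # glob matched nothing: skip
        # -q: quiet, -o: overwrite without prompting, -d: extraction directory
        unzip -q -o "$archive" -d "$dest"
    done
}
```

Saved as a script, it could end with a call such as unzip_week "$1" so that the week name comes from the command line.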

Next, we'll want to automate the environment setup. Grab the next script here, make sure it's executable, and place it inside the cases folder. Now, to install all the environments for the first week's cases, run:

./set_all_env.sh week1
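
As a rough idea of what such a script could look like, here is a minimal sketch. It is not the linked set_all_env.sh; the function name create_week_envs is hypothetical, and the sketch assumes it runs from inside the cases folder and that each case folder ships an osx_env.yml environment file.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a set_all_env.sh-style helper; NOT the author's script.
# Assumption: run from inside cases/, and each case folder has an osx_env.yml.

create_week_envs() {
    local week="$1"
    local case_dir env_file

    for case_dir in "./${week}"/*/; do
        env_file="${case_dir}osx_env.yml"
        [ -f "$env_file" ] || continue   # skip folders without an environment file
        # "conda env create -f FILE" builds the environment described in FILE.
        conda env create -f "$env_file" || echo "failed: ${env_file}" >&2
    done
}
```

A failure in one case is reported and the loop moves on, so one broken environment file does not block the rest of the week.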

After a while, if everything goes well, you should be good to go! In the end, your folder structure should look somewhat like this:

ds4a/
├── docs
│   ├── environment-setup.pdf
│   └── Participant+Welcome+Packet+-+DS4A+Latam.pdf
├── cases
│   ├── set_all_env.sh
│   └── week1
│       ├── case_1.1_student
│       ├── case_1.2_student
│       ├── case_1.4_student
│       ├── case_20.1_student
│       ├── case_2.1_student
│       └── case_3.2_student
└── zips
    ├── unzip_all.sh
    └── week1
        ├── case_1.1_student.zip
        ├── case_1.2_student.zip
        ├── case_1.4_student.zip
        ├── case_20.1_student.zip
        ├── case_2.1_student.zip
        └── case_3.2_student.zip

And we're done! Keeping things clean and organized should be a breeze now.


Now, building on what @gabfr has suggested, let's take the automation to the next level.

Grab the script here and place it directly in the ds4a folder. Don't forget to make it executable with chmod +x. Now, say we want to work on Case 1.4. You can simply run:

./launch_case.sh ./cases/week1/case_1.4_student

The script will:

  • Search for the correct environment name for this case
  • Activate it
  • Launch the Jupyter Notebook on the specified folder
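
The three steps above can be sketched roughly as follows. This is not the linked launch_case.sh; the function names are hypothetical, and the sketch assumes the environment name is the top-level name: key of the case's osx_env.yml and that conda's shell hook lives at etc/profile.d/conda.sh under the conda base directory.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a launch_case.sh-style helper; NOT the author's script.

# Print the value of the top-level "name:" key of a conda environment file.
env_name_from_file() {
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

launch_case() {
    local case_dir="$1"
    local env_name
    env_name="$(env_name_from_file "${case_dir}/osx_env.yml")"
    if [ -z "$env_name" ]; then
        echo "no environment name found in ${case_dir}" >&2
        return 1
    fi

    # Run in a subshell so the activation never leaks into the calling shell.
    (
        # "conda activate" needs conda's shell hook in non-interactive shells.
        . "$(conda info --base)/etc/profile.d/conda.sh"
        conda activate "$env_name"
        jupyter notebook --notebook-dir="$case_dir"
    )
}
```

Because activation happens inside a subshell, closing the notebook server returns you to a shell whose environment is unchanged.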

When you're done, just terminate the Jupyter Notebook as you normally would. You'll be back in the ds4a folder, and there's no need to deactivate the environment either. :)


I quickly learned that setting up is not the only thing that can (and should!) be automated. Cleaning up is just as easy to automate. After thoroughly exploring a given week's cases, you can grab this script to help you with that. It automatically removes all environments that were set up for the given week. All you have to do is place it inside the cases folder and run it as follows:

./remove_all_env.sh week1

Note that you still need to have the respective week's folder and osx_env.yml file for this to work.
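
That requirement makes sense if the script looks up each environment's name in its osx_env.yml before removing it. Here is a minimal sketch under that assumption. It is not the linked remove_all_env.sh, and the function name remove_week_envs is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a remove_all_env.sh-style helper; NOT the author's script.
# Assumption: run from inside cases/, and each environment is named by the
# top-level "name:" key of the case's osx_env.yml.

remove_week_envs() {
    local week="$1"
    local env_file env_name

    for env_file in "./${week}"/*/osx_env.yml; do
        [ -f "$env_file" ] || continue   # glob matched nothing: skip
        env_name="$(sed -n 's/^name:[[:space:]]*//p' "$env_file" | head -n 1)"
        [ -n "$env_name" ] || continue   # no name key: nothing to remove
        conda env remove -n "$env_name"
    done
}
```

Deleting a week's case folders first would leave this approach with no names to look up, so run the cleanup before removing the folders.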


atgmello commented Mar 10, 2020

For completeness, I'll add some scripts developed by @gabfr (with minor changes to accommodate my folder structure).
His gists:
https://gist.github.com/gabfr/a592c78e0e5d7aba7117ce229b916fa4#the-opensh
