@torbjornvatn
Last active January 10, 2018 08:18
Google DataLab Github Remote

Why?

We here at Unacast are using Google Cloud Datalab quite a bit for data analysis and exploration and think it's a great product.

The version control experience, however, is clunky, to say the least. You either have to use the bundled Ungit web interface, or ssh and docker exec your way into the running Docker container to use the git CLI. Either way, you have to work against a Google Cloud Source Repository as the remote, while we really want to use GitHub's .ipynb preview functionality and Pull Request mechanism. Source Repositories do have a GitHub sync feature, but it only works one way, and you have to set it up when you create the repository (which you can't do for Datalab repos).

So this is an attempt at setting up a few git tricks that make this workflow a bit smoother, with GitHub as the remote for the project.

How?

First of all it's a bit of a pain to docker exec into the container from the Compute Engine instance every time you want to use the git CLI. So I added this alias to my .bashrc file to go straight to the notebooks directory.

alias dl="docker exec -it datalab bash -c 'cd /content/datalab/notebooks && bash'"

Then run dl to open the notebooks folder.
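
If you only want to run a single git command, opening an interactive shell can be skipped. The function below is a minimal sketch, not part of the original setup: dlgit and the DOCKER_BIN override are hypothetical names, and it assumes the container is called datalab (as in the alias above) and that the git version inside it supports the -C option.

```shell
# Hypothetical helper: run one git command inside the Datalab container
# without opening an interactive shell first.
# DOCKER_BIN is an assumed override hook so the function can be dry-run,
# e.g. DOCKER_BIN=echo dlgit status
dlgit() {
  "${DOCKER_BIN:-docker}" exec -i datalab \
    git -C /content/datalab/notebooks "$@"
}
```

With this in .bashrc next to the alias, dlgit status or dlgit log --oneline would run against the notebooks repo directly.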

Now it's time to download and run the datalab-github-remote.sh script below inside the notebooks folder. It will ask for your GitHub username, a personal access token and the repo you'd like to set up as a remote. The username is what comes after github.com/ in the URL of your GitHub profile. A Personal Access Token can be created in your GitHub account settings (under Developer settings), and it needs the repo scope.

Use this command to run the setup script inside the docker container:

bash -c "$(wget -O - https://git.io/vNLh3)"

NB! Always read through scripts that you're asked to execute this way to check that they don't do anything malicious.
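
One way to follow that advice is to download the script to a file first and only run it after reading it. The helper below is a sketch under that assumption; review_and_run is a hypothetical name and not part of the gist.

```shell
# Hypothetical helper: print a downloaded script for review and run it
# only after an explicit "y" answer.
review_and_run() {
  local script_file="$1" answer
  cat "$script_file"
  printf '\nRun this script? [y/N] '
  read -r answer
  if [ "$answer" = "y" ]; then
    bash "$script_file"
  else
    echo "Aborted."
    return 1
  fi
}
```

Inside the container this could be used as wget -O datalab-github-remote.sh https://git.io/vNLh3 followed by review_and_run datalab-github-remote.sh.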

This adds a few new entries to the project's git config that make GitHub the only remote, removing the Source Repository remote that was automatically set up when the Datalab instance was created.
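
Note that the script stores the token as part of the remote URL, so it sits in plain text in the git config and shows up whenever the remote is listed. The filter below is a small sketch for hiding it when printing remotes; mask_token is a hypothetical name and the username and token in the usage note are made up.

```shell
# Hypothetical filter: replace the password/token part of any
# https://user:token@host URL read from stdin with ***.
mask_token() {
  sed -E 's#(https?://[^:/@]+):[^@]+@#\1:***@#'
}
```

For example, git remote -v | mask_token would print a URL such as https://alice:s3cret@github.com/alice/notebooks as https://alice:***@github.com/alice/notebooks.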

You should now be able to use the configured GitHub repo as your remote both from the Ungit web interface and from the git CLI inside the Docker container. 🎉
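
To double-check the result from the git CLI, a small test like the following can help. It is only a sketch: only_github_remote is a hypothetical name, and it assumes the remote is called github, which is the name the script uses.

```shell
# Hypothetical check: succeeds only when `github` is the sole remote
# of the repository in the current directory.
only_github_remote() {
  [ "$(git remote)" = "github" ]
}
```

Run inside the notebooks folder, only_github_remote && echo OK should print OK after the setup, and git rev-parse --abbrev-ref 'master@{upstream}' should report github/master.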

Whom?

@torbjornvatn

#!/usr/bin/env bash
set -e
reset="\033[0m"
blue="\033[1;36m"
cat << EOM
__ __
____ _ _ _ / / \ \ _____ _ _ _____ _
| \ ___| |_ ___| |___| |_ / / ___ \ \ | __|_| |_| | |_ _| |_
| | | .'| _| .'| | .'| . | < < |___| > > | | | | _| | | | . |
|____/|__,|_| |__,|_|__,|___| \ \ / / |_____|_|_| |__|__|___|___|
\_\ /_/
EOM
echo ""
echo "Hi there! Now we're going to set up a GitHub remote for your DataLab code."
echo "But for that I'll need your Github Username, a Token and the name of the Github Repo that we're syncing with."
echo "---"
echo "Github user (your username):"
read -r gh_user
echo "Github token (Personal Access Token):"
# -s keeps the token from being echoed to the terminal while you type
read -rs gh_token
echo ""
echo "Github repo (username/reponame):"
read -r gh_repo
echo ""
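echo "First of all we're getting rid of the Source Repository settings with this command"
remove_remote_cmd="git remote remove origin"
echo -e "$blue${remove_remote_cmd}$reset"
# Only remove origin if it exists, so the script can be re-run under `set -e`
if git remote | grep -qx "origin"; then
${remove_remote_cmd}
fi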
# Add the remote only if one named github doesn't already exist
if ! git remote | grep -qx "github"; then
echo ""
echo "I will now set the Github as git remote with this command:"
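gh_remote_url="https://$gh_user:$gh_token@github.com/$gh_repo"
# Print the command with the token masked so it doesn't end up in the terminal scrollback
echo -e "${blue}git remote add github https://$gh_user:***@github.com/$gh_repo${reset}"
git remote add github "$gh_remote_url"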
fi
echo ""
echo "Lastly I'll set master to track the new github/master with this command:"
gh_upstream_cmd="git branch --set-upstream-to github/master"
echo -e "$blue${gh_upstream_cmd}$reset"
# Fetch from the new remote first so that github/master exists locally
git fetch github
${gh_upstream_cmd}
# Use the git 2.0 default push behavior (simple)
git config --global push.default simple
echo ""
echo "Your new remote is set up like this:"
git remote show github
echo ""
echo "That's it! Enjoy your Datalabing"