shell.nix for Training in GCP AI Platform
This gist contains a `shell.nix` file that can be used to create a Python environment for running training jobs in the GCP AI Platform.
This is specifically for the following tutorial:
This uses code from https://github.com/GoogleCloudPlatform/cloudml-samples. The `shell.nix` file should be easy to modify to work for almost any training job.
How To Use
Run `nix-shell` to get into the shell.
Then, install all required Python packages:
$ pip install -r requirements.txt
If you add additional Python packages to the `buildInputs` line in the `shell.nix` file, you should be able to use those system-level packages instead of having to download them with pip.
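For reference, a minimal `shell.nix` along these lines might look like the sketch below. The exact attribute names (e.g. `google-cloud-sdk`, the specific Python packages) are assumptions and may differ from what your pinned nixpkgs channel provides:

```nix
# Hypothetical minimal shell.nix; package names are assumptions and may need
# adjusting to match your nixpkgs channel.
{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  buildInputs = [
    pkgs.google-cloud-sdk              # provides gcloud and gsutil
    (pkgs.python3.withPackages (ps: [
      ps.pip                           # for anything not packaged in nixpkgs
      ps.numpy                         # add system-level Python packages here
      ps.pandas
    ]))
  ];
}
```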
Now you should be able to actually run training:
$ python3 -m trainer.task --job-dir local-training-output
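For orientation, the entry point that `python3 -m trainer.task` invokes looks roughly like the sketch below. Only the `--job-dir` flag is taken from the command above; the function names and the training body are illustrative placeholders, not the tutorial's actual code:

```python
# trainer/task.py -- hypothetical sketch of the module run by
# `python3 -m trainer.task --job-dir local-training-output`.
import argparse


def parse_args(argv=None):
    """Parse CLI flags; --job-dir is the only flag used in the local run."""
    parser = argparse.ArgumentParser(prog="trainer.task")
    parser.add_argument(
        "--job-dir",
        required=True,
        help="local path or gs:// URI where checkpoints/exports are written",
    )
    return parser.parse_args(argv)


def run(job_dir):
    # Placeholder for the real training loop: the tutorial builds a Keras
    # model, trains it, and saves the result under job_dir.
    return f"outputs written to {job_dir}"
```

When submitted to AI Platform, the service passes `--job-dir` (your `gs://` bucket path) to this same entry point.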
The tutorial linked above recommends using the following command; however, this fails when using `gcloud` from Nixpkgs :-\
$ gcloud ai-platform local train --package-path trainer --module-name trainer.task --job-dir local-training-output
As long as training directly with `python` works, you can launch a training job on the GCP AI Platform.
First, you need to log in with `gcloud` using OAuth:
$ gcloud auth login
Set your default project name so you don't have to specify it in each command below:
$ gcloud config set project inner-melody-274800
Next, you need to create a bucket to store the trained models:
$ BUCKET_NAME="my-training-example-task-3"
$ REGION="us-central1"
$ gsutil mb -l $REGION gs://$BUCKET_NAME
Finally, actually launch the training job:
$ gcloud ai-platform jobs submit training "my_first_keras_job" --package-path trainer/ --module-name trainer.task --region $REGION --python-version 3.7 --runtime-version 1.15 --job-dir "gs://$BUCKET_NAME/keras-job-dir" --stream-logs