@rimijoker
Last active November 17, 2020 04:03
Google Summer of Code 2020(OpenMined) Final Work Product

Google Summer of Code 2020 Final Work Product

Student: Anshuman Singh
Github: @rimijoker
Organisation: OpenMined
Project: Implement Auto-Scaling of PyGrid servers on Google Cloud

Abstract

PySyft's audience consists mainly of people who want to train their models on private data that resides on other devices or locations. Right now, one has to manually spin up Google Cloud machines, load a PyGrid instance, queue and run training jobs, deposit the results on another long-running instance (Master), and tear down the created instances (Workers). This project aims to automate that process. Another primary feature is running a "hyperparameter sweep" to figure out the best parameters for the model. Basic functionalities include:

  • Provision machines on the cloud (using Terraform)
  • Install and start PyGrid servers (using Docker)
  • Run training scripts (using PySyft, PyTorch)
  • Add functionality to effectively run a hyperparameter sweep
  • Tear down the worker instances once the training is done

The hyperparameter sweep also involves setting up a useful cross-validation framework. In other words, one should be able to easily create any number of PyGrid servers, train a model on them, select the best hyperparameters, send the models to the master, and take their average. Afterwards, the worker PyGrid nodes should be torn down or kept running (as specified) automatically.
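The "take the average" step can be illustrated with a minimal sketch. This is not the project's implementation: it assumes each worker's model is a plain dict of parameter name to a flat list of floats (a hypothetical stand-in for a PyTorch `state_dict`), and it computes an unweighted element-wise mean on the master.

```python
from typing import Dict, List

# Hypothetical representation: each worker's model is a dict mapping a
# parameter name to a flat list of floats. Real PyTorch models would use
# tensors from model.state_dict() instead.
Params = Dict[str, List[float]]


def average_params(worker_params: List[Params]) -> Params:
    """Return the element-wise mean of the parameters from all workers."""
    if not worker_params:
        raise ValueError("need at least one set of parameters")
    n = len(worker_params)
    averaged: Params = {}
    for name in worker_params[0]:
        # zip(*...) groups the i-th value of this parameter across workers.
        columns = zip(*(params[name] for params in worker_params))
        averaged[name] = [sum(values) / n for values in columns]
    return averaged


workers = [
    {"weight": [1.0, 2.0], "bias": [0.0]},
    {"weight": [3.0, 4.0], "bias": [1.0]},
]
master = average_params(workers)
print(master)  # {'weight': [2.0, 3.0], 'bias': [0.5]}
```

A weighted mean (for example, by the number of samples each worker trained on) would be a natural variation, but the text does not say which form is used.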

Pull-Requests (Opened & Merged):

Documentation:

Usage

Set Up Budget Alerts (important)

Before you start spinning up instances, we encourage you to set a budget alert on GCP to avoid surprise costs.

Setup Budget and Budget Alerts

Spin up instances using the following commands:

You can find sample code in test.py and test.ipynb

  • Import the module and the enums from gcloud_configurations.py:

```python
# The gcloud module path is assumed to sit next to utils; see test.py.
import syft.grid.autoscale.gcloud as gcloud
import syft.grid.autoscale.utils.gcloud_configurations as configs
```

  • Initialize using:

```python
instance_name = gcloud.GoogleCloud(
    credentials="GCP Login/terraf.json",
    project_id="terraform",
    region=configs.Region.us_central1,
)
```

  • Reserve IP address using:

```python
instance_name.reserve_ip("grid")
```

  • Create Instances using:

```python
instance_name.compute_instance(
    name="new-12345",
    machine_type=configs.MachineType.f1_micro,
    zone=configs.Zone.us_central1_a,
    image_family=configs.ImageFamily.ubuntu_2004_lts,
)
```

  • Create PyGrid Network instance using:

```python
instance_name.create_gridnetwork(
    name="new-network",
    machine_type=configs.MachineType.f1_micro,
    zone=configs.Zone.us_central1_a,
)
```

  • Create PyGrid Node instance using:

```python
instance_name.create_gridnode(
    name="new-node",
    machine_type=configs.MachineType.f1_micro,
    zone=configs.Zone.us_central1_a,
    gridnetwork_name="new-network",
)
```

  • Create Clusters using:

```python
c1 = instance_name.create_cluster(
    name="my-cluster1",
    machine_type=configs.MachineType.f1_micro,
    zone=configs.Zone.us_central1_a,
    reserve_ip_name="grid",
    target_size=3,
    eviction_policy="delete",
)
```

  • Run a parameter sweep to figure out the best parameters using:

```python
# model and hook must be defined beforehand (see test.py).
c1.sweep(
    model,
    hook,
    model_id="new_model",
    mpc=False,
    allow_download=False,
    allow_remote_inference=False,
    apply=True,
)
```

  • Destroy all the created resources using:

```python
instance_name.destroy()
```

You can find more details on this here
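Since forgotten instances keep costing money, it may help to make the final `destroy()` call unconditional. The sketch below is one possible pattern, not part of the project: `torn_down_afterwards` is a hypothetical helper, and `FakeCloud` is a stand-in used only so the example runs without a GCP account. It assumes nothing about the real object beyond the `destroy()` method shown above.

```python
from contextlib import contextmanager


@contextmanager
def torn_down_afterwards(cloud):
    """Yield `cloud`, then call cloud.destroy() even if the body raises."""
    try:
        yield cloud
    finally:
        cloud.destroy()


class FakeCloud:
    """Hypothetical stand-in for the GoogleCloud object; provisions nothing."""

    def __init__(self):
        self.destroyed = False

    def destroy(self):
        self.destroyed = True


fake = FakeCloud()
try:
    with torn_down_afterwards(fake) as cloud:
        # Simulate a failure in the middle of a training run.
        raise RuntimeError("training script failed")
except RuntimeError:
    pass

print(fake.destroyed)  # True
```

With the real object, the body of the `with` block would hold the `create_cluster` and `sweep` calls, so the resources are destroyed whether or not training succeeds.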

Work left

  • Test out various libraries and add the functionality of hyperparameter search to pick the best model.
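As a rough illustration of what such a hyperparameter search could look like, here is a minimal grid-search sketch in plain Python. It is not the project's code: the search space, `sweep_grid`, `pick_best`, and `toy_score` are all hypothetical, and in practice the scoring function would train the model on a worker and return a validation metric.

```python
from itertools import product


def sweep_grid(grid):
    """Yield one dict per combination of the values in `grid`."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def pick_best(grid, score):
    """Return the configuration with the highest score, and that score."""
    best = max(sweep_grid(grid), key=score)
    return best, score(best)


# Hypothetical search space and scoring function, for illustration only.
grid = {"lr": [0.001, 0.01, 0.1], "batch_size": [32, 64]}


def toy_score(config):
    # Peaks at lr=0.01 and batch_size=64; a real score would be a
    # validation metric obtained by training with this configuration.
    return -abs(config["lr"] - 0.01) - abs(config["batch_size"] - 64) / 1000


best_config, best_score = pick_best(grid, toy_score)
print(best_config)  # {'lr': 0.01, 'batch_size': 64}
```

Because each configuration is independent, the combinations yielded by `sweep_grid` could be distributed across the worker instances of a cluster and scored in parallel.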

Future Work

  • Add support for other cloud platforms and contribute more to PyGrid and PySyft.

Other

  • Answered questions from students interested in doing GSoC next year on my Twitter handle here.

Feel free to reach out to me if you have any questions about my projects or GSoC. You can find me on Twitter @rimijoker
