Armand McQueen armandmcqueen

## notebook_persistence.md

      
Last active April 27, 2021 00:16

Bind Mounts in Notebook Persistence

We want to store the state of a notebook in NFS so that if the container dies, all of the vital state is persistent and minimal work is lost.
There have been a couple of different proposals for how EFS should be laid out and how bind mounts should work. It's gotten confusing for me, so I'm summarizing them here to see if I understand them correctly.
Note on terminology: "Notebook" with a capital 'N' refers to the Determined concept of a Notebook, which is an instance of JupyterLab. A "notebook" with a lowercase 'n' refers to a Jupyter notebook (a .ipynb file).
In all cases, there will be a directory on EFS called /shared-data that will be bind-mounted into the JupyterLab container at /shared-data. It can be used to share datasets (or anything else) between every Notebook that runs in Determined.
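
As a rough illustration of the /shared-data mount described above, the sketch below builds a config fragment with a single bind mount. The host path `/mnt/efs` is a hypothetical location for the EFS filesystem on the agent, and the `bind_mounts`, `host_path`, `container_path`, and `read_only` field names are assumed to match Determined's bind-mount configuration; check both against the actual deployment before relying on them.

```python
import json

# Hypothetical host path where the EFS filesystem is mounted on each agent.
EFS_HOST_ROOT = "/mnt/efs"


def shared_data_bind_mount(efs_host_root: str = EFS_HOST_ROOT) -> dict:
    """Return a bind-mount entry exposing <efs>/shared-data at /shared-data."""
    return {
        "host_path": f"{efs_host_root}/shared-data",
        "container_path": "/shared-data",
        "read_only": False,  # Notebooks are expected to write shared datasets here
    }


def notebook_config(efs_host_root: str = EFS_HOST_ROOT) -> dict:
    """Build a minimal config fragment containing only the shared mount."""
    return {"bind_mounts": [shared_data_bind_mount(efs_host_root)]}


if __name__ == "__main__":
    # Print the fragment so it can be compared with a real Notebook config.
    print(json.dumps(notebook_config(), indent=2))
```

Any per-Notebook mounts from the different proposals would be added as further entries in the same list.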

  
## master_with_startup_script.yaml
checkpoint_storage:
  type: s3
  bucket: ******
  save_experiment_best: 0
  save_trial_best: 1
  save_trial_latest: 1

db:
  user: postgres
  password: "********"

## network_util.py
import requests
import urllib
import socket
import requests.adapters
import logging
import http.client
import argparse


## code_samples.md

      
Last active January 13, 2020 18:11

Code sample, detectron2 save model output

How to save model artifacts

export QUILT_HASH=3722a498
export DOCKER_HASH=sha256:8a4f4123c92a7fe2e8ca4c404094ab95dc1fb868ad077d2e084ba4082a5a29c1
export GIT_HASH=0a7a9d10

cd /detectron2/output
python

  
## Code samples for pytorch blog.md

      
Created November 15, 2019 20:36

Code Samples

Acquire a benchmarking dataset

COCO 2017 is an image benchmarking dataset and GLUE is a collection of natural language processing datasets.
$ quilt install quilt-ml-data/glue --to /datasets/glue
Downloading......
The package "quilt-ml-data/glue" was successfully downloaded to /datasets/glue

  
## s3_debugger.py
import boto3
import json
import os


"""
1. Check current credentials and policies
2. Check if we can write, get, delete a test object in this bucket
3. Check for Bucket/object restrictions
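
Step 2 of the list above could be sketched roughly as follows. This is not the gist's own implementation, which is cut off in this preview. The function name `check_object_roundtrip` and the default test key are hypothetical; the client is passed in so that any object with `put_object`, `get_object`, and `delete_object` methods (such as a boto3 S3 client) can be used.

```python
def check_object_roundtrip(s3_client, bucket, key="s3-debugger-test-object"):
    """Try to write, read back, and delete a small test object.

    Returns a dict mapping each step name to "ok" or a failure message.
    """
    payload = b"s3 debugger test"
    results = {}

    def attempt(step, action):
        try:
            action()
            results[step] = "ok"
        except Exception as exc:  # broad on purpose: this is a diagnostic tool
            results[step] = f"failed: {exc}"

    def read_back():
        body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        if body != payload:
            raise ValueError("object content did not match what was written")

    attempt("write", lambda: s3_client.put_object(Bucket=bucket, Key=key, Body=payload))
    attempt("get", read_back)
    attempt("delete", lambda: s3_client.delete_object(Bucket=bucket, Key=key))
    return results


# Example (requires boto3 and valid credentials; the bucket name is a placeholder):
#   import boto3
#   print(check_object_roundtrip(boto3.client("s3"), "my-test-bucket"))
```

Each step is attempted even if an earlier one fails, so the result shows which of the three permissions is missing.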

## code_samples.md

      
Last active November 1, 2019 02:04

[WIP] New API usecases and associated code samples

Quilt Package Usecases

Introduction

This document outlines several key use cases and proposes code samples to satisfy them. Both the use cases and the code samples are open to discussion.
The document is about UX, not implementation. It does not go into performance requirements in detail, but the expectation is that any API we expose will be implemented to be as high-performance as possible.
API names are rough and will need to be improved.

  
## code_samples.py
# Generate a package

pkg_builder = PackageBuilder()
pkg_builder = pkg_builder.set('KEY', '/path/to/file')
pkg = pkg_builder.build() # Physical keys point to the same location they always have

# Update a package

pkg.set("logical_key", "physical_key") # This generates the hash for the object

## cfn_deploy_starting_code.py
import boto3
import time

# Manually entered params
TEMPLATE_URL = "https://quilt-marketplace.s3.amazonaws.com/releases/test/quilt3.0/c714ae8/default.yaml"
STACK_NAME = "staging-canary"
AWS_PROFILE = "staging"
REGION = "us-east-1"
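
The preview stops at the parameters, so the following is only a guess at how they might be used: create the stack from the template URL, then poll its status until it leaves an in-progress state. The function name `create_stack_and_wait`, the polling interval, the timeout, and the `Capabilities` list are all assumptions. The client is passed in so the function works with a boto3 CloudFormation client or a stand-in.

```python
import time


def create_stack_and_wait(cfn_client, stack_name, template_url,
                          poll_seconds=15, timeout_seconds=3600):
    """Create a CloudFormation stack and poll until it stops being in progress.

    Returns the final stack status string, for example "CREATE_COMPLETE".
    """
    cfn_client.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
        # Assumption: the template creates IAM resources and needs these.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    )
    deadline = time.monotonic() + timeout_seconds
    while True:
        stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
        status = stack["StackStatus"]
        if not status.endswith("_IN_PROGRESS"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"stack {stack_name} still {status} after timeout")
        time.sleep(poll_seconds)


# Example (requires boto3 and the AWS profile named above):
#   import boto3
#   session = boto3.Session(profile_name=AWS_PROFILE, region_name=REGION)
#   print(create_stack_and_wait(session.client("cloudformation"),
#                               STACK_NAME, TEMPLATE_URL))
```

A final status other than `CREATE_COMPLETE` (for example `ROLLBACK_COMPLETE`) means the deployment failed and the stack events should be inspected.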


## crash.log
2019-03-08 18:46:29.848043: I tensorflow/stream_executor/platform/default/dso_loader.cc:161] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-08 18:46:29.854419: I tensorflow/stream_executor/platform/default/dso_loader.cc:161] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-08 18:46:29.988655: I tensorflow/stream_executor/platform/default/dso_loader.cc:161] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-08 18:46:30.011434: I tensorflow/stream_executor/platform/default/dso_loader.cc:161] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-08 18:46:30.159645: I tensorflow/stream_executor/platform/default/dso_loader.cc:161] successfully opened CUDA library libcudnn.so.7 locally
2019-03-08 18:46:30.162392: I tensorflow/stream_executor/platform/default/dso_loader.cc:161] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-08 18:46:30.256675: I tensorflow/stream_executor/platform/default/dso_loader.cc:161] successfully opened CUDA