Tulio Casagrande tuliocasagrande

Running a SageMaker Docker Image Locally

Download image locally

  1. (Optional) Grab the image URI using the SageMaker Python SDK:

     In [1]: import sagemaker
     In [2]: sagemaker.image_uris.retrieve("pytorch", "us-east-1", version="1.10", image_scope="inference", instance_type="ml.p3.2xlarge")
     Out[2]: '763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.10-gpu-py38'
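Pulling an image from ECR requires authenticating Docker against the registry that hosts it, so it helps to split the URI into its parts first. The sketch below is only an illustration: `EcrImage` and `parse_ecr_image_uri` are hypothetical names, and the pattern assumes the standard `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>` form (it would not match registries under other domain suffixes).

```python
import re
from typing import NamedTuple


class EcrImage(NamedTuple):
    """Parts of an ECR image URI (hypothetical helper type)."""
    account_id: str
    region: str
    repository: str
    tag: str


# Assumed URI form: <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
_ECR_URI = re.compile(
    r"^(?P<account_id>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_image_uri(uri: str) -> EcrImage:
    """Split an ECR image URI into account, region, repository, and tag."""
    match = _ECR_URI.match(uri)
    if match is None:
        raise ValueError(f"Not a recognized ECR image URI: {uri!r}")
    return EcrImage(**match.groupdict())


image = parse_ecr_image_uri(
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.10-gpu-py38"
)
# The registry host is what Docker has to log in to before pulling.
registry = f"{image.account_id}.dkr.ecr.{image.region}.amazonaws.com"
print(registry)
```

The region and registry host recovered here are the two values needed to log Docker in to the registry before pulling the image.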

Install auto-shutdown on SageMaker Studio domain

This script installs the auto-shutdown Lifecycle Configuration on the SageMaker Domain, so that all users get the extension enabled by default.

The same installation can be done by following the blog post "Customize Amazon SageMaker Studio using Lifecycle Configurations", but I needed a "1-click" installation that I could run quickly in multiple environments.

Prerequisites

  • Bash (or other compatible Unix shell)
  • AWS CLI
import json
import os
import boto3
CLIENT = boto3.client('sagemaker')
SAGEMAKER_ROLE_ARN = os.environ['SAGEMAKER_ROLE_ARN']
class ResourcePending(Exception):
#!/bin/bash
library=pandas
artifacts_bucket=<MY_ARTIFACTS_BUCKET>
echo "Making sure we have the latest packages"
yum update -y
echo "Installing python3 and pip"
yum install python3 -y # this is currently python3.7
#!/bin/bash
repository_name=awesome-model
image_tag=latest
aws_account_id=$(aws sts get-caller-identity --query Account --output text)
# Get the region defined in the current configuration (default to us-east-1 if none defined)
region=$(aws configure get region)
region=${region:-us-east-1}
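The script preview stops after resolving the account ID and region. As a rough Python illustration of the same fallback logic, the sketch below mirrors the `${region:-us-east-1}` expansion (which falls back when the variable is unset or empty) and then combines the values into an ECR image URI. Building the URI is an assumption about what the rest of the script does; `resolve_region`, `ecr_image_uri`, and the account ID `123456789012` are hypothetical.

```python
from typing import Optional


def resolve_region(configured: Optional[str], default: str = "us-east-1") -> str:
    """Mirror ${region:-us-east-1}: use the default when the value is unset or empty."""
    return configured or default


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Assemble an ECR image URI (assumed to be the script's eventual goal)."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Hypothetical account ID; the script reads the real one from `aws sts get-caller-identity`.
uri = ecr_image_uri(
    account_id="123456789012",
    region=resolve_region(""),  # an empty value stands in for "no region configured"
    repository="awesome-model",
    tag="latest",
)
print(uri)
```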
tuliocasagrande / cognito-change-password-challenge.py
Created February 18, 2020 12:57
Responds to the new password challenge on Amazon Cognito
import boto3
def generate_password(length=16):
"""Generate a random alphanumeric password.
More recipes and best practices can be found here:
https://docs.python.org/3/library/secrets.html#recipes-and-best-practices.
Args:
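The preview above cuts off inside the docstring, so the function body is not visible. A minimal sketch of what such a function could look like, following the alphanumeric recipe in the linked `secrets` documentation, is shown below. It is not necessarily the author's implementation, and `ALPHABET` is a name introduced here for illustration.

```python
import secrets
import string

# Letters and digits only, as in the alphanumeric recipe from the secrets docs.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 16) -> str:
    """Return a random alphanumeric password of the given length."""
    # secrets.choice draws from a cryptographically secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = generate_password()
print(len(password))
```

A Cognito user pool's password policy may require character classes (symbols, for instance) that a purely alphanumeric password does not guarantee, so the alphabet may need adjusting for a given pool.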

YouTube Spam Collection v. 1

The YouTube Spam Collection v. 1 is a public set of labeled YouTube comments collected for spam research. It comprises five datasets with a total of 1,956 real, non-encoded messages, each tagged as legitimate (ham) or spam.
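As a quick way to check the class balance of one dataset, the sketch below counts ham and spam rows in a CSV file. The column names (`COMMENT_ID`, `AUTHOR`, `DATE`, `CONTENT`, `CLASS`) and the `0` = ham / `1` = spam encoding are assumptions about the file layout, and the three sample rows are made up for illustration rather than taken from the corpus.

```python
import csv
import io
from collections import Counter

# Made-up rows in the assumed layout; a real run would open one of the dataset files instead.
SAMPLE_CSV = """COMMENT_ID,AUTHOR,DATE,CONTENT,CLASS
c1,alice,2014-01-01T00:00:00,Great song!,0
c2,bob,2014-01-02T00:00:00,Check out my channel,1
c3,carol,2014-01-03T00:00:00,Love this video,0
"""

# Assumed label encoding: 0 = legitimate (ham), 1 = spam.
LABELS = {"0": "ham", "1": "spam"}


def count_classes(csv_file) -> Counter:
    """Count ham and spam rows in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(LABELS[row["CLASS"]] for row in reader)


counts = count_classes(io.StringIO(SAMPLE_CSV))
print(counts["ham"], counts["spam"])
```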

Composition

This corpus was collected using the YouTube Data API v3.

The samples were extracted from the comments section of 5 videos that were among the 10 most viewed on YouTube during the collection period. The table below lists the 5 datasets collected, the YouTube video ID, the number of samples in each class and the total number of samples per dataset.