@l1m2p3
l1m2p3 / most_frequent_words.py
Last active January 19, 2018 06:17
This script helps upload the most frequent words to DynamoDB. It requires the "put_words" function from https://gist.github.com/ShawnLMP/fe2e355d5af19e17e5a21bcf356b3d45, as well as this data set: https://www.kaggle.com/rtatman/english-word-frequency/data
import csv
import sys
from dynamo_access import put_words, get_words
def get_frequent_words(dataset, numOfWords):
    pairs = []
    with open(dataset, 'rb') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        next(reader, None)
        for row in reader:
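The preview above is cut off inside the loop, so the rest of the original function is not visible. As a rough illustration only, here is a minimal sketch of how the top-N selection could work. It assumes the CSV has a header row followed by `word,count` rows (an assumption about the data set layout), and it takes an already opened text-mode file object, since the Python 3 `csv` module expects text rather than the `'rb'` mode used above. The `num_words` name and the use of `heapq.nlargest` are choices made for this sketch, not the author's method.

```python
import csv
import heapq
import io


def get_frequent_words(csvfile, num_words):
    """Return the num_words most frequent words from an open CSV file object.

    Assumes a header row followed by `word,count` rows; this layout is an
    assumption, not something confirmed by the original script.
    """
    reader = csv.reader(csvfile, delimiter=',')
    next(reader, None)  # skip the header row
    pairs = []
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        pairs.append((row[0], int(row[1])))
    # nlargest avoids sorting the whole list when num_words is small
    top = heapq.nlargest(num_words, pairs, key=lambda pair: pair[1])
    return [word for word, _ in top]


# Tiny in-memory sample standing in for the real data set file
sample = io.StringIO("word,count\nthe,100\nof,80\ncat,3\nand,90\n")
top_words = get_frequent_words(sample, 2)
print(top_words)
```

With a real file, the same function could be called as `get_frequent_words(open(path, newline=''), n)`; the resulting word list would then be passed to "put_words".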
l1m2p3 / SimpleModel.py
Created January 9, 2018 08:00
SimpleModel definition
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
words_dim = 300
input_channel = 1
output_channel = 100
dropout_rate = 0.5
l1m2p3 / invoke.py
Created January 7, 2018 20:26
Sample script to invoke Lambda function
import boto3
import json
from datetime import datetime
from pytz import timezone
lambda_client = boto3.client('lambda')
print('invoking')
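The rest of invoke.py is not shown in the preview. A minimal sketch of a synchronous invocation might look like the following. It uses the boto3 Lambda client's `invoke` call with `FunctionName`, `InvocationType`, and `Payload`; the function name `my-pytorch-function` and the payload shape are hypothetical. The client is passed in as a parameter, and `EchoClient` is a stand-in added here so the sketch runs without AWS credentials.

```python
import io
import json


def invoke_lambda(client, function_name, payload):
    """Invoke a Lambda function synchronously and return the decoded JSON reply."""
    response = client.invoke(
        FunctionName=function_name,
        InvocationType='RequestResponse',  # wait for the function's result
        Payload=json.dumps(payload).encode('utf-8'),
    )
    # response['Payload'] is a stream; read it fully before decoding
    return json.loads(response['Payload'].read().decode('utf-8'))


class EchoClient:
    """Hypothetical stand-in for boto3.client('lambda'); echoes the payload back."""

    def invoke(self, FunctionName, InvocationType, Payload):
        body = json.dumps({'function': FunctionName, 'echo': json.loads(Payload)})
        return {'StatusCode': 200, 'Payload': io.BytesIO(body.encode('utf-8'))}


# Function name and payload are placeholders for illustration
result = invoke_lambda(EchoClient(), 'my-pytorch-function', {'input': 'hello world'})
print(result)
```

Against a real deployment, `boto3.client('lambda')` would be passed in place of `EchoClient()`.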
l1m2p3 / update.py
Created January 7, 2018 20:22
Sample script to re-upload a new deployment package to S3 and update the Lambda function to use it
import os
import boto3
import json
import base64
from datetime import datetime
from pytz import timezone
import sys
lambda_client = boto3.client('lambda')
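The body of update.py is not visible in the preview. As a hedged sketch, the two steps in the description (upload to S3, then point the function at the new object) could be expressed with the S3 client's `upload_file` and the Lambda client's `update_function_code`. The bucket, key, file, and function names below are hypothetical, and `FakeS3` and `FakeLambda` are stand-ins added so the example runs offline.

```python
def update_lambda_package(s3_client, lambda_client, package_path,
                          bucket, key, function_name):
    """Upload a deployment package to S3, then tell Lambda to use that object."""
    s3_client.upload_file(package_path, bucket, key)
    return lambda_client.update_function_code(
        FunctionName=function_name,
        S3Bucket=bucket,
        S3Key=key,
    )


calls = []  # records the order of calls made against the stand-in clients


class FakeS3:
    """Hypothetical stand-in for boto3.client('s3')."""

    def upload_file(self, Filename, Bucket, Key):
        calls.append(('upload_file', Filename, Bucket, Key))


class FakeLambda:
    """Hypothetical stand-in for boto3.client('lambda')."""

    def update_function_code(self, FunctionName, S3Bucket, S3Key):
        calls.append(('update_function_code', FunctionName, S3Bucket, S3Key))
        return {'FunctionName': FunctionName}


# All names here are placeholders for illustration
response = update_lambda_package(
    FakeS3(), FakeLambda(), 'package.zip',
    'my-bucket', 'package.zip', 'my-pytorch-function')
print(calls)
```

Uploading before updating matters: Lambda reads the object at update time, so the new package has to be in S3 first.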
l1m2p3 / handler.py
Last active January 7, 2018 20:08
AWS Lambda handler file. Modify the code as needed.
import boto3
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import json
# change this to use other models
from SimpleModel import SimpleModel
l1m2p3 / create_package.sh
Last active February 20, 2018 08:11
Commands to create a deployment package for running PyTorch on Lambda
sudo yum -y update
sudo yum install -y gcc zlib zlib-devel openssl openssl-devel git make automake gcc-c++ kernel-devel
# cmake 3.6.2
cd
wget https://cmake.org/files/v3.6/cmake-3.6.2.tar.gz
tar -zxvf cmake-3.6.2.tar.gz
cd cmake-3.6.2
sudo ./bootstrap --prefix=/usr/local
sudo make
l1m2p3 / dynamo_access.py
Last active July 31, 2018 10:04
Functions for updating and accessing word vectors on DynamoDB (updated to use spaCy to find tokens; see https://spacy.io/usage/ for how to install spaCy)
import boto3
import numpy
import pickle
import spacy
table_name = 'wordvec' # table name on DynamoDB
# batch size specified by DynamoDB. See DynamoDB's doc for more details
write_batch_size = 25
read_batch_size = 100
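The two constants above match DynamoDB's per-request limits: BatchWriteItem accepts at most 25 put or delete requests, and BatchGetItem at most 100 items. How "put_words" and "get_words" split their input is not shown in the preview; one simple way to do it, sketched here with a hypothetical `chunks` helper, is to slice the word list into fixed-size pieces.

```python
def chunks(items, size):
    """Yield successive slices of items, each at most size elements long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


write_batch_size = 25   # BatchWriteItem limit per request
read_batch_size = 100   # BatchGetItem limit per request

# Placeholder word list for illustration
words = ['word%d' % i for i in range(60)]

write_batches = list(chunks(words, write_batch_size))
read_batches = list(chunks(words, read_batch_size))
print([len(batch) for batch in write_batches])  # [25, 25, 10]
print([len(batch) for batch in read_batches])   # [60]
```

Each slice would then be sent as one batch request. A full implementation would also need to retry any unprocessed items that DynamoDB returns.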
import torch
import numpy
import pickle
# change this to your own files' name
dataset_file = 'word2vec.sst-1.pt' # file to load from
wordindex_file = 'wordindex.pkl' # file to save to
indexvec_file = 'indexvec.npy' # file to save to