Morgan McGuire morganmcg1

@morganmcg1
morganmcg1 / convert_corpora.py
Created April 27, 2021 10:52
Download and convert corpora to .spacy for the spaCy GoEmotions tutorial
# This will download the config file and corpora needed for the spaCy GoEmotions tutorial:
# https://github.com/explosion/projects/blob/v3/tutorials/textcat_goemotions
import os
# Get CNN config
# NOTE: spacy_dir is assumed to be a pathlib.Path defined earlier in the notebook
os.makedirs(os.path.join(spacy_dir/'training', 'cnn'), exist_ok=True)
cnn_cfg_url = "https://raw.githubusercontent.com/explosion/projects/v3/tutorials/textcat_goemotions/configs/cnn.cfg"
cnn_cfg = spacy_dir/'cnn.cfg'
# Notebook shell escape: fetch the config quietly and write it to cnn_cfg
!wget -q -O $cnn_cfg $cnn_cfg_url
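The `!wget` line above only works inside a notebook. A minimal sketch of the same download in plain Python follows, assuming the standard library's `urllib.request` is an acceptable substitute for `wget`; the helper name `download_file` and the `spacy_data` directory in the usage comment are hypothetical and not part of the gist.

```python
import urllib.request
from pathlib import Path

# URL copied from the gist above.
CNN_CFG_URL = (
    "https://raw.githubusercontent.com/explosion/projects/v3/"
    "tutorials/textcat_goemotions/configs/cnn.cfg"
)


def download_file(url: str, dest: Path) -> Path:
    """Download `url` to `dest`, creating parent directories as needed.

    Hypothetical helper: a stand-in for the notebook's `!wget -q -O`.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response:
        dest.write_bytes(response.read())
    return dest


# Example usage (requires network access; the directory name is an assumption):
# download_file(CNN_CFG_URL, Path("spacy_data") / "cnn.cfg")
```

Creating the parent directory inside the helper keeps the download step independent of any earlier `os.makedirs` call.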
/srv/conda/envs/saturn/lib/python3.7/site-packages/torch/nn/functional.py:1204: UserWarning: Output 0 of BackwardHookFunctionBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is deprecated and will be forbidden starting version 1.6. You can remove this warning by cloning the output of the custom Function. (Triggered internally at /pytorch/torch/csrc/autograd/variable.cpp:547.)
result = torch.relu_(input)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-7-8393222d813a> in <module>
----> 1 simple_train_single(**model_params)
<ipython-input-6-29dd15a1ccdf> in simple_train_single(bucket, prefix, batch_size, downsample_to, n_epochs, base_lr, pretr
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
/srv/conda/envs/saturn/lib/python3.7/site-packages/wandb/sdk/wandb_init.py in init()
743 try:
--> 744 run = wi.init()
745 except_exit = wi.settings._except_exit
/srv/conda/envs/saturn/lib/python3.7/site-packages/wandb/sdk/wandb_init.py in init()
420 backend = Backend(settings=s)
--> 421 backend.ensure_launched()
@morganmcg1
morganmcg1 / cluster_dataloader_error.py
Created May 5, 2021 15:56
cluster_dataloader_error
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-17-d566959a97b0> in <module>
1 # If one or more worker jobs errors, this will describe the issue
----> 2 futures[0].result()
/srv/conda/envs/saturn/lib/python3.7/site-packages/distributed/client.py in result(self, timeout)
223 if self.status == "error":
224 typ, exc, tb = result
--> 225 raise exc.with_traceback(tb)
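import os
import boto3
my_bucket = "prosecraft-language-models"
folder_name = "manuscript-samples"
# NOTE: `files` is assumed to be a list of local file paths defined earlier
bucket = boto3.Session().resource('s3').Bucket(my_bucket)
for f in files:
    # Object key is "<folder_name>/<parent directory>_<file name>"
    key = os.path.join(folder_name, f"{f.split('/')[-2]}_{f.split('/')[-1]}")
    bucket.Object(key).upload_file(f)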
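The upload loop above derives each S3 object key from the last two components of the local path. As a rough sketch, that naming rule can be pulled into a small function and checked without touching S3. The name `s3_key_for` is hypothetical, and `posixpath.join` is swapped in for `os.path.join` so the key always uses forward slashes regardless of operating system.

```python
import posixpath


def s3_key_for(local_path: str, folder_name: str = "manuscript-samples") -> str:
    """Build an object key of the form "<folder_name>/<parent dir>_<file name>".

    Hypothetical helper mirroring the key logic in the upload loop above.
    Like the original, it expects a path with at least two components.
    """
    parts = local_path.rstrip("/").split("/")
    parent, name = parts[-2], parts[-1]
    return posixpath.join(folder_name, f"{parent}_{name}")


# Example with a made-up path:
print(s3_key_for("data/author_a/chapter1.txt"))
```

Prefixing the file name with its parent directory keeps files with the same name in different directories from overwriting each other in the bucket.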
import boto3
import os
import pickle
my_bucket = "prosecraft-manuscript-archives"
my_file = "manuscript-samples-2021-05-06.zip"
s3 = boto3.resource('s3')
obj = s3.Object(my_bucket, my_file)
body = obj.get()['Body'].read()
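The snippet above ends with the zip archive held in memory as bytes in `body`. The gist preview does not show what happens next; one plausible follow-up, sketched here as an assumption, is to inspect the archive without writing it to disk, using the standard library's `zipfile` and `io.BytesIO`. The function name `list_zip_members` is hypothetical.

```python
import io
import zipfile
from typing import List


def list_zip_members(body: bytes) -> List[str]:
    """Return the member names of a zip archive held in memory as bytes."""
    with zipfile.ZipFile(io.BytesIO(body)) as archive:
        return archive.namelist()


# Self-contained demonstration with a tiny archive built in memory,
# standing in for the bytes returned by obj.get()['Body'].read().
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("sample/chapter1.txt", "Once upon a time")
print(list_zip_members(buffer.getvalue()))
```

For very large archives, downloading to a temporary file instead of reading the whole body into memory may be the safer choice.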
@morganmcg1
morganmcg1 / create_tfrecords_prosecraft.py
Last active July 26, 2021 10:48
Modified create_tfrecords from GPT-Neo repo
import argparse
import os
from pathlib import Path
import ftfy
import tensorflow as tf
from lm_dataformat import Reader
from tokenizers import Tokenizer
from transformers import GPT2TokenizerFast
from tqdm import tqdm
@morganmcg1
morganmcg1 / deepchem_wandb.py
Last active July 28, 2021 12:13
DeepChem W&B Minimal Examples
#!/usr/bin/env python
"""Test Optuna integration
---
id: 0.0.4
check-ext-wandb: {}
assert:
- :wandb:runs_len: 1
- :wandb:runs[0][project]: integrations_testing
- :wandb:runs[0][config][a]: 2
- :wandb:runs[0][config][b]: testing
@morganmcg1
morganmcg1 / gist:0e4344df49fe3b43243505992ce998d5
Last active August 9, 2021 13:34
gpt-j generation error stacktrace
2021-08-09 13:33:24.972717: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:1981] Execution of replica 2 failed: Resource exhausted: Attempting to reserve 4.44G at the bottom of memory. That was not possible. There are 9.62G free, 0B reserved, and 2.65G reservable.
2021-08-09 13:33:24.972816: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:1981] Execution of replica 5 failed: Resource exhausted: Attempting to reserve 4.44G at the bottom of memory. That was not possible. There are 9.62G free, 0B reserved, and 2.65G reservable.
2021-08-09 13:33:24.972875: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:1981] Execution of replica 3 failed: Resource exhausted: Attempting to reserve 4.44G at the bottom of memory. That was not possible. There are 9.62G free, 0B reserved, and 2.65G reservable.
2021-08-09 13:33:24.972954: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.c
@morganmcg1
morganmcg1 / artifacts_test.ipynb
Created August 26, 2021 10:05
artifacts download fails