Nick Doiron mapmeld

@mapmeld
mapmeld / investments.md
Last active Jan 7, 2021
Stocks and ETFs which I invested in

Not including how many shares I have or what % of my investments these are

Not including cryptocurrencies

| Ticker | Info | Effective Avg Purchase Price |
|--------|------|------------------------------|
| VTHRX | Vanguard retirement | ? |
| VDC | Vanguard consumer staples | 172.75 |
| XCEM | Em econ not China | 29.87 |
| HOMZ | Housing | 32.26 |
mapmeld / bb.md
Last active Jan 4, 2021
Bangla Benchmark runs

Code: https://colab.research.google.com/drive/1vltPI81atzRvlALv4eCvEB0KdFoEaCOb?usp=sharing

Can these scores be improved? YES!

Options include rerunning with more training data, training for more epochs, or using other libraries to set a learning rate and other hyperparameters before training.

  • Experimenting with epochs: when I doubled the number of epochs, MuRIL improved only slightly (69.5 -> 69.7 on one task)

The point of a benchmark is to run these models through a reasonable and identical process; you can tweak hyperparameters on any model to improve results.

mapmeld / twiml-lightning-share.md
Last active Oct 22, 2020
twiml-lightning-share
mapmeld / dv-wave.py
Last active Jul 16, 2020
PythonCode
from simpletransformers.classification import ClassificationModel
# set use_cuda=False on CPU-only platforms
model = ClassificationModel('bert', 'monsoon-nlp/dv-wave', num_labels=8, use_cuda=True, args={
    'reprocess_input_data': True,
    'use_cached_eval_features': False,
    'overwrite_output_dir': True,
    'num_train_epochs': 3,
    'silent': True
})
mapmeld / add_to_shapefile.py
Created Jul 5, 2020
Add JSON block data to a shapefile with GDAL
# pip install gdal
import json
from osgeo import ogr
# depends on your shapefile
target_shapefile = 'tl_2010_sample_shapefile.shp'
fips_id = 'GEOID10'
# load the saved block data, closing the file once it has been read
with open('savefile.json', 'r') as f:
    saveblocks = json.load(f)
mapmeld / load_acs.py
Last active Jul 8, 2020
Load 5-year ACS race + ethnicity data, ending in 2017
# pip install requests
import time, json
import requests
api_key = "API_KEY_STRING"
# look up FIPS for state and county:
# https://www.nrcs.usda.gov/wps/portal/nrcs/detail/national/home/?cid=nrcs143_013697
state = '12'
county_fips = ['086']
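The preview above stops before any request is made. As a rough sketch only, the settings could be turned into Census Data API query URLs like this. The endpoint (`https://api.census.gov/data/2017/acs/acs5`), the table (`B03002`, Hispanic or Latino origin by race), the tract-level geography, and the helper name `build_acs_url` are assumptions for illustration, not taken from the original gist.

```python
from urllib.parse import quote, urlencode

# Values copied from the snippet above
api_key = "API_KEY_STRING"
state = '12'
county_fips = ['086']

# Assumptions (not from the original gist): endpoint, variables, geography level
BASE_URL = "https://api.census.gov/data/2017/acs/acs5"
VARIABLES = ["NAME", "B03002_001E"]  # B03002_001E: total population in table B03002


def build_acs_url(state, county, key):
    """Hypothetical helper: build one query URL for all tracts in a county."""
    params = {
        "get": ",".join(VARIABLES),
        "for": "tract:*",
        "in": f"state:{state} county:{county}",
        "key": key,
    }
    # Keep ':', '*' and ',' readable; spaces are encoded as %20
    query = urlencode(params, safe=":*,", quote_via=quote)
    return f"{BASE_URL}?{query}"


# One URL per county; nothing is sent over the network here
urls = [build_acs_url(state, county, api_key) for county in county_fips]
```

Each URL could then be fetched with `requests.get(url).json()`. The Census API returns a list of rows whose first row holds the column names.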
mapmeld / links.md
Last active May 13, 2020
References and links for Spanish counterfactuals
View AutoKeras_image_regression.ipynb
mapmeld / yolo.py
Created Apr 27, 2020
Adjusting yolo.py to return raw boxes and classes for images
# -*- coding: utf-8 -*-
"""
Class definition of YOLO_v3 style detection model on image and video
"""
import colorsys
import os
from timeit import default_timer as timer
import numpy as np
View Baby-Hindi-Model.md

Releasing Hindi ELECTRA model

This is a first attempt at a Hindi language model trained with Google Research's ELECTRA. I don't modify ELECTRA until we get into finetuning, and only then because there are hardcoded train and test files.

CoLab: https://colab.research.google.com/drive/1R8TciRSM7BONJRBc9CBZbzOmz39FTLl_

Additional background: https://medium.com/@mapmeld/teaching-hindi-to-electra-b11084baab81

It's available on HuggingFace: https://huggingface.co/monsoon-nlp/hindi-bert - sample usage: https://colab.research.google.com/drive/1mSeeSfVSOT7e-dVhPlmSsQRvpn6xC05w
