Dr Q quinsulon

  • Philadelphia, PA and DC
thomwolf /
Last active Nov 18, 2021
Load full English Wikipedia dataset in HuggingFace nlp library
import os; import psutil; import timeit
from datasets import load_dataset
mem_before = psutil.Process(os.getpid()).memory_info().rss >> 20
wiki = load_dataset("wikipedia", "20200501.en", split='train')
mem_after = psutil.Process(os.getpid()).memory_info().rss >> 20
print(f"RAM memory used: {(mem_after - mem_before)} MB")
s = """batch_size = 1000
for i in range(0, len(wiki), batch_size):
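The preview above cuts off inside the timed loop. As a rough sketch of the same idea (timing a pass over a dataset in fixed-size batches with `timeit`), the following uses a plain Python list as a stand-in for the dataset. The names `data` and `iterate_in_batches` are hypothetical and are not part of the original gist.

```python
import timeit

# Stand-in for the dataset (assumption): any sequence that supports
# len() and slicing can be iterated this way.
data = list(range(10_500))


def iterate_in_batches(seq, batch_size=1000):
    """Slice seq into consecutive batches and return the number of items seen."""
    seen = 0
    for i in range(0, len(seq), batch_size):
        batch = seq[i:i + batch_size]  # the final batch may be shorter
        seen += len(batch)
    return seen


# Time a single full pass, in the spirit of the truncated snippet.
elapsed = timeit.timeit(lambda: iterate_in_batches(data), number=1)
total = iterate_in_batches(data)
print(f"Iterated over {total} items in {elapsed:.4f} s")
```

Passing a callable to `timeit.timeit` avoids building the statement as a string, which keeps the loop body visible to linters and editors.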
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import TransformerEncoder, TransformerEncoderLayer
class TransformerModel(nn.Module):
    def __init__(self, ntoken, ninp, nhead, nhid, nlayers, dropout=0.5):
View smallberta_pretraining.ipynb
View Sqlite Data Frame.ipynb
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHashing {
    // Consistent Hashing with Ring having 50 buckets.
    final static int LIMIT = 50;
    // Sorted Map.
    final static SortedMap<Integer, String> bucketIdToServer = new TreeMap<>();
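The Java preview stops after declaring the ring size and the sorted map. As a minimal sketch of how a consistent-hashing ring with 50 buckets can work, here is a Python version under stated assumptions: the `Ring` class, the `bucket_of` helper, and the use of MD5 to place names on the ring are illustrative choices, not taken from the original gist.

```python
import bisect
import hashlib

LIMIT = 50  # ring size, mirroring the 50-bucket ring in the listing


def bucket_of(name: str) -> int:
    """Map a string to a bucket id in [0, LIMIT) with a stable hash (assumed: MD5)."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % LIMIT


class Ring:
    """Hypothetical ring: a sorted list of bucket ids plus a bucket -> server map."""

    def __init__(self):
        self._buckets = []   # sorted bucket ids that currently hold a server
        self._servers = {}   # bucket id -> server name

    def add_server(self, name: str) -> int:
        b = bucket_of(name)
        if b not in self._servers:
            bisect.insort(self._buckets, b)
        # With only 50 buckets, collisions are possible; the later server wins.
        self._servers[b] = name
        return b

    def remove_server(self, name: str) -> None:
        b = bucket_of(name)
        if self._servers.get(b) == name:
            del self._servers[b]
            self._buckets.remove(b)

    def server_for(self, key: str) -> str:
        if not self._buckets:
            raise LookupError("no servers on the ring")
        # First server bucket at or after the key's bucket, wrapping around.
        i = bisect.bisect_left(self._buckets, bucket_of(key))
        return self._servers[self._buckets[i % len(self._buckets)]]


ring = Ring()
for server in ("server-a", "server-b", "server-c"):
    ring.add_server(server)
print(ring.server_for("user:42"))
```

The sorted list with `bisect` plays the role that `TreeMap` plays in the Java listing: both give an ordered view of occupied buckets, so finding the next server clockwise is a logarithmic lookup. Removing a server only remaps the keys that server owned.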
kaustubhn /
Created Sep 20, 2017
Chinese Restaurant Process
# Implementation of a Chinese restaurant process function for a given list of word vectors.
import random

def crp(vecs):
    clusterVec = [[0.0] * 25]  # tracks sum of vectors in a cluster
    clusterIdx = [[]]  # array of index arrays, e.g. [[1, 3, 5], [2, 4, 6]]
    ncluster = 0
    # probability to create a new table if new customer
    # is not strongly "similar" to any existing table
    pnew = 1.0 / (1 + ncluster)
    N = len(vecs)
    rands = [random.random() for x in range(N)]  # N random variables sampled from U(0, 1)
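The preview ends before the clustering loop. The following is a self-contained sketch of one similarity-driven variant of the idea described in the comments: a new "table" is opened with probability `pnew` when a vector is not strongly similar to any existing cluster. The function names `cosine` and `crp_sketch`, the `seed` parameter, and the choice of cosine similarity are assumptions for illustration, not the gist author's code.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def crp_sketch(vecs, seed=0):
    """Assign each vector to a cluster; returns a list of index lists."""
    rng = random.Random(seed)
    cluster_sums = []  # per-cluster sum of member vectors
    cluster_idx = []   # per-cluster member indices, e.g. [[0, 1], [2, 3]]
    for i, v in enumerate(vecs):
        # Probability of opening a new table shrinks as clusters accumulate.
        pnew = 1.0 / (1 + len(cluster_sums))
        best, best_sim = -1, float("-inf")
        for j, s in enumerate(cluster_sums):
            sim = cosine(v, s)
            if sim > best_sim:
                best, best_sim = j, sim
        # Open a new cluster if none exists yet, or if the best match is weak
        # and the random draw falls below pnew.
        if best < 0 or (best_sim < pnew and rng.random() < pnew):
            cluster_sums.append(list(v))
            cluster_idx.append([i])
        else:
            cluster_sums[best] = [a + b for a, b in zip(cluster_sums[best], v)]
            cluster_idx[best].append(i)
    return cluster_idx


vectors = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
print(crp_sketch(vectors, seed=1))
```

Starting from empty cluster lists, rather than a pre-seeded zero vector, avoids comparing the first customer against an all-zero cluster sum.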
View datagovmetadata.json
{"help": "", "success": true, "result": {"count": 48, "sort": "views_recent desc", "facets": {}, "results": [{"license_title": "License not specified", "maintainer": "New Media", "relationships_as_object": [], "private": false, "maintainer_email": "", "num_tags": 5, "id": "59694770-b6b6-4ae0-a4b9-4ae69c0be2f6", "metadata_created": "2016-07-02T10:06:26.199575", "metadata_modified": "2016-07-02T10:06:26.199575", "author": null, "author_email": null, "state": "active", "version": null, "creator_user_id": "47303a9e-1187-4290-85a3-1fc02dc49e4a", "type": "dataset", "resources": [{"cache_last_updated": null, "package_id": "59694770-b6b6-4ae0-a4b9-4ae69c0be2f6", "webstore_last_updated": null, "id": "3a8a0ad1-19e7-4153-bb2f-d70cf88aaaf8", "size": null, "state": "active", "hash": "", "description": "", "format": "CSV", "tracking_summary": {"total": 32, "recent": 1}, "last_modified": null, "url_type": null, "no_real_name": "True",
View playercorefactory.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Goes inside %APPDATA%\kodi\userdata -->
<player name="VLC" type="ExternalPlayer" audio="false" video="true">
nolanlawson / protips.js
Last active Nov 16, 2021
Promise protips - stuff I wish I had known when I started with Promises
View protips.js
// Promise.all is good for executing many promises at once
// Promise.resolve is good for wrapping synchronous code
Promise.resolve().then(function () {
  if (somethingIsNotRight()) {
    throw new Error("I will be rejected asynchronously!");
TSiege / The Technical Interview Cheat
Last active Nov 27, 2021
This is my technical interview cheat sheet. Feel free to fork it or do whatever you want with it. PLEASE let me know if there are any errors or if anything crucial is missing. I will add more links soon.
View The Technical Interview Cheat


I have moved this over to the Tech Interview Cheat Sheet Repo, where it has been expanded and even has code challenges you can run and practice against!