- [topic modeling] Reuters: The Echo Chamber
- [?] Reuters: The Child Exchange
- [classify and count] AJC: Doctors and Sex Abuse
- [search and count] LATimes: LAPD Crime Misclassification
- [search and show] KQED: 10 Emails That Detail PG&E’s Cozy Relationship With Regulators
- [phonetics and count] [The GOP Can’t Depend On Minority Candidates To Win Minority Votes](https://fivethir
import os
import time
import arrow
import requests
API = "https://projects.fivethirtyeight.com/election-night-api/tests/default/projected_winners.json"
NUM_TIMES = 4
TIME_TO_SLEEP = int(60/NUM_TIMES)
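The constants above suggest a loop that checks the endpoint `NUM_TIMES` times per minute. The preview does not show that loop, so the following is only a minimal sketch under that assumption; `poll`, `fetch`, `pause`, and `sleep` are hypothetical names, not taken from the gist.

```python
import time

NUM_TIMES = 4
TIME_TO_SLEEP = int(60 / NUM_TIMES)  # 15 seconds between requests


def poll(fetch, num_times=NUM_TIMES, pause=TIME_TO_SLEEP, sleep=time.sleep):
    """Call fetch() num_times, sleeping `pause` seconds between calls.

    Hypothetical helper: `fetch` is any zero-argument callable, for example
    lambda: requests.get(API).json().
    """
    results = []
    for attempt in range(num_times):
        results.append(fetch())
        if attempt < num_times - 1:  # no need to wait after the last call
            sleep(pause)
    return results
```

Passing `fetch` and `sleep` in as arguments keeps the loop testable without touching the network or actually waiting.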
import io
import us
import csv
import itertools
import requests
import pandas as pd
import numpy as np
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
{
  "id": 3120,
  "name": "Kirk Cousins",
  "source": {
    "player_id": 403308
  },
  "pointer": null,
  "story_type": "PlayerCard",
  "project": "nfl",
  "data": {
import os
import csv
import tqdm
import hashlib
import itertools
def load(filepaths, line_limit=None):
    data = []
    for i, filepath in enumerate(filepaths):
        print(f'[{i+1} of {len(filepaths)}] loading {filepath}')
office_id,office_type,state_code,latest
AK-G,G,AK,"Bill Walker, an independent, has been the governor of Alaska since 2014. This seat is up for election this year.
Walker is running for reelection.
Mark Begich is running for the Democratic nomination.
Ten Republicans are competing in the primary: Darin A. Colbry, Mike Dunleavy, Thomas Gordon, Scott Hawkins, Gerald Heikes, Merica Hlatcu, Michael D. Sheldon, Mead Treadwell, Mike Chenault and Jacob Kern.
The Alaska gubernatorial primaries take place Aug. 21.
Year,State,Agency Type,Agency name,q01,q02,q03,q04,Population2
2010,ALABAMA,Cities,Decatur,0,0,1,2,57153
2010,ALABAMA,Cities,Hoover,0,1,0,1,74687
2010,ALABAMA,Cities,Jasper,0,1,0,0,14138
2010,ALABAMA,Cities,Lanett,0,0,0,1,7026
2010,ALABAMA,Cities,Madison,0,0,1,0,41426
2010,ALABAMA,Cities,Oneonta,0,1,0,0,7317
2010,ALABAMA,Cities,Riverside,0,0,0,1,2146
2010,ALABAMA,Cities,Slocomb,1,0,0,0,2065
2010,ALABAMA,Cities,Talladega,0,1,0,0,16826
license: gpl-3.0
height: 940
border: no
# encoding: utf-8
import io
import torch
import json
import numpy as np
vocab = torch.load('data/vocab.pt')
np.savez_compressed(file='data/embedding.npz', embedding=vocab.embed)
with open('data/word2id.json', 'w') as f:
{"unsupportable": 128318, "Villalon": 118160, "nunnery": 97674, "Plasticine": 144240, "woods": 5729, "clotted": 66309, "spiders": 9036, "Nampo": 181451, "Nampa": 92771, "woody": 51770, "trawling": 24655, "comically": 48133, "spidery": 91990, "canes": 41343, "beautyheaven": 143561, "Fulde": 115294, "Archuleta": 82341, "Kaboul": 24450, "Journey": 22519, "caned": 69502, "Eure": 104348, "Gravesend": 24420, "rumbustious": 177271, "Retreat": 30939, "Euro": 4009, "Valli": 54480, "naturopathic": 117376, "Valle": 13969, "grenadiers": 181508, "pigment": 21858, "Mizell": 146931, "Alya": 89149, "Morten": 34955, "bringing": 2181, "Valls": 18920, "wooded": 12650, "Al-Sidra": 179543, "grueling": 19942, "pipers": 99625, "wooden": 4192, "wholemeal": 35673, "Saco": 85256, "Miers": 96666, "wednesday": 513, "Sack": 49830, "viable": 11637, "LOVEFiLM": 141680, "chameleons": 48272, "Miera": 100719, "Malenchenko": 96215, "Safarova": 54544, "bbqs": 114055, "Pinkerton": 176544, "pizzle": 165680, "snuggles": 50483, "snuggler": 175379,