@soodoku
soodoku / not_normal.ipynb
Last active July 16, 2023 04:58
Not Normal
@soodoku
soodoku / flask_app.py
Last active July 16, 2023 17:36
Remote Logging of Errors With User Approval
from flask import Flask, request
import uuid

app = Flask(__name__)

@app.route('/error_endpoint', methods=['POST'])
def receive_error_message():
    error_message = request.form.get('error_message')
    if error_message:
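The preview above shows only the receiving side. As a rough illustration of the "with user approval" part, here is a minimal client-side sketch. The `/error_endpoint` route and the `error_message` form field come from the snippet; the host and port, and the names `post_error`, `report_error`, and `approve`, are assumptions made for this example and are not part of the gist.

```python
import urllib.parse
import urllib.request

# Assumed address: only the '/error_endpoint' route and the 'error_message'
# form field are taken from the server snippet; host and port are hypothetical.
ERROR_ENDPOINT = "http://localhost:5000/error_endpoint"


def post_error(message, url=ERROR_ENDPOINT):
    """POST the message as form data and return the HTTP status code."""
    data = urllib.parse.urlencode({"error_message": message}).encode("utf-8")
    request = urllib.request.Request(url, data=data)  # data present => POST
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def report_error(exc, approve, send=post_error):
    """Send a description of `exc` only if `approve(message)` returns True."""
    message = f"{type(exc).__name__}: {exc}"
    if not approve(message):
        return False  # user declined, so nothing leaves the machine
    send(message)
    return True
```

In an interactive script, `approve` could be something like `lambda m: input(f"Send this error report?\n{m}\n[y/N] ").strip().lower() == "y"`. Passing the sender as a parameter keeps the approval logic testable without a running server.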
@soodoku
soodoku / group_interest.R
Last active July 9, 2023 02:09
Group Interest Partisans?
# Pareto Party
# Load libs
library(tidyverse)
library(readstata13)
library(car)
library(dplyr)
library(xtable)
@soodoku
soodoku / forest_lasso.ipynb
Last active July 4, 2023 05:45
Post Process RF Using Lasso
@soodoku
soodoku / read_parsed_dmoz.py
Created February 18, 2021 20:25
Reading in the parsed DMOZ file
import csv
import pandas as pd
import numpy as np
df = pd.read_csv('parsed-new.csv', header = None, delimiter="\t", quoting=csv.QUOTE_NONE, encoding='utf-8')
df.head()
@soodoku
soodoku / get_unique_domain_names_from_comscore.py
Created February 12, 2018 02:15
Get a list of unique domain names from comScore browsing data
#
# Get All Unique Domain Names from comScore
#
# INPUT: comScore browsing data file
#
# OUTPUT: a text file containing a list of unique domains
#
# PARAMETERS:
# + INTERNET_USAGE_FILE: path to the comScore browsing data
# + FINAL_OUTPUT_FILE: path to intended output file
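The header above describes the input, output, and parameters, but the preview stops before the body. One way the described step could look is sketched below. It assumes the browsing data is a CSV with a header row and a domain column called `domain_name`. That column name, the file paths, and the lower-casing of domains are assumptions for illustration, not details taken from the gist.

```python
import csv

# Hypothetical values: the gist's actual paths and column name are not shown.
INTERNET_USAGE_FILE = "comscore_browsing.csv"
FINAL_OUTPUT_FILE = "unique_domains.txt"
DOMAIN_COLUMN = "domain_name"  # assumed column name


def unique_domains(rows, column=DOMAIN_COLUMN):
    """Return the sorted list of distinct, non-empty, lower-cased domains."""
    domains = set()
    for row in rows:
        domain = (row.get(column) or "").strip().lower()
        if domain:
            domains.add(domain)
    return sorted(domains)


def main(in_path=INTERNET_USAGE_FILE, out_path=FINAL_OUTPUT_FILE):
    # Read row by row so a large browsing log is never held in memory at once;
    # only the set of distinct domains is kept.
    with open(in_path, newline="", encoding="utf-8") as infile:
        domains = unique_domains(csv.DictReader(infile))
    # Write one domain per line.
    with open(out_path, "w", encoding="utf-8") as outfile:
        outfile.write("\n".join(domains) + "\n")
```

Calling `main()` with the two paths writes the output file. `unique_domains` can also be used on its own with any iterable of dict-like rows.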
@soodoku
soodoku / county_dma_2016.R
Created November 21, 2017 19:11
DMA to County for 2016
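library(readr)
library(plyr)   # provides ldply(); loaded before dplyr so dplyr verbs are not masked
library(dplyr)
# Read the whole file as one string, then split into lines and columns
a_string <- read_file("nielsen_2016")
split_lines <- strsplit(a_string, "\r\n")[[1]]
split_cols <- strsplit(split_lines, "--")
# Bind the per-line pieces into a two-column data frame
dat_frame <- ldply(split_cols)
names(dat_frame) <- c("dma", "counties")
write.csv(dat_frame, file = "dma_counties_2016.csv", row.names = FALSE)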
State  Position  Name             @Twitter Handle   Party Affiliation  Boris Shor's Score
AZ     Senator   David Bradley    @Bradley4AZ       Democrat           -1.253
AZ     Senator   Katie Hobbs      @katiehobbs       Democrat           -1.684
AZ     Senator   Ed Ableser       @SenatorAbleser   Democrat           -1.606
AZ     Senator   Barbara McGuire  @SenBarbMcGuire   Democrat           -0.672
AZ     Senator   Steve Farley     @SteveFarleyAZ    Democrat           -1.413
AZ     Senator   Adam Driggs      @AdamDriggs       Republican         0.738
AZ     Senator   Bob Worsley      @bob_worsley      Republican         0.46
AZ     Senator   Kelli Ward       @kelliwardaz      Republican         1.144
AZ     Senator   Nancy Barto      @NancyBarto       Republican         0.996
# Output = http://gbytes.gsood.com/2013/11/02/the-fairest-of-them-all/
# Uses cces_recode.R here: https://github.com/soodoku/in-n-out/scripts/cces_recode.R
# Plotting the fairest of all media
library(lattice)
png("fairmedia.png")
dotplot(t(t(table(droplevels(cces06$fairmedia[cces06$fairmedia != "Don't know"]), cces06$pid3[cces06$fairmedia != "Don't know"]))/colSums(table(droplevels(cces06$v2112[cces06$fairmedia != "Don't know"]), cces06$pid3[cces06$fairmedia != "Don't know"]))),
main = "Which network do you think provides the \n fairest coverage of national news?",