Andreas Beger andybega

andybega / event-pull-for-simon.sql
Created October 23, 2013 19:59
Pull events for a given country between 2001 and 2011, for Simon's party project.
# Log in to event database first, e.g.
# mysql -u xxx -p -h IPaddress -D event_data
SET @countryid = (SELECT id FROM countries WHERE countryname='CYPRUS');
# Pull events and related information and save them to a CSV file;
# takes about 1.5 minutes for Cyprus
CREATE TABLE temp_results AS
SELECT e.event_ID,
e.event_date,
andybega / predictReport-worldmap.R
Created November 26, 2013 18:08
Map code for the prediction report
worldMap <- function(x, id, data, date='2008-01-01', legend.title=NULL,
maxy=1) {
# Input: 2-column matrix with a unique identifier and the data to be mapped
# for the state slice on "date". Output: thematic map.
require(cshapes)
require(maptools)
require(RColorBrewer) # for color palettes
require(plyr)
andybega / vector-combinations.R
Last active December 31, 2015 04:19
For Scott, combinations of vectors that are survey answers.
# Change to your data
data <- data.frame(resp=1:5,
w11=sample(1:5, 5, replace=TRUE),
w12=sample(1:5, 5, replace=TRUE),
w13=sample(1:5, 5, replace=TRUE))
data$LoySum <- apply(data[, c("w11", "w12", "w13")], 1, mean)
# Names of the variables we want to look at
q.names <- c("w11", "w12", "w13")
andybega / explore-global-wits.R
Created December 12, 2013 21:32
Quick look at the Global WITS data Gary sent us.
setwd("/Volumes/political-science/shared/ICEWS Project/C-IED/Data/Global_WITS")
# Read as txt chunk
# The first 16 lines are metadata
wits.text <- readLines("WITS.csv")
head(wits.text, n=16)
# Get actual data
wits <- read.csv(text=wits.text, header=TRUE, skip=16)
andybega / gt-post-correlations.R
Created December 30, 2013 12:24
Get the CIDNE/GDELT/ICEWS series from the Ground Truth comparison blog post in order to calculate correlations. This is based on Ben's R code for the plots. There are some small differences because I subset the events in SQL using the data fields rather than grepping the actual event type values.
## Correlations for blog post
library(plyr)
# from GT
# source directories and first few lines of pull-data.R
sql <-
"SELECT dateoccured AS date, count(*) AS sigacts
FROM sigacts
andybega / turkey-daily-protests.R
Last active May 9, 2016 14:13
GDELT and ICEWS counts of daily protest events in Turkey from 15 May to 15 June 2013.
# Libraries
library(RMySQL)
# Connect to server
conn <- dbConnect(MySQL(), user="ab428", password="",
dbname="event_data", host="")
country <- "Turkey"
start.date <- "2013-05-15"
end.date <- "2013-06-15"
# Dot density map
# Eventually should be like http://www.radicalcartography.net/index.html?frenchkisses
library(maptools)
nc_SP <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1],
proj4string=CRS("+proj=longlat +ellps=clrk66"))
## Not run:
pls <- slot(nc_SP, "polygons")
andybega / pg-upgrade.sh
Created December 17, 2013 01:02
Updating PostgreSQL and PostGIS (installed via Homebrew)
# from here http://blog.55minutes.com/2013/09/postgresql-93-brew-upgrade/
# Create backup of database
/usr/local/cellar/postgresql/9.2.4/bin/pg_dump -h localhost -p 5432 -U ab428 -Fc -b -v -f "/usr/local/var/pg_backup/afghanistan.backup" afghanistan
# Upgrade PostGIS
brew doctor # fix errors
brew update
brew upgrade postgres
brew upgrade gdal --with-postgres
andybega / example.R
Created January 11, 2019 09:51
Reprex for win-builder R devel ggplot2-related error
library("ggplot2")
df <- structure(list(date = structure(c(17532, 17897, 18262, 18993,
19358, 19723, 20089), class = "Date"), id = c("9991", "9991",
"9991", "9992", "9992", "9992", "9992"), cowcode = c(999, 999,
999, 999, 999, 999, 999), y = c(1, 1, 1, 2, 2, 2, 2)), row.names = c(NA,
-7L), class = "data.frame")
ggplot(df, aes(x = date, y = y, group = cowcode)) + geom_line()
andybega / chart-model.R
Created January 11, 2019 12:36
RCT-A subset machine/human relative performance by question group
library("tidyverse")
arima_qs <- readr::read_csv("/path/to/forecast_rct_a_subset.csv",
col_types = cols(user_id = col_integer())) %>%
mutate(Forecaster = factor(Forecaster))
# Add the number of questions to each data source label
arima_qs <- arima_qs %>%
group_by(Data_source) %>%
mutate(by_Data_source_n_ifps = length(unique(ifp_id))) %>%