Andy Halterman ahalterman

ahalterman / spacy_events.py
Created Mar 13, 2018
Event Data in 30 Lines of Python
View spacy_events.py
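import json

import spacy

# Load the large English model
nlp = spacy.load("en_core_web_lg")

# Read the scraped stories and keep only the article text
with open("scraped.json", "r") as f:
    news = json.load(f)
news = [i['body'] for i in news]
processed_docs = list(nlp.pipe(news))

# Verb lemmas and direct-object lemmas of interest
verb_list = ["launch", "begin", "initiate", "start"]
dobj_list = ["attack", "offensive", "operation", "assault"]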
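The two lists in this preview suggest pairing a verb lemma with the lemma of its direct object. The rest of the gist is not shown here, so what follows is only a sketch of one way such lists might be used. Plain tuples stand in for spaCy tokens so the example runs without a model; `Token` and `find_events` are hypothetical names, not taken from the gist.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """Stand-in for a parsed token (hypothetical, not a spaCy class)."""
    lemma: str  # lemmatized form of the word
    dep: str    # dependency label, e.g. "dobj"
    head: int   # index of the syntactic head within the sentence


VERB_LIST = {"launch", "begin", "initiate", "start"}
DOBJ_LIST = {"attack", "offensive", "operation", "assault"}


def find_events(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (verb, object) pairs where a listed noun is the direct
    object of a listed verb."""
    events = []
    for tok in tokens:
        # Only direct objects whose lemma is in the object list qualify
        if tok.dep != "dobj" or tok.lemma not in DOBJ_LIST:
            continue
        head = tokens[tok.head]
        if head.lemma in VERB_LIST:
            events.append((head.lemma, tok.lemma))
    return events


# Hand-built parse of "Rebels launched an offensive."
sentence = [
    Token("rebel", "nsubj", 1),
    Token("launch", "ROOT", 1),
    Token("an", "det", 3),
    Token("offensive", "dobj", 1),
]
print(find_events(sentence))
```

With real spaCy output, the same fields would come from `token.lemma_`, `token.dep_`, and `token.head`.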
ahalterman / event_model_snippet.py
Created Mar 4, 2018
Managing machine learning experiments
View event_model_snippet.py
# many lines omitted above
def make_log(experiment_dir, X_train, X_test, Y_test, model, hist, custom_model):
    now = datetime.datetime.now()
    now = now.strftime("%Y-%m-%d %H:%M:%S")
    # get last commit hash
    commit = subprocess.check_output(['git', 'rev-parse', 'HEAD']).strip()
    # get precision and recall at a range of cutpoints
    cutoffs = [0.01, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
    precrecs = [precision_recall(X_test, Y_test, model, i) for i in cutoffs]
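The `precision_recall` helper is defined in the omitted part of the gist, so its actual behavior is unknown. As a rough sketch of what computing precision and recall at a range of cutpoints can look like, the function below works directly on predicted probabilities instead of a model. The name `precision_recall_at`, its signature, and the toy scores and labels are all assumptions made for illustration.

```python
def precision_recall_at(scores, labels, cutoff):
    """Precision and recall when scores >= cutoff count as positive.

    Hypothetical helper: takes predicted probabilities and 0/1 labels,
    unlike the gist's precision_recall, which takes a model.
    """
    predicted = [s >= cutoff for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Guard against division by zero when nothing is predicted or present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data, invented for this example
scores = [0.02, 0.15, 0.35, 0.55, 0.90]
labels = [0, 0, 1, 0, 1]
cutoffs = [0.01, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60]

table = {c: precision_recall_at(scores, labels, c) for c in cutoffs}
for cutoff, (prec, rec) in table.items():
    print(f"{cutoff:.2f}  precision={prec:.2f}  recall={rec:.2f}")
```

Logging the whole table rather than a single threshold makes it easier to compare experiments whose best cutpoint differs.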
ahalterman / dw_scraper.py
Created Apr 30, 2017
DW scraper for event data tutorial
View dw_scraper.py
from __future__ import unicode_literals
from bs4 import BeautifulSoup
import requests
import json
import re
import datetime
from pymongo import MongoClient
connection = MongoClient()
View gist:7717afde4f391fc2e99c
### Keybase proof
I hereby claim:
* I am ahalterman on github.
* I am ahalt (https://keybase.io/ahalt) on keybase.
* I have a public key whose fingerprint is 5CEE CCB1 B548 682B B988 B999 952E E2B4 950D A417
To claim this, I am signing this object:
ahalterman / Brazil-GKG.Rmd
Last active Jan 2, 2016
R Markdown for Brazilian Protest Themes in GKG Post
View Brazil-GKG.Rmd
Brazilian Protest Themes in the Global Knowledge Graph
========================================================
Andrew Halterman
Caerus Analytics
January 10, 2014
The Global Knowledge Graph, in the [words of Kalev Leetaru](http://gdeltblog.wordpress.com/2013/10/27/announcing-the-debut-of-the-gdelt-global-knowledge-graph/), aims to "connect every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what’s happening around the world, what its context is and who’s involved, and how the world is feeling about it, every single day." Because GKG takes the form of a network with entities and themes as nodes and co-mentions as edges, the obvious way to work with it is as a network graph using tools from social network analysis. Kalev's [work on Iran](http://www.foreignpolicy.com/articles/2013/11/26/the_tehran_connection_big_data_iran) shows the remarkable ability of automated community detection algorithms to cluster people according to th
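The node-and-edge structure described in this preview (entities and themes as nodes, co-mentions as edges) can be sketched as a weighted edge list. This is a minimal illustration only: the records are invented toy data, not real GKG rows, and `comention_edges` is a hypothetical name.

```python
from collections import Counter
from itertools import combinations


def comention_edges(records):
    """Count how often each pair of entities or themes appears together.

    Each record is a list of names mentioned in one document. The result
    maps an alphabetically ordered pair to its co-mention count.
    """
    edges = Counter()
    for entities in records:
        # Deduplicate and sort so (a, b) and (b, a) share one key
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges


# Toy records, invented for this example
records = [
    ["PROTEST", "Brazil", "Sao Paulo"],
    ["PROTEST", "Brazil"],
    ["Brazil", "Sao Paulo"],
]

edges = comention_edges(records)
for (a, b), weight in sorted(edges.items()):
    print(a, "--", b, weight)
```

An edge list of this shape can be handed to a graph library for community detection or other network analysis.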
ahalterman / gdelt.ortho.r
Created Aug 30, 2013
Script for reproducing map of August 29, 2013 GDELT coverage. Orthogonal map projection, centered on Cairo.
View gdelt.ortho.r
# Author: Andrew Halterman. 30 August 2013
# R script for reproducing map of August 29, 2013 GDELT coverage
# Orthogonal map projection, centered on Cairo
# This assumes that you have your GDELT data stored in a SQLite database.
# For instructions on setting up SQLite and dplyr, see http://gdeltblog.wordpress.com/2013/08/29/subsetting-and-aggregating-gdelt-using-dplyr-and-sqlite/
library(dplyr)
library(RSQLite)
library(RSQLite.extfuns)
ahalterman / subset.domestic.r
Last active Dec 18, 2015
Subsetting GDELT for domestic events using R. I'm looking at domestic activities coded by GDELT, including protests. This is my walkthrough of how I subset only events occurring inside Georgia between 1979 and 2012 in the GDELT reduced dataset. 1. If you just use the Python script to subset the full (reduced) dataset, you end up with only events …
View subset.domestic.r
# Example for subsetting domestic events in Georgia from the GDELT reduced dataset.
# Read in the python output file.
GEO.ALL <- read.table("./R/GDELT/GEO.ALL.select.outfile.txt",sep="\t", header=TRUE)
# The header=T command didn't work, so fix that:
names(GEO.ALL) <- c("Day","Actor1Code","Actor2Code","EventCode","QuadCategory","GoldsteinScale",
"Actor1Geo_Lat","Actor1Geo_Long","Actor2Geo_Lat","Actor2Geo_Long","ActionGeo_Lat","ActionGeo_Long")
# To keep our subsetting function manageable, prep the GEO.ALL dataframe by substringing the first