David Marx dmarx

@dmarx
dmarx / Makefile
Created June 21, 2017 20:31
Minimal working example for a Stack Overflow question
DIRS := $(filter dir%, $(shell ls))
foo_sources := $(wildcard */source/foo.a)
foo_targets_prt := $(patsubst %.a, %.b, $(foo_sources))
foo_targets := $(subst source,target, $(foo_targets_prt))
bar_sources := $(wildcard */source/bar.a)
bar_x := $(patsubst %/bar.a, %/Y.a, $(bar_sources))
bar_y := $(patsubst %/bar.a, %/Z.a, $(bar_sources))
bar_targets := $(bar_x) $(bar_y)
dmarx / simple regression to measure effect of a regime change.r
Last active June 6, 2017 01:12
Simple regression with interaction terms to measure effect of a regime change on the predictors. Implementation of https://stats.stackexchange.com/a/99432/8451
#' ---
#' title: "Regression for quantifying a regime change"
#' author: "David Marx"
#' date: "June 5, 2017"
#' output: html_document
#' ---
#' There are two time points of interest. We want to test the hypothesis that the regression
#' coefficients changed after these time points, respectively. We will accomplish this by introducing
#' dummy variables to denote whether we are before or after a particular change point. This approach
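A minimal Python sketch of the dummy-variable idea described above; it is not the gist's R code. It assumes a single predictor `x` observed at times `t`, and the helper names `design_row` and `least_squares` are hypothetical. Each change point contributes a dummy and a dummy-by-`x` interaction, so the fitted coefficients on those columns estimate the shift in intercept and slope after that point.

```python
def design_row(t, x, change_points):
    """Build one design-matrix row: intercept, x, then for each change
    point a dummy (1 once t >= change point) and a dummy * x interaction."""
    row = [1.0, x]
    for cp in change_points:
        d = 1.0 if t >= cp else 0.0
        row += [d, d * x]
    return row


def least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting.
    Fine for a handful of columns; assumes X has full column rank."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Illustrative, noiseless data with two assumed change points (t = 10, 20).
change_points = [10, 20]
ts = list(range(30))
xs = [(7 * t) % 10 for t in ts]
X = [design_row(t, x, change_points) for t, x in zip(ts, xs)]
true_beta = [1.0, 2.0, 3.0, 0.5, -1.0, 1.0]
y = [sum(b * v for b, v in zip(true_beta, row)) for row in X]
beta_hat = least_squares(X, y)
```

Because each dummy stays at 1 after its change point, the coefficients for the second change point are increments relative to the regime that follows the first one. Testing whether those coefficients differ from zero is what tests for a regime change.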
dmarx / Arxiv Archive.md
Last active April 18, 2019 23:03
Machine learning articles I want to read or have read, mostly arxiv.org articles discussing recent advancements in deep learning.

To Read:

| Publication Date | Article | Notes |
|---|---|---|
| 2016 | End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures | Cited in multi-task sciERC (2018, below) |
| 2018-10-11 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | |
| | Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction | Probably a lot of useful citations in here, not sure we need the coreference stuff.<br>* SciERC datasets: http://nlp.cs.washington.edu/sciIE/<br>* Code: https://bitbucket.org/luanyi/scierc/src/master/<br>* Pretrained (best) models: NER, Coref, Relation |
| 2017-08-08 | [Structural
dmarx / chinese restuarant process.R
Created April 3, 2017 22:56
Demonstration of a Chinese Restaurant Process, with an optional parameter to push the tables towards a uniform distribution rather than a Dirichlet one (i.e. preferential attachment)
# Chinese restaurant process
chinese_restaurant = function(n, uniform=FALSE){
  tables = c(1) # running counts of people at tables. Start by seating first person at their own table
  U = runif(n)
  for (i in 2:n){
    if(U[i]<1/i){
      tables = c(tables, 1)
    } else {
      p = tables/(i) # sum(tables) = i-1
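The preview above is cut off, so here is a minimal Python sketch of a standard Chinese Restaurant Process with concentration parameter 1, matching the `U[i] < 1/i` rule shown. How the gist implements `uniform=TRUE` is not visible; this sketch assumes it means picking an existing table uniformly at random instead of in proportion to its size. The `seed` argument is an addition for reproducibility.

```python
import random


def chinese_restaurant(n, uniform=False, seed=None):
    """Seat n customers and return the list of table sizes.

    Customer i (1-based) opens a new table with probability 1/i.
    Otherwise they join an existing table: proportionally to table size
    (preferential attachment), or, if uniform=True, with equal
    probability per table (an assumed reading of the gist's option).
    """
    rng = random.Random(seed)
    tables = [1]  # the first customer sits alone
    for i in range(2, n + 1):
        if rng.random() < 1 / i:
            tables.append(1)
        elif uniform:
            tables[rng.randrange(len(tables))] += 1
        else:
            k = rng.choices(range(len(tables)), weights=tables)[0]
            tables[k] += 1
    return tables
```

Comparing `sorted(chinese_restaurant(1000), reverse=True)` with the `uniform=True` variant should show a few dominant tables in the first case and more even sizes in the second.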
dmarx / edge_weight_null_distribution.r
Created January 30, 2017 01:37
Simulate null hypothesis distribution for Serrano's disparity filter
generate_distances = function(k){
  u_k = c(0,sort(runif(k-1)),1)
  u_k[-1] - u_k[-(k+1)]
}
iters=1e4
d = c(replicate(iters, generate_distances(2)))
plot(density(d), ylim=c(0,5))
#abline(v=mean(d), lty=2)
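For comparison, a short Python sketch of the same sampling step as `generate_distances`: under the disparity filter's null model, the `k` normalized edge weights of a node are the gaps between `k-1` uniform points on [0, 1]. The `rng` parameter is an addition for illustration.

```python
import random


def generate_distances(k, rng=random):
    """Return the k gaps produced by k-1 uniform cut points on [0, 1].

    The gaps are non-negative and sum to 1, so each one can be read as
    a normalized edge weight under the null hypothesis.
    """
    cuts = [0.0] + sorted(rng.random() for _ in range(k - 1)) + [1.0]
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Pool many draws, as the R snippet does with replicate().
iters = 10_000
d = [gap for _ in range(iters) for gap in generate_distances(2)]
```

Each gap has expected value `1/k`, so the mean of the pooled sample `d` should sit near 0.5 for `k = 2`.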
dmarx / disparity_filter_dt.r
Last active December 19, 2017 12:00
Modified Alessandro Bessi's R implementation of Serrano's Disparity Filter to use the data.table package, yielding orders-of-magnitude faster calculation (1.3 seconds for 543k nodes). Needs to be turned into a pull request or package fork. Original code: https://github.com/alessandrobessi/disparityfilter
#' Extract the backbone of a weighted network using the disparity filter
#'
#' Given a weighted graph, \code{backbone} identifies the 'backbone structure'
#' of the graph, using the disparity filter algorithm by Serrano et al. (2009).
#' @param graph The input graph.
#' @param weights A numeric vector of edge weights, which defaults to
#' \code{E(graph)$weight}.
#' @param directed The directedness of the graph, which defaults to the result
#' of \code{\link[igraph]{is_directed}}.
#' @param alpha The significance level under which to preserve the edges, which
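The preview stops before the function body. As a rough illustration of the disparity filter test from Serrano et al. (2009), here is a plain-Python sketch for an undirected graph; it is not the gist's data.table implementation. The function name `disparity_backbone` and the dict-of-edges input are assumptions, and nodes of degree 1 are simply skipped, which is a simplification.

```python
from collections import defaultdict


def disparity_backbone(edges, alpha=0.05):
    """Keep edges that are significant for at least one endpoint.

    edges: dict mapping (u, v) -> positive weight, undirected.
    For a node of degree k and an edge with normalized weight p
    (weight / node strength), the null-model p-value is (1 - p)**(k - 1).
    """
    strength = defaultdict(float)
    degree = defaultdict(int)
    for (u, v), w in edges.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    kept = {}
    for (u, v), w in edges.items():
        for node in (u, v):
            k = degree[node]
            if k < 2:
                continue  # simplification: no test for degree-1 nodes
            p = w / strength[node]
            if (1 - p) ** (k - 1) < alpha:
                kept[(u, v)] = w
                break
    return kept


# A hub with one heavy edge and nine light ones: only the heavy edge survives.
star = {("hub", "a"): 100.0}
star.update({("hub", f"n{i}"): 1.0 for i in range(9)})
backbone = disparity_backbone(star, alpha=0.05)
```

Lowering `alpha` makes the filter stricter, so fewer edges remain in the backbone.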
dmarx / venn_intersection_text.R
Last active January 12, 2017 22:15
Rough method for drawing labels in the intersections of a Venn diagram drawn using R's `venneuler` package
#install.packages('venneuler')
library(venneuler)
venn_intersection_text = function(venn, classes, label, adjustment=0.5, xadj=0, yadj=0 ){
  # fits a line between the centers of two classes and draws label text at the midpoint of that line + adjustment
  xv = adjustment*venn$centers[classes[1],1] + (1-adjustment)*venn$centers[classes[2],1] + xadj
  yv = adjustment*venn$centers[classes[1],2] + (1-adjustment)*venn$centers[classes[2],2] + yadj
  text(x=xv, y=yv, labels=label)
}
dmarx / undirected_to_directed_bipartite_projection.R
Last active January 3, 2017 20:54
Novel (?) technique for inferring a directed bipartite projection from an undirected bipartite graph. Code is for pedagogical demonstration to accompany the article here: http://dmarx.github.io/map-of-reddit-by-active-users/
library(igraph)
# Experiment parameters
n=10 # Primary class (i.e. subreddits)
m=100 # Secondary class (i.e. users)
threshold = .5 # edge threshold
######################################
set.seed(123)
dmarx / election_rage.py
Last active November 8, 2016 22:10
Get a pulse on the election-relevant conversation on reddit by streaming sentences containing some relevant terms
import praw
import string
import re
import nltk
r = praw.Reddit('anger fuel comment monitor, by /u/shaggorama')
targets = ['hillary', 'trump', 'hilary', 'election']
punc_pat = re.compile('['+string.punctuation+']')
blacklist = ['AutoModerator', '2016VoteBot']
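The rest of the script is not shown. As a hedged sketch of the sentence-filtering step the description mentions, the snippet below works on plain strings and does not touch the Reddit API. The helper `matching_sentences` is hypothetical, and a simple regex split stands in for whatever sentence tokenizer the gist uses.

```python
import re
import string

targets = ['hillary', 'trump', 'hilary', 'election']
# re.escape keeps every punctuation character literal inside the class.
punc_pat = re.compile('[' + re.escape(string.punctuation) + ']')


def matching_sentences(text, terms=targets):
    """Return the sentences of `text` that contain any target term.

    Sentences are split naively on ., ! or ? followed by whitespace;
    matching is case-insensitive and on whole words.
    """
    wanted = set(terms)
    hits = []
    for sentence in re.split(r'(?<=[.!?])\s+', text.strip()):
        words = set(punc_pat.sub(' ', sentence.lower()).split())
        if words & wanted:
            hits.append(sentence)
    return hits


sample = "I like pie. Trump spoke today! What about the election?"
found = matching_sentences(sample)
```

In a streaming setup, each incoming comment body would be passed through a function like this after skipping authors in `blacklist`.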
dmarx / dynamic_edgelist_demo.r
Created September 2, 2016 20:57
Given an edgelist of a dynamic graph in the form of (timestamp, source, target) triplets, construct a compressed edgelist in the form (onset, terminus, source, target)
#' Try to construct a dynamic graph object from an edgelist with sequential timestamps, to use render.d3movie per:
#' https://rpubs.com/kateto/netviz
#'
#install.packages('statnet')
#install.packages("ndtv")
library(igraph)
library(statnet)
library(ndtv)
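The compression described above can be sketched in Python as follows; this is not the gist's R code. It assumes integer timestamps, treats an edge as continuously active across consecutive timestamps, and uses a half-open interval where `terminus` is the last active timestamp plus one. The function name `compress_edgelist` is hypothetical.

```python
from collections import defaultdict


def compress_edgelist(triplets):
    """Collapse (timestamp, source, target) rows into activity spells.

    Returns a sorted list of (onset, terminus, source, target) tuples,
    one per maximal run of consecutive integer timestamps for an edge.
    """
    times = defaultdict(set)
    for t, src, dst in triplets:
        times[(src, dst)].add(t)

    spells = []
    for (src, dst), stamps in times.items():
        ordered = sorted(stamps)
        onset = prev = ordered[0]
        for t in ordered[1:]:
            if t != prev + 1:  # gap: close the current spell
                spells.append((onset, prev + 1, src, dst))
                onset = t
            prev = t
        spells.append((onset, prev + 1, src, dst))
    return sorted(spells)


rows = [(1, 'a', 'b'), (2, 'a', 'b'), (3, 'a', 'b'),
        (5, 'a', 'b'), (2, 'b', 'c')]
spells = compress_edgelist(rows)
```

With real-valued timestamps, the `prev + 1` rule would need to be replaced by an explicit time step or tolerance.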