Prabhas Pokharel prabhasp

@prabhasp
prabhasp / index.Rmd
Last active December 14, 2015 12:18
Power Law Checks from formhub top-100 users
We'll check whether the submission counts of the top 100 formhub users follow a power-law distribution or a log-normal distribution, using guidance from [a blog post by CMU statistics professor Cosma Rohilla Shalizi](http://vserver1.cscs.lsa.umich.edu/~crshalizi/weblog/857.html).
First, load the dataset; I have included it here in case you want to reproduce the analysis. Each value is the number of formhub submissions made by one of the top 100 users.
```{r}
source("~/Downloads/pli-R-v0.0.3-2007-07-25/pareto.R")
source("~/Downloads/pli-R-v0.0.3-2007-07-25/lnorm.R")
source("~/Downloads/pli-R-v0.0.3-2007-07-25/power-law-test.R")
users <- c(220346L, 31099L, 28568L, 16573L, 14862L, 7531L, 6510L, 6138L,
@prabhasp
prabhasp / unnamed-chunk-1.png
Last active December 11, 2015 02:19
Coverage analysis for Nigeria -- generated by Rmarkdown.

KTMJS Talk by @prabhasp

<!DOCTYPE html>
<html>
<head>
<title>Foo</title>
<meta charset='utf-8' />
<meta name='viewport' content='width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0' />
<style type='text/css'>
body {
background:#000;
color:#fff;
@prabhasp
prabhasp / index.Rmd
Last active August 29, 2015 14:00
Indicator Dependencies: R metaprogramming
<link href="http://kevinburke.bitbucket.org/markdowncss/markdown.css" rel="stylesheet">
Finding indicator dependencies
========================================================
I had been meaning to look into R's metaprogramming features for a while now. I finally had a chance today (thanks [Hadley](http://adv-r.had.co.nz/)!), and I used it to experiment with a problem that had been in the back of my mind for a while: finding dependencies within indicator definitions.
Below, I implement a find-dependencies function, which takes a set of indicators and finds the dependencies within it. Indicators are fields within a dataset, some of which are already there, and some of which are newly created. The dependency-finding problem is working out which new indicators derive from which existing ones. We think of these relationships as dependencies: for an indicator such as pupil-to-teacher-ratio (defined as the number-of-pupils divided by the number-of-teachers), pupil-to-teacher-ratio is dependent on number-of-pupils
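The dependency idea described above can be sketched in a few lines of base R. This is only an illustration, not the implementation from this gist: the indicator names and definitions are hypothetical, and it assumes each indicator is written as an unevaluated R expression, so that `all.vars()` can list the variables it refers to.

```r
# Hypothetical indicator definitions, stored as unevaluated R expressions.
indicators <- list(
  pupil_to_teacher_ratio = quote(num_pupils / num_teachers),
  pupils_per_classroom   = quote(num_pupils / num_classrooms),
  overcrowded            = quote(pupils_per_classroom > 40)
)

# For each definition, list the variable names it mentions.
# all.vars() walks the expression and returns the symbols used as variables.
find_dependencies <- function(defs) {
  lapply(defs, all.vars)
}

deps <- find_dependencies(indicators)

# Dependencies that are themselves indicators (derived from other indicators).
derived_deps <- lapply(deps, intersect, names(indicators))

print(deps$pupil_to_teacher_ratio)
print(derived_deps$overcrowded)
```

Here `deps` maps each indicator to the fields it reads, and `derived_deps` keeps only those fields that are indicators defined in the same list, which is one way to separate "already there" fields from newly created ones.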
@prabhasp
prabhasp / PlotDKs.R
Created April 17, 2014 20:41
Plot the percent of don't know per question in two ossap surveys.
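library(stringr)  # str_detect
library(plyr)     # llply, ldply, arrange

# For every column whose name contains "dont", return its non-missing values as character.
get_dk_reason <- function(df) {
  dk <- names(df)[str_detect(names(df), "dont")]
  llply(df[dk], function(x) { as.character(na.exclude(x)) })
}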
plot_percent_dks <- function(dk_list, N) {
  # d will be a data frame of question name and number of "don't know" answers
  d <- ldply(dk_list, length)
  # order the data frame
  d <- arrange(d, V1)
  # divide by N (which is supposed to be the total number of responses)