John Myles White johnmyleswhite

johnmyleswhite / median.R
Created February 23, 2017 00:40
R's Medians as a Rabbit Hole of Type Promotions and Function Indirection
> median(FALSE)
[1] FALSE
> median(c(TRUE, FALSE))
[1] 0.5
> median(c(TRUE, FALSE, TRUE))
[1] TRUE
> f <- factor(c('a', 'b', 'c'), levels = c('a', 'b', 'c'), ordered = TRUE)
johnmyleswhite / lifting_in_julia.md
Created February 14, 2017 17:24
Overview of Lifting in Julia

Handling Nulls in Julia

There are several core cases that need to be handled to ensure that all operations on non-nullable data can be lifted to work on nullable data.

  • The scalar case: lift f(x::T) to f(x::?T).
  • The tuple case: lift f(xs::Tuple{T1, ..., Tn}) to f(xs::Tuple{?T1, ..., ?Tn}).
  • The single array case: lift f(xs::Array{T}) to f(xs::Array{?T}).
  • The multiple array case: lift f(xs::Tuple{Array{T1}, ..., Array{Tn}}) to f(xs::Tuple{Array{?T1}, ..., Array{?Tn}}).

Others may exist, but these are the obvious cases that we absolutely must resolve.
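The four cases above can be sketched in a language-neutral way. The Python below is only an illustration, not the Julia design under discussion: it assumes `None` stands in for a null value, that a null in any input propagates to the output, and that the array cases apply a scalar function element-wise. The helper names (`lift_scalar`, `lift_tuple`, `lift_array`, `lift_arrays`) are hypothetical.

```python
def lift_scalar(f):
    """Scalar case: f(x: T) becomes f(x: Optional[T])."""
    def lifted(x):
        return None if x is None else f(x)
    return lifted


def lift_tuple(f):
    """Tuple case: a null in any slot makes the whole result null."""
    def lifted(xs):
        return None if any(x is None for x in xs) else f(xs)
    return lifted


def lift_array(f):
    """Single array case, ASSUMING f is a scalar function applied element-wise."""
    scalar = lift_scalar(f)
    def lifted(xs):
        return [scalar(x) for x in xs]
    return lifted


def lift_arrays(f):
    """Multiple array case, ASSUMING f combines one element from each array."""
    def lifted(arrays):
        return [
            None if any(x is None for x in row) else f(*row)
            for row in zip(*arrays)
        ]
    return lifted


if __name__ == "__main__":
    print(lift_scalar(abs)(None))
    print(lift_tuple(sum)((1, 2, 3)))
    print(lift_array(abs)([-1, None, 2]))
    print(lift_arrays(lambda a, b: a + b)(([1, 2, None], [10, None, 30])))
```

Null propagation is only one possible semantics. A reduction such as a sum over an array could instead skip nulls, and that choice is exactly what a lifting design has to settle.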

johnmyleswhite / cor_means.jl
Created January 17, 2017 16:47
Sampling distribution of means of uncorrelated vectors
using Distributions
p = 100
d_x = MultivariateNormal(zeros(p), eye(p, p))
d_y = MultivariateNormal(100 * ones(p), 2 * eye(p, p))
n_sims = 10_000
mean_x = Array(Float64, n_sims)
# Ground rules:
#
# (1) Must collect at least 10 observations
# (2) After that, keep collecting observations until
# we satisfy condition: s / sqrt(n) < tolerance
# (3) Then output a tuple of s and n.
using Distributions
function run_simulation()
> binom.test(2, 12)
Exact binomial test
data: 2 and 12
number of successes = 2, number of trials = 12, p-value = 0.03857
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.02086253 0.48413775
sample estimates:
johnmyleswhite / keyword_arguments.jl
Created September 17, 2016 21:45
Keyword Arguments and Method Specialization
julia> function foo(x; y = 0)
           return x + y
       end
foo (generic function with 1 method)
julia> foo(0, y = 1)
1
julia> foo(0, y = 1.0)
1.0
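import Distributions: Binomial
import Statistics: mean, std  # needed on Julia 0.7 and later; these lived in Base before that

# Normal-approximation (CLT) 95% confidence interval for the mean of x,
# clipped to the unit interval [0, 1].
function clt_ci(x)
    m = mean(x)
    s = std(x)
    n = length(x)
    se = s / sqrt(n)  # standard error of the mean
    return max(0.0, m - 1.96 * se), min(1.0, m + 1.96 * se)
end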

Using a 64-bit floating point number (AKA a double), what is the smallest integer n such that each of the following holds?

  • 0.1^n == 0.0
  • 10.0^n == Inf
  • exp(-n) == 0.0
  • exp(n) == Inf

I frequently forget these numbers, so here they are:

  • 0.1^n == 0.0: n = 324
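These thresholds can be recomputed by brute force rather than memorized. The sketch below uses Python, whose `float` is an IEEE 754 double, so the thresholds should carry over. One difference from the expressions above is an assumption worth flagging: Python raises `OverflowError` on overflow instead of returning `Inf`, so the hypothetical helper `is_inf` treats that exception as "reached infinity". The names `first_n` and `is_inf` are illustrative only.

```python
import math


def first_n(predicate, limit=5000):
    """Return the smallest non-negative integer n below limit with predicate(n) true."""
    for n in range(limit):
        if predicate(n):
            return n
    raise ValueError("no n below limit satisfies the predicate")


def is_inf(compute):
    """True if compute() yields infinity or overflows (Python raises instead of returning inf)."""
    try:
        return math.isinf(compute())
    except OverflowError:
        return True


underflow_pow = first_n(lambda n: 0.1 ** n == 0.0)
overflow_pow = first_n(lambda n: is_inf(lambda: 10.0 ** n))
underflow_exp = first_n(lambda n: math.exp(-n) == 0.0)
overflow_exp = first_n(lambda n: is_inf(lambda: math.exp(n)))

if __name__ == "__main__":
    print("0.1^n == 0.0  at n =", underflow_pow)
    print("10.0^n == Inf at n =", overflow_pow)
    print("exp(-n) == 0.0 at n =", underflow_exp)
    print("exp(n) == Inf  at n =", overflow_exp)
```

The underflow thresholds sit beyond the smallest normal double because subnormal numbers extend the representable range before results round to zero.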
import Distributions: LogNormal, Normal, cdf, cquantile
# Fraction of confidence intervals in cis that contain the true value.
function coverage(cis, truth)
    s = 0
    n = length(cis)
    for ci in cis
        s += ci[1] <= truth <= ci[2]
    end
    return s / n
end

Univariate API

A new univariate distribution type should implement all of the following methods:

  • Core constructors
    • MyDistribution{T}(args[...])
    • We need to clarify whether constructors should handle input validation or not. There are use cases in which people want to avoid input validation.
  • params(d::MyDistribution{T})::Tuple: A tuple of the distribution's parameters in our canonical order.
  • minimum(d::MyDistribution{T})::T: The lowest value in the support of MyDistribution.
  • maximum(d::MyDistribution{T})::T: The highest value in the support of MyDistribution.
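As a rough illustration of the methods listed so far, here is a sketch in Python rather than Julia. It is not the Distributions.jl API: the class `UniformSketch`, its `validate` flag, and the method bodies are all hypothetical. The `validate` flag is one possible answer to the open question about input validation, chosen here only to show both behaviors.

```python
class UniformSketch:
    """Hypothetical continuous uniform distribution on [a, b]."""

    def __init__(self, a, b, validate=True):
        # Validation is optional so callers who want to skip it can do so.
        if validate and not a < b:
            raise ValueError("expected a < b")
        self.a = a
        self.b = b

    def params(self):
        """Parameters as a tuple, in a fixed canonical order."""
        return (self.a, self.b)

    def minimum(self):
        """Lowest value in the support."""
        return self.a

    def maximum(self):
        """Highest value in the support."""
        return self.b


if __name__ == "__main__":
    d = UniformSketch(0.0, 2.0)
    print(d.params(), d.minimum(), d.maximum())
```

Passing `validate=False` constructs the object without checking the parameters, which covers the use case where validation cost or strictness is unwanted.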