@mamelara
Created March 30, 2017 17:56
Find the null distribution
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
population = pd.read_csv("https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/femaleMiceWeights.csv")
control = population[population["Diet"] == "chow"]
treatment = population[population["Diet"] == "hf"]
obs = treatment["Bodyweight"].mean() - control["Bodyweight"].mean()
null_distribution = []
# Repeatedly draw two random groups of 12 mice and record the difference in their means.
for i in range(10000):
    control = population.sample(12)
    treatment = population.sample(12)
    null_distribution.append(treatment["Bodyweight"].mean() -
                             control["Bodyweight"].mean())
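def is_greater_than_obs(num):
    return num > obs
def mean(lst):
    return sum(lst) / len(lst)
# The p-value is the fraction of null differences that exceed the observed difference.
p_val = mean([is_greater_than_obs(diff) for diff in null_distribution])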
print("P-value={0}".format(p_val))
plt.hist(null_distribution)
plt.title("Null distribution for female mice")
plt.xlabel("Difference in means")
plt.ylabel("Frequency")
plt.show()