Skip to content

Instantly share code, notes, and snippets.

@wwright999
Created September 15, 2017 21:52
Show Gist options
  • Save wwright999/a04647e43f91d41bbd168e3d036c3dcb to your computer and use it in GitHub Desktop.
Save wwright999/a04647e43f91d41bbd168e3d036c3dcb to your computer and use it in GitHub Desktop.
Example of a resampling permutation test to determine the statistical significance or p-value of an observed difference between the effects of a drug versus a placebo:
# Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
from statistics import mean
from random import shuffle
drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
observed_diff = mean(drug) - mean(placebo)
n = 10000
count = 0
combined = drug + placebo
for i in range(n):
shuffle(combined)
new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
count += (new_diff >= observed_diff)
print(f'{n} label reshufflings produced only {count} instances with a difference')
print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
print(f'hypothesis that there is no difference between the drug and the placebo.')
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment