@dominictarr
Created September 25, 2014 12:58
What is the Chi Squared Test?

The Chi Squared test is used when you want to decide whether a die is fair (random) or not, or for problems that fit that pattern. Sometimes we want to know that some events are equally likely: for example, the probability that a die comes up 6 should be equal to the probability that it comes up 1. Sometimes we want to show the opposite, that two categories are actually different. Say there is a series of races: is the winner of the most races actually better, or was it just a fluke?

Of course, we can roll a die many, many times until we are sure it's fair, but it takes too long to run many races. So for the races we need to calculate whether the random variables (the number of times each participant won) are consistent with pure chance (random, fair) while only looking at a few examples. Depending on the number of participants, how many times do we need to race before we know that the winner is actually faster?

This is when we use the chi squared test.
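
As a rough sketch of the die case: the usual chi squared statistic sums (observed - expected)^2 / expected over the categories, and a large value suggests the die is not fair. The roll counts below are made up for illustration, and the threshold 11.07 is the standard critical value for 5 degrees of freedom at the 5% level (six faces give 5 degrees of freedom).

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for faces 1..6 after 60 rolls (invented data).
rolls = [8, 9, 12, 11, 6, 14]

# A fair die is expected to show each face equally often.
expected = [sum(rolls) / len(rolls)] * len(rolls)

statistic = chi_squared_statistic(rolls, expected)

# Critical value for 5 degrees of freedom at the 5% significance level.
CRITICAL_5DF_5PCT = 11.07

looks_unfair = statistic > CRITICAL_5DF_5PCT
print(statistic, looks_unfair)
```

With these invented counts the statistic is about 4.2, well under the threshold, so 60 rolls like these give no reason to call the die unfair.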

Or, suppose there is a company with an anti-discrimination policy (they intend to hire men and women fairly). They profess to pick each hire fairly, making their decision according only to each individual's suitability for the role. Let's say they have 10 employees, and 7 are men and 3 are women. It seems like they might be hiring more men, but what is the probability that you'd get 7 of one and 3 of the other if you made 10 fair decisions? (Note: if they had 700 men and 300 women, it would be clear they have hired more men.)
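
The hiring question can be sketched in a few lines, assuming each hire is an independent 50/50 decision (that assumption is ours, for illustration). The helper names are hypothetical. The first function counts outcomes exactly with the binomial distribution; the second uses the chi squared statistic with one degree of freedom, whose tail probability can be written with `math.erfc`.

```python
from math import comb, erfc, sqrt


def prob_split_at_least(n, k):
    """Chance that n fair 50/50 picks give k or more of either group.

    Assumes k > n / 2, so the two tails do not overlap.
    """
    tail = sum(comb(n, i) for i in range(k, n + 1))
    return 2 * tail / 2 ** n


def chi_squared_two_groups(a, b):
    """Chi squared statistic and approximate p-value for counts a and b,
    compared against an even split (1 degree of freedom)."""
    expected = (a + b) / 2
    stat = (a - expected) ** 2 / expected + (b - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


exact_small = prob_split_at_least(10, 7)
stat_small, p_small = chi_squared_two_groups(7, 3)
stat_large, p_large = chi_squared_two_groups(700, 300)

print(exact_small)          # 0.34375
print(stat_small, p_small)
print(stat_large, p_large)
```

Under these assumptions a split at least as lopsided as 7 to 3 turns up in about a third of fair runs of 10 hires, so it is weak evidence. The 700 to 300 split has the same ratio but a vanishingly small p-value, which is why the larger sample is convincing. The chi squared p-value for the small sample (roughly 0.2) is only an approximation; with so few hires the exact binomial count is the more trustworthy number.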
