@nwjlyons
Last active April 5, 2021 20:22
Regression to the mean
import random
from collections import Counter


def regression_to_the_mean(n):
    """Return the floored mean of n random integers drawn uniformly from 1 to 10."""
    data = [random.randint(1, 10) for _ in range(n)]
    return sum(data) // len(data)


# n = 10: 50 trials (sample output below; results vary from run to run)
Counter([regression_to_the_mean(10) for _ in range(50)]).most_common()
# >>> [(5, 23), (4, 14), (6, 6), (7, 4), (3, 3)]

# n = 100
Counter([regression_to_the_mean(100) for _ in range(50)]).most_common()
# >>> [(5, 41), (4, 5), (6, 4)]

# n = 1_000
Counter([regression_to_the_mean(1_000) for _ in range(50)]).most_common()
# >>> [(5, 50)]

nwjlyons commented Nov 7, 2020

The more data points in each sample, the less the sample mean varies from trial to trial: with 1,000 points, all 50 floored means come out as 5.
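
To put a number on that, here is a minimal sketch that measures the variance of the sample mean directly instead of counting floored values. It is not part of the gist: `sample_mean`, `variance_of_means`, and `SINGLE_DRAW_VARIANCE` are hypothetical names, and it assumes the same uniform draw from 1 to 10. For independent draws the variance of the mean should be close to the single-draw variance divided by `n`, so each tenfold increase in `n` should cut it roughly tenfold.

```python
import random
import statistics


def sample_mean(n, rng):
    """Return the unfloored mean of n integers drawn uniformly from 1 to 10."""
    return statistics.fmean(rng.randint(1, 10) for _ in range(n))


def variance_of_means(n, trials, rng):
    """Return the sample variance of `trials` independent sample means of size n."""
    return statistics.variance(sample_mean(n, rng) for _ in range(trials))


# Variance of a single draw from the discrete uniform distribution on 1..10.
SINGLE_DRAW_VARIANCE = (10**2 - 1) / 12  # 8.25

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed (an assumption) so runs are repeatable
    for n in (10, 100, 1_000):
        observed = variance_of_means(n, trials=2_000, rng=rng)
        expected = SINGLE_DRAW_VARIANCE / n
        print(f"n={n:>5}  observed={observed:.4f}  expected={expected:.4f}")
```

Using the unfloored mean matters here: the true mean of the draws is 5.5, and the `//` in the gist rounds it down, which is why 5 dominates the counts above.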
