@teoliphant
Last active October 13, 2016 03:49
Find a good linear relationship between data using the 3 Median Method.
My 11-year-old son is learning about regression in his 9th-grade math class.
For them, regression is two tables of data and a calculator button.
The graphing calculators also provide a button to find the best "median-fit" line and
the students were asked to find it as well as the regression line. The regression line can
easily be found with numpy.polyfit(x, y, 1).
I did not know of a function to calculate the best "median-fit" line. I had to review a few
online videos to learn exactly what a best "median-fit" line is, and found the 3-median method
for determining it. It is sometimes called the median-median fit.
I wrote this implementation to be sure that we understood how the calculator median-fit function
was actually working. Someone else might find it useful.
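
For comparison, here is a rough pure-Python sketch of the ordinary least-squares line, which is what numpy.polyfit(x, y, 1) fits (polyfit returns the slope first, then the intercept). The helper name least_squares_line and the sample data are made up for illustration and are not part of the original code.

```python
def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares line.

    Hypothetical helper for illustration; assumes len(x) == len(y) >= 2
    and that the x values are not all identical.
    """
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on y = 2x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
slope, intercept = least_squares_line(x, y)
print(slope, intercept)
```

Because this line is built from means, a single extreme point can shift it noticeably, which is part of the appeal of a median-based fit.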
def median(values):
    svalues = sorted(values)
    N = len(values)
    n, r = divmod(N, 2)
    if r == 1:
        return float(svalues[n])
    else:
        return (svalues[n-1]+svalues[n])/2.0


def medfit(x, y):
    # Note: the grouping below assumes the points are already sorted by x.
    # Create 3 groups
    N = len(x)
    assert N == len(y)
    n, r = divmod(N, 3)
    n1 = n + (1 if r > 0 else 0)
    n2 = n
    n3 = n + (1 if r > 1 else 0)
    assert (n1 + n2 + n3 == N)
    grp1 = x[:n1], y[:n1]
    grp2 = x[n1:n1+n2], y[n1:n1+n2]
    grp3 = x[N-n3:], y[N-n3:]

    # Summarize the three groups with 3 points
    # defined as the median of the x points and the y points
    # in each group.
    pt1 = median(grp1[0]), median(grp1[1])
    pt2 = median(grp2[0]), median(grp2[1])
    pt3 = median(grp3[0]), median(grp3[1])

    # Calculate the slope from the outer summary points
    m = (pt3[1] - pt1[1]) / (pt3[0] - pt1[0])

    # Intercept of that line
    b0 = pt1[1] - m * pt1[0]

    # Correct the intercept using the point 2
    # by moving 1/3 of the way towards point 2
    y2 = m*pt2[0] + b0
    b = b0 + (pt2[1]-y2)/3.0

    # The slope and intercept of the best median-fit line to the data
    return m, b
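
The grouping step only makes sense when the points are ordered by x. As a minimal sketch, assuming Python 3 and the standard-library statistics.median, the same steps can be wrapped so that unsorted input is sorted first. The name medfit_sorted and the sample data are hypothetical and only here for illustration.

```python
from statistics import median


def medfit_sorted(x, y):
    """Hypothetical variant: sort the points by x, then apply the
    same 3-median steps. Needs at least 3 points, and the outer
    groups must have different median x values."""
    pts = sorted(zip(x, y))
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]

    # Same group sizes as above: extra points go to the first group,
    # then to the last group.
    N = len(xs)
    n, r = divmod(N, 3)
    n1 = n + (1 if r > 0 else 0)
    n3 = n + (1 if r > 1 else 0)

    # One summary point (median x, median y) per group
    x1, y1 = median(xs[:n1]), median(ys[:n1])
    x2, y2 = median(xs[n1:N - n3]), median(ys[n1:N - n3])
    x3, y3 = median(xs[N - n3:]), median(ys[N - n3:])

    # Slope from the outer points, intercept through the first point
    m = (y3 - y1) / (x3 - x1)
    b0 = y1 - m * x1
    # Shift the line 1/3 of the way towards the middle point
    b = b0 + (y2 - (m * x2 + b0)) / 3.0
    return m, b


# Made-up, deliberately unsorted data with one outlier at x = 8
x = [9, 1, 5, 3, 7, 2, 8, 4, 6]
y = [11, 2, 8, 3, 10, 4, 30, 6, 7]
m, b = medfit_sorted(x, y)
print(m, b)
```

For this data the three summary points are (2, 3), (5, 7) and (8, 11), so the slope is 8/6, about 1.33, and the intercept is about 0.33. An ordinary least-squares fit of the same nine points gives a slope of 2.15, pulled upward by the outlier at x = 8.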