@ollie
Last active August 29, 2015 14:23
Percentile, mean and deviations.
# A bunch of statistics-related methods.
#
# @example
#   p Math::Stats.percentile(40, [15, 20, 35, 40, 50]) == 29.0
#   p Math::Stats.percentile(75, [1, 2, 3, 4]) == 3.25
#   p Math::Stats.median([15, 20, 35, 40, 50]) == 35.0
#   p Math::Stats.median([1, 2, 3, 4]) == 2.5
#   p Math::Stats.sum([15, 20, 35, 40, 50]) == 160
#   p Math::Stats.mean([15, 20, 35, 40, 50]) == 32.0
#   p Math::Stats.sample_variance([15, 20, 35, 40, 50]) == 207.5
#   p Math::Stats.standard_deviation([15, 20, 35, 40, 50]).round(1) == 14.4
module Math
  module Stats
    module_function

    # Calculate the p-th percentile using linear interpolation between
    # the two closest ranks.
    # https://en.wikipedia.org/wiki/Percentile
    #
    # @example
    #   p Math::Stats.percentile(40, [15, 20, 35, 40, 50]) == 29.0
    #   p Math::Stats.percentile(75, [1, 2, 3, 4]) == 3.25
    #
    # @param p [Numeric] Target percentile, a number from 0 to 100.
    # @param values [Array<Numeric>] Collection of numbers.
    #
    # @return [Numeric, nil] Value below which the given percentage of
    #   observations falls, or nil if +values+ is empty.
    def percentile(p, values)
      # The percentile of an empty collection is undefined.
      return nil if values.empty?

      values = values.sort
      values_size = values.size

      return values.first if values_size == 1
      return values.last if p == 100

      # Fractional rank into the sorted values: the integer part selects
      # the lower neighbour, the remainder weights the interpolation.
      rank = (p / 100.0) * (values_size - 1)
      rank_integer, rank_remainder = rank.divmod(1)
      lower, upper = values[rank_integer, 2]

      lower + rank_remainder * (upper - lower)
    end

    # Calculate the 50th percentile, a.k.a. the median.
    # https://en.wikipedia.org/wiki/Percentile
    #
    # @example
    #   p Math::Stats.median([15, 20, 35, 40, 50]) == 35.0
    #   p Math::Stats.median([1, 2, 3, 4]) == 2.5
    #
    # @param values [Array<Numeric>] Collection of numbers.
    #
    # @return [Numeric, nil] Value below which half of the observations fall.
    def median(values)
      percentile(50, values)
    end

    # Add all elements up.
    #
    # @example
    #   p Math::Stats.sum([15, 20, 35, 40, 50]) == 160
    #
    # @param values [Enumerable] Enumerable with numbers.
    #
    # @return [Numeric]
    def sum(values)
      values.reduce(0, :+)
    end

    # Calculate the arithmetic mean (average) of the elements.
    #
    # @example
    #   p Math::Stats.mean([15, 20, 35, 40, 50]) == 32.0
    #
    # @param values [Array<Numeric>] Collection of numbers.
    #
    # @return [Float] Mean of the values; NaN if +values+ is empty.
    def mean(values)
      sum(values) / values.size.to_f
    end

    # Calculate the sample variance: the sum of squared deviations from
    # the mean, divided by n - 1.
    #
    # @example
    #   p Math::Stats.sample_variance([15, 20, 35, 40, 50]) == 207.5
    #
    # @param values [Array<Numeric>] Collection of at least two numbers.
    #
    # @return [Float]
    def sample_variance(values)
      mean = self.mean(values)
      squared_deviations = values.reduce(0) { |a, e| a + (e - mean)**2 }
      squared_deviations / (values.size - 1).to_f
    end

    # Calculate the sample standard deviation, i.e. the square root of
    # the sample variance.
    #
    # @example
    #   p Math::Stats.standard_deviation([15, 20, 35, 40, 50]).round(1) == 14.4
    #
    # @param values [Array<Numeric>] Collection of at least two numbers.
    #
    # @return [Float]
    def standard_deviation(values)
      Math.sqrt(sample_variance(values))
    end
  end
end
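
The interpolation inside `percentile` can be traced step by step. The sketch below is a standalone illustration and not part of the gist: `interpolation_steps` is a hypothetical helper name, and it assumes a non-empty array and a `p` between 0 and 100. It exposes the fractional rank, the two neighbouring values, and the interpolated result.

```ruby
# Hypothetical helper (not in the gist): returns the intermediate values of
# linear interpolation between closest ranks.
# Assumes +values+ is a non-empty array and 0 <= p <= 100.
def interpolation_steps(p, values)
  sorted = values.sort
  # Fractional position of the percentile within the sorted values.
  rank = (p / 100.0) * (sorted.size - 1)
  # Integer part picks the lower neighbour, remainder is the weight.
  index, fraction = rank.divmod(1)
  lower = sorted[index]
  # When the rank lands on the last element there is no upper neighbour.
  upper = sorted[index + 1] || lower
  {
    rank: rank,
    index: index,
    fraction: fraction,
    lower: lower,
    upper: upper,
    result: lower + fraction * (upper - lower)
  }
end

steps = interpolation_steps(75, [1, 2, 3, 4])
p steps[:rank]   # => 2.25
p steps[:lower]  # => 3
p steps[:upper]  # => 4
p steps[:result] # => 3.25
```

For the 75th percentile of four values the rank is 0.75 * 3 = 2.25, so the result lies a quarter of the way from the third value (3) to the fourth (4).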