tbrosman/7766095
Last active December 30, 2015 02:59
Classify a set of points against a hyperplane formed using the mean and the largest principal component.
# Classify a set of points by:
# 1. Calculating the largest principal component: the eigenvector of the sample covariance matrix that has the largest eigenvalue.
# 2. Defining the classification plane as the affine hyperplane passing through the mean with that eigenvector as its normal.
# 3. Projecting each point onto the normal and adding it to a "positive" or "negative" list depending on whether its projection is at least the mean's projection or falls below it.
require 'matrix'
class PointClassifier
  def self.mean(points)
    return (1.0 / points.length) * points.inject { |total, x| total + x }
  end
  def self.cov_matrix(points)
    # Find the mean of all the input points and calculate the deviations.
    # The local is named `center` so that it does not shadow the `mean` method.
    center = mean(points)
    vectors = points.map { |x| x - center }
    # Create a matrix to accumulate covariance values
    dim = points[0].size
    unscaled_cov_matrix = Array.new(dim) { Array.new(dim, 0.0) }
    # Accumulate the outer products x * x^T, where x is p - mean
    vectors.each do |x|
      (0...dim).each do |i|
        (0...dim).each do |j|
          unscaled_cov_matrix[i][j] += x[i] * x[j]
        end
      end
    end
    # Return the sample covariance matrix. The 1 / (n - 1) scaling needs at least two points.
    k = Matrix.scalar(dim, 1.0 / (points.length - 1))
    return k * Matrix.columns(unscaled_cov_matrix)
  end
  def self.principle_axis(cov_matrix)
    v, d, _v_inv = cov_matrix.eigensystem
    # Columns of v are eigenvectors of cov_matrix, while the diagonal of d holds the corresponding eigenvalues
    dim = cov_matrix.column_size
    largest = (0...dim).max_by { |x| d[x, x].abs }
    # The sign of an eigenvector is arbitrary, so the "positive" and "negative" sides may swap
    axis = v.column(largest)
    return axis
  end
  def self.classify_with_hyperplane(origin, normal, points)
    positive = []
    negative = []
    # Project the origin of the hyperplane onto its normal
    origin_dot_n = normal.inner_product origin
    points.each do |p|
      # Classify each point by comparing its projection with that of the origin
      p_dot_n = normal.inner_product p
      if p_dot_n >= origin_dot_n
        positive << p
      else
        negative << p
      end
    end
    return positive, negative
  end
end
# Test data
points = [[1, 2], [2, 1], [3, 1], [3, 4]].map { |x| Vector.elements x }
cov_matrix = PointClassifier.cov_matrix points
axis = PointClassifier.principle_axis cov_matrix
mean = PointClassifier.mean points
puts "mean = #{mean}"
puts "axis = #{axis}"
positive, negative = PointClassifier.classify_with_hyperplane(mean, axis, points)
puts "positive = #{positive}"
puts "negative = #{negative}"
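
Because an eigenvector is only defined up to sign, the "positive" and "negative" lists above can trade places depending on what the eigensolver returns. The following is a minimal, self-contained sketch of the same three steps that pins the sign down. The helper name `split_by_principal_axis` and the sign convention (flip the axis so that its largest-magnitude component is positive) are assumptions made for illustration and are not part of the original file.

```ruby
require 'matrix'

# Hypothetical helper: split points with the hyperplane through their mean
# whose normal is the dominant eigenvector of the sample covariance matrix.
# Returns [positive, negative]. Needs at least two points.
def split_by_principal_axis(points)
  n = points.length
  center = points.inject(:+) / n.to_f
  deviations = points.map { |p| p - center }

  # Sample covariance as D^T * D / (n - 1), with the deviations as rows of D.
  d = Matrix.rows(deviations.map(&:to_a))
  cov = (d.transpose * d) / (n - 1).to_f

  # Pick the eigenvector that belongs to the largest eigenvalue.
  eig = cov.eigensystem
  values = eig.eigenvalues
  axis = eig.eigenvectors[(0...values.length).max_by { |i| values[i] }]

  # Assumed convention: make the largest-magnitude component positive so the
  # result does not depend on the sign chosen by the eigensolver.
  axis *= -1 if axis.to_a.max_by(&:abs) < 0

  # Points whose projection is at least the mean's projection are "positive".
  threshold = axis.inner_product(center)
  points.partition { |p| axis.inner_product(p) >= threshold }
end

points = [[1, 2], [2, 1], [3, 1], [3, 4]].map { |x| Vector.elements x }
positive, negative = split_by_principal_axis(points)
puts "positive = #{positive}"
puts "negative = #{negative}"
```

With this convention the test data is split by the axis pointing roughly along +y: `[3, 4]` lands on the positive side and the other three points on the negative side. Any other deterministic sign rule would work equally well. The point is only to make the labels reproducible.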