@conradlee
Created November 18, 2011 14:27
Bin (discretize) data points to seed the mean shift clustering method
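import numpy as np
from collections import defaultdict


def bin_points(X, bin_size, min_bin_freq):
    """Return one seed per bin that holds at least min_bin_freq points of X."""
    # Count how many points fall into each bin of side length bin_size.
    bin_sizes = defaultdict(int)
    for point in X:
        # Integer bin index per dimension (the cast truncates toward zero).
        binned_point = (np.asarray(point) / bin_size).astype(np.int32)
        bin_sizes[tuple(binned_point)] += 1
    # Keep well-populated bins; .items() replaces Python 2's .iteritems().
    bin_seeds = np.array([point for point, freq in bin_sizes.items() if freq >= min_bin_freq], dtype=np.float32)
    # Scale the bin indices back to data coordinates.
    bin_seeds = bin_seeds * bin_size
    return bin_seeds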
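
As a rough illustration of how the binning behaves, here is a minimal, self-contained sketch in Python 3. The function body repeats the gist's logic so the snippet runs on its own; the toy array `X` and the chosen `bin_size` and `min_bin_freq` values are made up for the example and are not part of the original gist.

```python
from collections import defaultdict

import numpy as np


def bin_points(X, bin_size, min_bin_freq):
    # Same logic as the gist, repeated so this snippet is self-contained.
    bin_sizes = defaultdict(int)
    for point in X:
        binned_point = (np.asarray(point) / bin_size).astype(np.int32)
        bin_sizes[tuple(binned_point)] += 1
    seeds = [p for p, freq in bin_sizes.items() if freq >= min_bin_freq]
    return np.array(seeds, dtype=np.float32) * bin_size


# Hypothetical toy data: two tight groups plus one isolated point.
X = np.array([
    [0.1, 0.2], [0.3, 0.4], [0.2, 0.9],  # all three land in bin (0, 0)
    [5.1, 5.5], [5.9, 5.2],              # both land in bin (5, 5)
    [9.5, 1.5],                          # alone in bin (9, 1)
])

# Only bins with at least two points survive as seeds.
seeds = bin_points(X, bin_size=1.0, min_bin_freq=2)
print(seeds)
```

With these assumed inputs the isolated point is dropped and one seed remains per populated bin, so a mean shift run would start from two seeds instead of six points. The resulting array could then be handed to a mean shift implementation that accepts initial seeds, such as the `seeds` argument of scikit-learn's `MeanShift`.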