@jseabold
Created April 4, 2012 00:55
cut function for numpy
###########################################
# Note: this requires PR #245 in numpy
# https://github.com/numpy/numpy/pull/245
#
# Author: Skipper Seabold
# License: BSD

import numpy as np


def cut(x, bins, right=True):
    """
    Return indices of half-open bins to which each value of `x` belongs.

    Parameters
    ----------
    x : array-like
        Input array to be binned. It has to be 1-dimensional.
    bins : int or sequence of scalars
        If `bins` is an int, it defines the number of equal-width bins in the
        range of `x`. The range of `x`, however, is extended by 0.1% of its
        width on each side so that the min and max values of `x` fall inside
        the outer bins. If `bins` is a sequence, it defines the bin edges,
        allowing for non-uniform bin widths.
    right : bool
        Indicates whether the bins include the rightmost edge or not. If
        right == True (the default), then the bins [1,2,3,4] indicate
        (1,2], (2,3], (3,4].

    Returns
    -------
    out : ndarray of ints
        Output array of indices, of same shape as `x`.
    """
    # Accept any array-like input; .size, .min() and .max() need an ndarray.
    x = np.asarray(x)
    if not np.iterable(bins):
        if np.isscalar(bins) and bins < 1:
            raise ValueError("`bins` should be a positive integer.")
        if x.size == 0:
            # Handle empty arrays. Can't determine range, so use 0-1.
            x_range = (0, 1)
        else:
            x_range = (x.min(), x.max())
        # Force float arithmetic for the range endpoints.
        mn, mx = [mi + 0.0 for mi in x_range]
        if mn == mx:
            # All values are equal: build a unit-width range around them.
            mn -= 0.5
            mx += 0.5
        bins = np.linspace(mn, mx, bins + 1, endpoint=True)
        # Widen the outer edges by 0.1% of the range, as documented above.
        # Scaling by mn and mx themselves moves the edges the wrong way for
        # negative values and does nothing when an endpoint is 0.
        adj = 0.001 * (mx - mn)
        bins[0] -= adj
        bins[-1] += adj
    else:
        bins = np.asarray(bins)
        if (np.diff(bins) < 0).any():
            raise ValueError('bins must increase monotonically.')
    return np.digitize(x, bins, right)
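
The docstring's two main points, the 0.1% widening of the range and the meaning of `right`, can be checked directly against `np.digitize`. The following is a minimal sketch rather than part of the gist: the sample data and the names `nbins`, `edges`, `padded`, and `pad` are illustrative assumptions, and it needs a numpy version where `np.digitize` accepts `right`.

```python
import numpy as np

# Illustrative data (an assumption, not from the gist).
x = np.array([0.0, 2.5, 5.0, 10.0])
nbins = 4

# Equal-width edges spanning exactly [min, max].
edges = np.linspace(x.min(), x.max(), nbins + 1)

# With right=True each bin is (lo, hi], so the minimum sits on the open
# side of the first bin and gets index 0, i.e. "below all bins".
unpadded_idx = np.digitize(x, edges, right=True)    # [0, 1, 2, 4]

# Widening the outer edges by 0.1% of the range pulls the minimum
# into bin 1 and keeps the maximum in the last bin.
pad = 0.001 * (x.max() - x.min())
padded = edges.copy()
padded[0] -= pad
padded[-1] += pad
right_idx = np.digitize(x, padded, right=True)      # [1, 1, 2, 4]

# With right=False each bin is [lo, hi), so values that sit exactly
# on an interior edge move up one bin.
left_idx = np.digitize(x, padded, right=False)      # [1, 2, 3, 4]

print(unpadded_idx, right_idx, left_idx)
```

The returned indices are 1-based bin numbers: 0 means a value lies below the first edge, and `len(edges)` means it lies above the last.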