@glemaitre
Created March 14, 2017 23:22
import numpy as np
from sklearn.preprocessing import QuantileTransformer
X = np.array([0] * 1 + [0.5] * 7 + [1] * 2).reshape(-1, 1)
qt = QuantileTransformer(n_quantiles=10)
qt.fit(X)
# an undesired behaviour, which frankly should rarely occur,
# is the following
# (transform expects a 2D array, hence the nested lists)
print('0.5 is mapped to {}'.format(qt.transform([[0.5]])))
print('0.499999 is mapped to {}'.format(qt.transform([[0.499999]])))
# the two values are mapped far from each other since 0.5,
# which is repeated many times in X, is mapped to the greater
# of the quantiles it occupies.
# a solution is to add a small amount of noise while computing
# the quantiles, making the operation more stable.
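# note: smoothing_noise is not a parameter of QuantileTransformer
# in released scikit-learn versions; this call assumes a build
# that provides it.
qt = QuantileTransformer(n_quantiles=10, smoothing_noise=1e-7)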
qt.fit(X)
# with the smoothing noise, the two values are expected to be
# mapped close to each other
print('0.5 is mapped to {}'.format(qt.transform([[0.5]])))
print('0.499999 is mapped to {}'.format(qt.transform([[0.499999]])))
# however, this case is unlikely to happen in a real-world dataset,
# which is why we chose None as the default value of the
# smoothing_noise parameter.
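
The jump between 0.5 and 0.499999 described in the comments above can be reproduced without scikit-learn. The following is only a minimal sketch of the mechanism, not the library's implementation: `map_to_reference` is a hypothetical helper, and the sketch assumes that the quantiles are the sorted samples (which holds for linear-interpolation percentiles when the number of quantiles equals the number of samples) and that a tied value is resolved to the last of its quantiles.

```python
from bisect import bisect_right

# Ten samples, as in the script: one 0, seven 0.5 and two 1.
samples = sorted([0.0] * 1 + [0.5] * 7 + [1.0] * 2)
# Assumption: with as many quantiles as samples, the quantiles are the
# sorted samples, paired with evenly spaced reference levels in [0, 1].
quantiles = samples
references = [i / (len(quantiles) - 1) for i in range(len(quantiles))]


def map_to_reference(x, quantiles, references):
    """Hypothetical helper: piecewise-linear map from a value to a level."""
    hi = bisect_right(quantiles, x)  # first index whose quantile is > x
    if hi == 0:
        return references[0]  # x lies below every quantile
    if hi == len(quantiles):
        return references[-1]  # x is at or above the last quantile
    lo = hi - 1  # for a tied value, this is the last of the ties
    frac = (x - quantiles[lo]) / (quantiles[hi] - quantiles[lo])
    return references[lo] + frac * (references[hi] - references[lo])


at_tie = map_to_reference(0.5, quantiles, references)
just_below = map_to_reference(0.499999, quantiles, references)
print('0.5 is mapped to {:.4f}'.format(at_tie))
print('0.499999 is mapped to {:.4f}'.format(just_below))
print('gap: {:.4f}'.format(at_tie - just_below))
```

In this sketch 0.5 lands on the last of its seven tied quantiles, while 0.499999 is interpolated between the quantile at 0 and the first quantile at 0.5, so two nearly identical inputs end up roughly two thirds of the output range apart.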