@msjamali52
Last active September 25, 2019 09:08
Remove Features with Low Variance using VarianceThreshold
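import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

data = pd.DataFrame({'A':[1,0,0,0,0,0,0],'B':[0,1,1,0,1,1,0],'C':[0,0,0,1,1,1,0],'Str1':['l1','l2','l3','l2,l3','l3,l1','l3','l3']})
# Variance is only defined for numeric columns, so fit the selector on those alone.
numeric = data.select_dtypes(include=[np.number])
# For a 0/1 feature Var[X] = p(1 - p); this threshold drops features that take the same value in more than 80% of rows.
selector = VarianceThreshold(threshold=(.8 * (1 - .8)))
selector.fit(numeric)
# Apply the support mask to the numeric columns (not data.columns, whose positions can differ), then re-attach the non-numeric columns.
result = numeric.loc[:, selector.get_support()].join(data.select_dtypes(exclude=[np.number]))
print(result)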
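To see why a column is kept or dropped, the variances can be checked by hand. The following is a minimal sketch using pandas only; the names `variances` and `kept` are illustrative and not part of the original gist. It assumes the population variance (`ddof=0`), which is what `VarianceThreshold` compares against the threshold, and keeps only columns whose variance is strictly greater than it.

```python
import pandas as pd

# Same numeric columns as in the gist above.
data = pd.DataFrame({
    'A': [1, 0, 0, 0, 0, 0, 0],
    'B': [0, 1, 1, 0, 1, 1, 0],
    'C': [0, 0, 0, 1, 1, 1, 0],
})

# Threshold for a 0/1 feature that is constant in 80% of rows: p(1 - p) with p = 0.8.
threshold = 0.8 * (1 - 0.8)

# Population variance per column (ddof=0), not the pandas default sample variance.
variances = data.var(ddof=0)

# Keep only columns whose variance is strictly above the threshold.
kept = variances[variances > threshold].index.tolist()

print(variances)
print(kept)
```

Column `A` is 1 in only one of seven rows, so its variance is (1/7)(6/7) = 6/49, roughly 0.12, which falls below the 0.16 threshold. `B` and `C` both come out at 12/49, roughly 0.24, and are kept.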