@drmingle
Last active April 30, 2018 06:39
DictVectorizer turns lists of mappings (dict-like objects) of feature names to feature values into NumPy arrays or scipy.sparse matrices for use with scikit-learn estimators.
Title: Loading Features From Dictionaries
Author: Damian Mingle
Date: 04/30/2018

Preliminaries

from sklearn.feature_extraction import DictVectorizer

Create A Dictionary

staff = [{'name': 'John Oxboro', 'age': 23.},
         {'name': 'Regina Smith', 'age': 10.},
         {'name': 'Ollie Dyson', 'age': 28.},
         {'name': 'Ian McGrath', 'age': 48.}]

Convert Dictionary To Feature Matrix

# Create an object for our dictionary vectorizer
vec = DictVectorizer()
# Fit then transform the staff dictionary with vec, then output an array
vec.fit_transform(staff).toarray()
array([[23.,  0.,  1.,  0.,  0.],
       [10.,  0.,  0.,  0.,  1.],
       [28.,  0.,  0.,  1.,  0.],
       [48.,  1.,  0.,  0.,  0.]])
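
The `.toarray()` call is needed because `DictVectorizer` returns a SciPy sparse matrix by default. As a minimal sketch, passing `sparse=False` to the constructor should produce a dense NumPy array directly. The variable names `X_sparse` and `X_dense` are illustrative and do not appear in the original.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_extraction import DictVectorizer

staff = [{'name': 'John Oxboro', 'age': 23.},
         {'name': 'Regina Smith', 'age': 10.},
         {'name': 'Ollie Dyson', 'age': 28.},
         {'name': 'Ian McGrath', 'age': 48.}]

# Default (sparse=True): fit_transform returns a scipy.sparse matrix
X_sparse = DictVectorizer().fit_transform(staff)
print(sparse.issparse(X_sparse))

# sparse=False: fit_transform returns a dense NumPy array, no .toarray() needed
X_dense = DictVectorizer(sparse=False).fit_transform(staff)
print(type(X_dense), X_dense.shape)

# Both hold the same values; only the storage format differs
print(np.array_equal(X_sparse.toarray(), X_dense))
```

The sparse default tends to pay off when a string feature has many distinct values, since each value becomes its own mostly-zero column.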

View Feature Names

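# Get feature names
# Note: get_feature_names() was deprecated in scikit-learn 1.0 and removed in 1.2;
# on newer versions, call vec.get_feature_names_out() instead (it returns an array)
vec.get_feature_names()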
['age',
 'name=Ian McGrath',
 'name=John Oxboro',
 'name=Ollie Dyson',
 'name=Regina Smith']
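
The feature names show that the numeric `age` value is passed through unchanged, while each distinct `name` string becomes its own one-hot column. The sketch below pairs each name with its column index using the fitted `feature_names_` and `vocabulary_` attributes, then transforms a record whose name was not seen during fitting. The record `new_staff` is hypothetical data added for illustration.

```python
from sklearn.feature_extraction import DictVectorizer

staff = [{'name': 'John Oxboro', 'age': 23.},
         {'name': 'Regina Smith', 'age': 10.},
         {'name': 'Ollie Dyson', 'age': 28.},
         {'name': 'Ian McGrath', 'age': 48.}]

vec = DictVectorizer(sparse=False)
vec.fit(staff)

# feature_names_ lists the columns in order; vocabulary_ maps name -> column index
for feature in vec.feature_names_:
    print(vec.vocabulary_[feature], feature)

# Hypothetical new record: this name was not present when vec was fitted
new_staff = [{'name': 'Ada Young', 'age': 35.}]

# Unseen feature values are silently ignored, so every name column stays 0
X_new = vec.transform(new_staff)
print(X_new)
```

Because unseen values are dropped rather than raising an error, it can be worth checking for all-zero one-hot groups when transforming new data.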