@claymcleod
Created October 17, 2015 22:16
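import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Load the raw data and replace missing values with 0.
df = pd.read_csv('data.csv')
df = df.fillna(0)

for (_col, _dtype) in zip(df.columns, df.dtypes):
    _dtype_as_str = str(_dtype)
    if 'int' in _dtype_as_str or 'float' in _dtype_as_str:
        # Numeric column: cast to float32, then scale to zero mean and unit variance.
        # StandardScaler expects 2D input, so select with [[_col]] and flatten the result.
        _values = df[[_col]].astype('float32')
        df[_col] = StandardScaler().fit_transform(_values).ravel()
    else:
        # Non-numeric column: encode each distinct value as an integer label.
        # Cast to str first so values filled with 0 sort alongside the strings.
        df[_col] = LabelEncoder().fit_transform(df[_col].astype(str))

df.to_csv("processed_data.csv")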
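
To try the same preprocessing without a `data.csv` on disk, the steps can be wrapped in a function and run on a small in-memory frame. This is only a sketch: the `preprocess` helper and the `age` and `city` sample columns are made up for illustration and are not part of the original script.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler


def preprocess(df):
    """Hypothetical helper: standardize numeric columns, label-encode the rest."""
    out = df.fillna(0)  # returns a new frame; the input is left untouched
    for col in out.columns:
        dtype_name = str(out[col].dtype)
        if 'int' in dtype_name or 'float' in dtype_name:
            # StandardScaler expects a 2D input, hence the double brackets.
            values = out[[col]].astype('float32')
            out[col] = StandardScaler().fit_transform(values).ravel()
        else:
            # Cast to str so every value in the column is comparable.
            out[col] = LabelEncoder().fit_transform(out[col].astype(str))
    return out


# Made-up sample data with one missing numeric value.
sample = pd.DataFrame({
    "age": [20, 30, np.nan, 50],
    "city": ["Oslo", "Lima", "Oslo", "Oslo"],
})

processed = preprocess(sample)
print(processed)
```

Fitting the scaler and encoder inside the loop matches the script above, but it means the fitted parameters are discarded. If the same transformation has to be applied to new data later, the fitted objects would need to be kept, for example in a dict keyed by column name.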