@baatout
Last active September 8, 2018 11:06
Train/test split
from pandas import read_csv
from sklearn.model_selection import train_test_split

# Column names are supplied explicitly via `names` when reading the CSV.
url = "https://raw.githubusercontent.com/baatout/ml-in-prod/master/pima-indians-diabetes.csv"
features = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age']
label = 'label'
dataframe = read_csv(url, names=features + [label])

# Separate the feature matrix from the target column.
X = dataframe[features]
Y = dataframe[label]

# Hold out 33% of the rows for testing; a fixed random_state makes the split reproducible.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_state=42)
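
A minimal sketch of how the split above could be sanity-checked, and of a stratified variant. It is not part of the original gist: the data here is a synthetic stand-in (random values, 300 rows, 35% positive labels, all chosen arbitrarily) so the example runs without downloading the CSV, and `stratify=Y` is an optional `train_test_split` argument that the original call does not use.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV (hypothetical data, same column names).
features = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age']
label = 'label'
rng = np.random.default_rng(0)
n_rows = 300
dataframe = pd.DataFrame(rng.random((n_rows, len(features))), columns=features)
# 105 positives out of 300 (35%), shuffled; an arbitrary imbalance for illustration.
dataframe[label] = rng.permutation(np.array([1] * 105 + [0] * 195))

X = dataframe[features]
Y = dataframe[label]

# stratify=Y keeps the class ratio roughly equal in the train and test partitions.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42, stratify=Y
)

# Repeating the call with the same random_state reproduces the same partition.
X_train2, X_test2, Y_train2, Y_test2 = train_test_split(
    X, Y, test_size=0.33, random_state=42, stratify=Y
)

print("train/test shapes:", X_train.shape, X_test.shape)
print("positive rate (all, train, test):",
      round(Y.mean(), 3), round(Y_train.mean(), 3), round(Y_test.mean(), 3))
print("same split on rerun:", X_test.index.equals(X_test2.index))
```

Stratifying is a design choice rather than a requirement: with an imbalanced label, a purely random split can leave the test set with a noticeably different class ratio than the training set, which makes evaluation metrics harder to interpret.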