@raven4752
Created April 4, 2018 08:20
A function to split a pandas DataFrame into train, validation, and test sets
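# From https://stackoverflow.com/questions/38250710/how-to-split-data-into-3-sets-train-validation-and-test
import numpy as np


def train_validate_test_split(df, train_percent=.6, validate_percent=.2, seed=None):
    # train_percent and validate_percent are fractions in [0, 1]; the remaining rows form the test set.
    np.random.seed(seed)
    # Shuffle the index labels, then cut the shuffled labels at the two boundaries.
    perm = np.random.permutation(df.index)
    m = len(df.index)
    train_end = int(train_percent * m)
    validate_end = int(validate_percent * m) + train_end
    # perm holds index labels, so select rows by label with .loc.
    train = df.loc[perm[:train_end]]
    validate = df.loc[perm[train_end:validate_end]]
    test = df.loc[perm[validate_end:]]
    return train, validate, test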
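
The same three-way split can also be done by row position instead of by index label, which may be safer when the DataFrame's index contains duplicate labels (a label-based lookup would then return every row sharing a label). The sketch below is one possible variant, not part of the original snippet: the function name `split_by_position` and the toy columns `x` and `y` are hypothetical, and it relies on `DataFrame.sample` for shuffling.

```python
import numpy as np
import pandas as pd


def split_by_position(df, train_frac=0.6, validate_frac=0.2, seed=None):
    """Hypothetical helper: shuffle rows, then slice by position."""
    # frac=1 returns every row in random order; random_state makes it reproducible.
    shuffled = df.sample(frac=1, random_state=seed)
    n = len(shuffled)
    train_end = int(train_frac * n)
    validate_end = train_end + int(validate_frac * n)
    # iloc slices by position, so duplicate index labels cannot inflate a split.
    train = shuffled.iloc[:train_end]
    validate = shuffled.iloc[train_end:validate_end]
    test = shuffled.iloc[validate_end:]
    return train, validate, test


# Toy data for illustration only.
df = pd.DataFrame({"x": np.arange(10), "y": np.arange(10) ** 2})
train, validate, test = split_by_position(df, seed=0)
print(len(train), len(validate), len(test))
```

Because the boundaries are computed with `int()`, any rows lost to truncation end up in the test set, so the three parts always cover every row exactly once.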