@jamescalam
Created January 12, 2020 11:23
Example code snippet for Naive Bayes fundamentals article, part [2]
# [2] now split into train/test sets
# (assumes numpy is imported as np and `dataset` is the DataFrame from part [1])
# create a boolean mask that is True for roughly 70% of rows
mask = np.random.rand(len(dataset)) < 0.7
train = dataset[mask]  # rows where the mask is True (about 70% of samples)
test = dataset[~mask]  # the remaining rows (about 30% of samples)
# we also need to split the training data by whether the person earns
# at most 50K or more than 50K
less = train[train['income'] == '<=50K']
more = train[train['income'] == '>50K']
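
The snippet above depends on a `dataset` built in an earlier part of the article. As a minimal, self-contained sketch of the same split, the version below substitutes a hypothetical toy DataFrame (the `age` column, the 1000-row size, the 75/25 class mix, and the seed are all assumptions made for illustration, not the article's data). It uses a seeded generator so the split is reproducible, and it checks that the random mask yields only approximately 70% of rows. The class priors at the end are one plausible use of the per-class split in Naive Bayes, not necessarily the article's next step.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the article's `dataset`
rng = np.random.default_rng(42)
dataset = pd.DataFrame({
    "age": rng.integers(18, 70, size=1000),
    "income": rng.choice(["<=50K", ">50K"], size=1000, p=[0.75, 0.25]),
})

# Boolean mask: each row lands in the training set with probability 0.7,
# so the split is approximately (not exactly) 70/30
mask = rng.random(len(dataset)) < 0.7
train = dataset[mask]
test = dataset[~mask]

# Split the training rows by income class
less = train[train["income"] == "<=50K"]
more = train[train["income"] == ">50K"]

# Class priors estimated from the training set (assumed follow-on step)
prior_less = len(less) / len(train)
prior_more = len(more) / len(train)

print(f"train fraction: {len(train) / len(dataset):.3f}")
print(f"P(<=50K) = {prior_less:.3f}, P(>50K) = {prior_more:.3f}")
```

If an exact 70/30 split matters, shuffling the row indices and slicing at a fixed position would be one alternative to the random mask.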