@jeresuikkila
Last active August 21, 2016 17:48
Splitting an array into training and test sets for machine learning
def train_test_split(total_set, test_set_size = 0.25)
  # Clamp the requested fraction into the range 0.0..1.0.
  if test_set_size > 1.0
    test_set_size = 1.0
  elsif test_set_size < 0
    test_set_size = 0.0
  end

  # Number of elements that go into the test set.
  test_set_count = (total_set.length * test_set_size).floor
  if test_set_count == 0
    raise StandardError, "Test size resulted in a test set of 0. Increase the test size."
  elsif test_set_count == total_set.length
    raise StandardError, "Test size resulted in a training set of 0. Decrease the test size."
  end

  # Shuffle in place (this mutates the caller's array), then split.
  total_set.shuffle!
  # Exclusive range: the test set gets exactly test_set_count elements.
  test_set = total_set[0...test_set_count]
  # Everything after the test set becomes the training set.
  training_set = total_set[test_set_count..-1]
  return training_set, test_set
end
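
The function above shuffles the caller's array in place, so each call gives a different split and reorders the input. A minimal sketch of a variant that leaves the input untouched and can be made reproducible is shown here. The name `seeded_train_test_split`, the `seed` parameter, and the use of `ArgumentError` are assumptions for illustration and are not part of the original gist.

```ruby
# Hypothetical helper, not part of the original gist: a non-mutating,
# optionally seeded variant of the train/test split.
def seeded_train_test_split(total_set, test_set_size = 0.25, seed = nil)
  # Clamp the requested fraction into the range 0.0..1.0.
  fraction = [[test_set_size.to_f, 0.0].max, 1.0].min

  test_set_count = (total_set.length * fraction).floor
  if test_set_count.zero? || test_set_count == total_set.length
    raise ArgumentError, "test_set_size must leave both sets non-empty"
  end

  # A fixed seed yields the same shuffle on every call; nil seeds randomly.
  rng = seed.nil? ? Random.new : Random.new(seed)
  # Array#shuffle returns a new array, so total_set is not modified.
  shuffled = total_set.shuffle(random: rng)

  test_set = shuffled.take(test_set_count)
  training_set = shuffled.drop(test_set_count)
  [training_set, test_set]
end

data = (1..20).to_a
training_set, test_set = seeded_train_test_split(data, 0.25, 42)
puts "training: #{training_set.length}, test: #{test_set.length}"
# prints "training: 15, test: 5"
```

Passing the same seed again returns the same two sets, which helps when comparing models on an identical split.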