Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Splitting an array to training and test sets for machine learning
def train_test_split(total_set, test_set_size = 0.25)
if test_set_size > 1.0
test_set_size = 1.0
elsif test_set_size < 0
test_set_size = 0.0
end
test_set_count = (total_set.length * test_set_size).floor
if test_set_count == 0
raise StandardError, "Test size resulted in a test set of 0. Increase the test size."
elsif test_set_count == total_set.length
raise StandardError, "Test size resulted in a training set of 0. Decrease the test size."
end
total_set.shuffle!
test_set = total_set[0..test_set_count]
training_set = total_set[test_set_count+1..total_set.length]
return training_set, test_set
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment