@andreleoni
Last active July 31, 2020 20:06
Split a CSV into slices
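require "csv"

# Load the whole file into memory; the first line is treated as the header row.
table = CSV.read("all.csv", headers: true)

# Keep only the first row seen for each user_id.
unique_rows = table.uniq { |row| row["user_id"] }

# Write the unique rows out in files of up to 10,000 rows each.
unique_rows.each_slice(10_000).with_index do |rows, index|
  CSV.open("slice-#{index}.csv", "w") do |csv|
    csv << table.headers # repeat the header row in every slice
    rows.each { |row| csv << row }
  end
  print "." # one dot per slice written
end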
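
The script above reads all of all.csv into memory before slicing. For inputs too large for that, a streaming variant is one option. The following is a minimal sketch, not the gist author's method: the helper name split_unique_rows and its keyword arguments (key, slice_size, prefix) are hypothetical, and it assumes the input has a header row. It reads one row at a time with CSV.foreach and tracks seen keys in a Set, so only the keys, not the rows, stay in memory.

```ruby
require "csv"
require "set"

# Hypothetical helper (not part of the original gist): streams input_path row by
# row, keeps the first row seen for each value of `key`, and writes files of up
# to slice_size rows, each starting with the header row.
# Returns the list of file paths written.
def split_unique_rows(input_path, key: "user_id", slice_size: 10_000, prefix: "slice")
  seen = Set.new
  paths = []
  out = nil
  rows_in_slice = 0

  CSV.foreach(input_path, headers: true) do |row|
    next unless seen.add?(row[key]) # Set#add? returns nil for a duplicate

    # Start a new output file for the first row and whenever a slice is full.
    if out.nil? || rows_in_slice == slice_size
      out&.close
      paths << "#{prefix}-#{paths.size}.csv"
      out = CSV.open(paths.last, "w")
      out << row.headers
      rows_in_slice = 0
    end

    out << row
    rows_in_slice += 1
  end

  paths
ensure
  out&.close # close the last open slice, even if reading failed
end
```

Calling split_unique_rows("all.csv") would write slice-0.csv, slice-1.csv, and so on in the current directory, matching the naming used above. The trade-off is that the set of seen keys still grows with the number of distinct user_id values.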