Split a stream into chunks and gzip each chunk at the same time
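#!/usr/bin/env -S gawk -f
# Note: -S lets env split "gawk -f" into separate arguments; on Linux a plain
# "env gawk -f" shebang is looked up as a single program named "gawk -f".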
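BEGIN {
    id = 0;                               # index of the current chunk
    cmd = "gzip -c -2";                   # compressor; -2 favours speed over ratio
    ext = ".gz";
    file = sprintf("%04d%s", id, ext);    # 0000.gz, 0001.gz, ...
    print "Opening new file " file " at " NR " rows";
    count = 1000000;                      # lines per output file
}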
# Pipe every input line through gzip into the current chunk file
{ print | (cmd " > " file) }
# Every `count` lines (1,000,000 here), close the pipe and switch to the next file
NR % count == 0 {
    close(cmd " > " file);
    id = id + 1;
    file = sprintf("%04d%s", id, ext);
    print "Opening new file " file " at " NR " rows";
}
END {
    print "Ending stream at " NR " rows"
    # Any pipe still open is closed automatically when awk exits
}
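
A quick way to check the behaviour is to run a scaled-down copy of the script on a few lines of input. The sketch below is only an illustration under some assumptions: it inlines a trimmed version of the program instead of calling the script file, passes a hypothetical chunk size of 10 via `-v count=10`, uses plain `awk` (nothing in the pipe logic is gawk-specific), and writes into a temporary directory created with `mktemp`.

```shell
#!/bin/sh
# Demo (hypothetical values): split 25 lines into gzip chunks of 10 lines each.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Trimmed copy of the script above; count is passed in rather than hard-coded.
seq 1 25 | awk -v count=10 '
BEGIN {
    id = 0; cmd = "gzip -c -2"; ext = ".gz"
    file = sprintf("%04d%s", id, ext)
}
# Send each line to the gzip pipe for the current chunk
{ print | (cmd " > " file) }
# After every count lines, close the pipe and move to the next chunk
NR % count == 0 {
    close(cmd " > " file)
    id = id + 1
    file = sprintf("%04d%s", id, ext)
}
'

# List the chunks and confirm that no lines were lost
ls
total=$(gzip -dc 0000.gz 0001.gz 0002.gz | wc -l | tr -d ' ')
echo "lines recovered: $total"
```

Because a pipe is only opened by the first `print` sent to it, no empty trailing file is created when the input length is an exact multiple of `count`, even though the "Opening new file" message is still printed.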