@u8sand
Last active November 23, 2019 16:50
Downsample lines in a file; useful for CSVs that are too big.
#!/usr/bin/env python3
'''
Usage:
python downsample.py [offset+]amount
Examples:
cat super_big.csv | python downsample.py 1+4 > big_divided_by_4.csv
cat data.csv | python downsample.py 2 > data_halved.csv
'''
import sys
from itertools import islice
try:
    if len(sys.argv) != 2:
        raise Exception("Accepts one and only one argument")
    # The argument is either "amount" or "offset+amount"
    args = sys.argv[1].split('+')
    if len(args) == 2:
        offset, amount = map(int, args)
    elif len(args) == 1:
        offset, (amount,) = 0, map(int, args)
    else:
        raise Exception("Only one `+` should be specified")
    # Skip `offset` lines, then keep every `amount`-th line from stdin
    for line in islice(sys.stdin, offset, None, amount):
        sys.stdout.write(line)
except Exception as e:
    # Also reached when int() or islice() rejects the argument
    print(e, file=sys.stderr)
    print('Must specify downsample amount ([offset+]amount)', file=sys.stderr)
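
To see which lines survive for a given [offset+]amount, here is a minimal sketch of the same islice call run against an in-memory file. The downsample helper and the sample data are hypothetical, added only for illustration; they are not part of the script above.

```python
from io import StringIO
from itertools import islice


def downsample(lines, amount, offset=0):
    """Hypothetical helper: skip `offset` lines, then keep every `amount`-th line."""
    return islice(lines, offset, None, amount)


# Made-up sample: one header line followed by eight rows.
sample = "header\nr1\nr2\nr3\nr4\nr5\nr6\nr7\nr8\n"

# Equivalent to passing "1+4": start at line index 1, then step by 4.
kept = list(downsample(StringIO(sample), amount=4, offset=1))
print(kept)  # ['r1\n', 'r5\n']

# Equivalent to passing "2": offset defaults to 0, so the header is kept.
halved = list(downsample(StringIO(sample), amount=2))
print(halved)  # ['header\n', 'r2\n', 'r4\n', 'r6\n', 'r8\n']
```

Note that a nonzero offset drops the first line, so for a CSV with a header row, 1+4 removes the header while a plain 4 keeps it.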