@isoboroff
isoboroff / gist:5775326
Created June 13, 2013 16:46
This is a Python script to draw random lines from text files. The key application is files that are much bigger than RAM, where you really don't want to read the entire file. It works by seeking to random offsets in the files, then outputting the next full line. I am concerned that random.randrange(), random.randint(), file.seek(), and fi…
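The seek-then-read-next-line idea from the description can be sketched in isolation as below. This is only a minimal illustration, not the script's own code: the helper name `random_line`, the `rng` parameter, the binary-mode read, and the wrap-around to the first line at end of file are all assumptions made for the example.

```python
import os
import random


def random_line(path, rng=random):
    """Return one line of `path` (as bytes), picked via a random byte offset.

    Hypothetical helper for illustration; not part of the original script.
    """
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("cannot sample from an empty file: %s" % path)
    with open(path, "rb") as f:
        # Jump to a uniformly random byte offset without reading the file.
        f.seek(rng.randrange(size))
        # Discard the line we landed in; it is usually only a fragment.
        f.readline()
        # The next read returns a complete line.
        line = f.readline()
        if not line:
            # We landed in the last line, so wrap around to the first one
            # (an assumption of this sketch).
            f.seek(0)
            line = f.readline()
    return line
```

Because the offset is uniform over bytes rather than over lines, a line is returned with probability proportional to the length of the line before it, so the sample is uniform only when all lines have roughly the same length.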
#!/usr/bin/env python2.7
import os
import random
import sys
import argparse
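# Command-line interface: -n sets the sample size, the remaining arguments are input files.
parser = argparse.ArgumentParser(description='Print random lines from a file')
parser.add_argument('-n', dest='sample_size', type=int, help='number of lines to sample', default=100)
# REMAINDER collects every argument that follows the options as the list of files.
parser.add_argument('files', nargs=argparse.REMAINDER, help='files to read from')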