@smac89
Last active November 29, 2016 05:45
Efficient method to read words from a file.
import itertools


def readwords(file_object):
    # Read one character at a time until read(1) returns '' at end of file,
    # then group consecutive characters by whether they are whitespace.
    byte_stream = itertools.groupby(
        itertools.takewhile(bool,
                            itertools.imap(file_object.read,
                                           itertools.repeat(1))),
        str.isspace)
    # Keep only the non-whitespace groups; each one is a word.
    return ("".join(group) for pred, group in byte_stream if not pred)


# Example usage
import sys

if __name__ == '__main__':
    # read from a user file
    with open(sys.argv[1], 'r') as f:
        for w in readwords(f):
            print(w)

    # read from stdin
    for w in readwords(sys.stdin):
        print(w)
smac89 (author) commented Jun 11, 2016

If you are running this on Python 3, replace itertools.imap with the built-in map. itertools.imap exists only in Python 2; in Python 3, map is already lazy.
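
A minimal sketch of that substitution, assuming Python 3. The name readwords_py3 and the sample text are made up for illustration; io.StringIO stands in for a real file object.

```python
import io
import itertools


def readwords_py3(file_object):
    # Hypothetical Python 3 port of readwords: the built-in map is lazy,
    # so it can wrap the infinite itertools.repeat(1) safely.
    chars = itertools.takewhile(
        bool,  # read(1) returns '' at end of file, which ends the stream
        map(file_object.read, itertools.repeat(1)))
    groups = itertools.groupby(chars, str.isspace)
    # Keep only the non-whitespace runs; each run is one word.
    return ("".join(group) for is_space, group in groups if not is_space)


# io.StringIO behaves like a text file opened for reading.
sample = io.StringIO("read  words\tlazily\nfrom a stream\n")
words = list(readwords_py3(sample))
print(words)
```

Because the result is a generator, words are produced one at a time and the whole file is never held in memory.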

The function assumes words are separated by whitespace characters. By whitespace I mean the characters in string.whitespace, which is what str.isspace checks for on plain ASCII input.
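
To illustrate the separator rule on its own, here is a small sketch that applies the same groupby/str.isspace grouping to an in-memory string instead of a file. The sample text and variable names are assumptions for the example, and it is written for Python 3.

```python
import itertools
import string

# Every character in string.whitespace counts as a separator.
all_separators = all(ch.isspace() for ch in string.whitespace)

# Mixed tabs, newlines, and repeated spaces between words.
text = "alpha\tbeta\n\ngamma  delta\r\n"

# Group consecutive characters by str.isspace and keep the non-space runs.
split_words = [
    "".join(group)
    for is_space, group in itertools.groupby(text, str.isspace)
    if not is_space
]
print(split_words)
```

Runs of consecutive whitespace collapse into a single separator, so empty strings never appear in the result.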
