@cyclecycle
Last active August 18, 2020 14:05
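import gzip

import jsonlines  # pip install jsonlines


def load_jsonlines(path):
    """Lazily yield each object from a .jsonl or gzip-compressed .jsonl.gz file."""
    # Treat any path ending in '.gz' as gzip-compressed; str() also allows pathlib.Path.
    is_gzip = str(path).endswith('.gz')
    if is_gzip:
        # Open the compressed file and hand the stream to a jsonlines reader.
        with gzip.open(path, 'rb') as f:
            reader = jsonlines.Reader(f)
            for obj in reader:
                yield obj
    else:
        with jsonlines.open(path) as reader:
            for obj in reader:
                yield obj


data = load_jsonlines('path/to/file.jsonl.gz')
# `data` is a generator: nothing is read until you iterate over it.
# Optionally cast it to a list to load everything into memory at once:
data = list(data)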
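If installing `jsonlines` is not an option, the same lazy-loading pattern can be approximated with only the standard library. The following is a minimal sketch and not part of the original gist: `iter_jsonl` is a hypothetical name, and it assumes UTF-8 files with one JSON value per line, silently skipping blank lines.

```python
import gzip
import json


def iter_jsonl(path):
    """Lazily yield one parsed JSON value per non-blank line (hypothetical helper)."""
    # Assumption: a '.gz' suffix means the file is gzip-compressed.
    opener = gzip.open if str(path).endswith('.gz') else open
    # Text mode ('rt') makes gzip.open return decoded strings, like plain open().
    with opener(path, 'rt', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Because this is also a generator, records can be handled one at a time. For example, `sum(1 for _ in iter_jsonl(path))` counts the records without holding them all in memory.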