@rindeal
Last active April 19, 2024 09:01
Python 3 function to extract/unpack a TAR archive directly from a web URL in one continuous stream.
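# SPDX-FileCopyrightText: 2018 Jan Chren (rindeal)
# SPDX-License-Identifier: GPL-2.0-only OR GPL-3.0-only

import pathlib
import shutil
import tarfile
import urllib.request  # a bare "import urllib" does not reliably expose urllib.request


def fetch_tar(url: str, dest_dir: pathlib.Path, clean_dest: bool = False, strip_components: int = 0) -> None:
    # Optionally start from an empty destination directory.
    if clean_dest and dest_dir.exists():
        shutil.rmtree(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    with urllib.request.urlopen(url) as resp:
        # 'r|*' reads the response as a forward-only stream and auto-detects compression.
        with tarfile.open(fileobj=resp, mode='r|*') as tar:
            # Iterate over the archive itself; tar.next() returns a single member, not an iterable.
            for tarinfo in tar:
                if strip_components:
                    parts = pathlib.PurePosixPath(tarinfo.name).parts[strip_components:]
                    if not parts:
                        # Nothing is left of the path after stripping, so skip this member.
                        continue
                    # TarInfo.name must be a str, not a Path object.
                    tarinfo.name = str(pathlib.PurePosixPath(*parts))
                tar.extract(tarinfo, path=dest_dir)
                # clean member cache to save on memory usage and processing time wasted on managing this list
                tar.members = []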
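
The streaming and component-stripping logic can be tried without network access. The sketch below is an illustration, not part of the original gist: it builds a small gzip-compressed archive in memory and feeds it through an object that only supports `read()`, standing in for an HTTP response body. The names `strip_leading`, `make_sample_archive`, `ForwardOnlyReader` and `extract_stream` are hypothetical helpers introduced here.

```python
import io
import pathlib
import tarfile
import tempfile
from typing import List, Optional


def strip_leading(name: str, count: int) -> Optional[str]:
    """Drop `count` leading path components; return None if nothing is left."""
    parts = pathlib.PurePosixPath(name).parts[count:]
    return str(pathlib.PurePosixPath(*parts)) if parts else None


def make_sample_archive() -> bytes:
    """Build a tiny .tar.gz in memory (assumed stand-in for a downloaded file)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        top = tarfile.TarInfo("project-1.0")
        top.type = tarfile.DIRTYPE
        top.mode = 0o755
        tar.addfile(top)
        data = b"hello\n"
        info = tarfile.TarInfo("project-1.0/docs/readme.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


class ForwardOnlyReader:
    """Exposes only read() and close(), like a response body that cannot seek."""

    def __init__(self, payload: bytes) -> None:
        self._buf = io.BytesIO(payload)

    def read(self, size: int = -1) -> bytes:
        return self._buf.read(size)

    def close(self) -> None:
        self._buf.close()


def extract_stream(fileobj, dest_dir: pathlib.Path, strip_components: int = 0) -> List[str]:
    """Extract a tar stream member by member and return the extracted names."""
    extracted = []
    # 'r|*' means: forward-only stream, compression detected automatically.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if strip_components:
                new_name = strip_leading(member.name, strip_components)
                if new_name is None:
                    continue  # e.g. the top-level directory entry itself
                member.name = new_name
            tar.extract(member, path=dest_dir)
            extracted.append(member.name)
            tar.members = []  # same member-cache trick as in the gist
    return extracted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = ForwardOnlyReader(make_sample_archive())
        print(extract_stream(source, pathlib.Path(tmp), strip_components=1))
```

Because stream mode never seeks backwards, each member has to be extracted while the iterator is positioned on it, which is why extraction happens inside the loop rather than after collecting the member list.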