@tbnorth
Last active June 14, 2024 12:21
Get list of actual dependencies for a Python project

If your requirements.txt file is the output of pip freeze, then it's really a lock file, not a list of project dependencies. To get just the actual dependencies:

find . -name '*.py' -type f -print0 | xargs -0 sed -En '/^(import|from) / {s/\S+ //; s/ .*//; s/\..*//; p}' | sort -u >reqs

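This extracts foo from all import foo.bar... and from foo.bar import... lines, dropping local imports (starting with '.'). Only imports that start in the first column are matched, so indented imports inside functions or try blocks are missed.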
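A different way to collect the same names is to parse each file with Python's ast module instead of sed. This is only a sketch, not the approach the command above uses; the helper names top_level_imports and project_imports are made up for illustration. Unlike the sed one-liner, it also sees indented imports and handles import a, b.

```python
import ast
from pathlib import Path


def top_level_imports(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import foo.bar, baz" contributes foo and baz
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # "from foo.bar import x" contributes foo;
            # level > 0 marks a relative (local) import, which is skipped
            names.add(node.module.split(".")[0])
    return names


def project_imports(root: str = ".") -> list[str]:
    """Collect imported names from every .py file under root (hypothetical helper)."""
    found = set()
    for path in Path(root).rglob("*.py"):
        try:
            found |= top_level_imports(path.read_text(encoding="utf-8"))
        except (SyntaxError, ValueError):
            # Skip files that cannot be decoded or parsed
            continue
    return sorted(found)
```

Calling print("\n".join(project_imports())) from the project root and redirecting the output to reqs should give a list comparable to the one produced by the shell pipeline.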

To filter out standard library entries:

# testreqs.py
"""Test names on stdin to see if they're importable.

Useful for refining requirements / filtering standard library modules.
"""
import sys
from importlib import import_module

for line in filter(None, map(str.strip, sys.stdin)):
    try:
        import_module(line)
    except ImportError:
        print(line)

Run it against the list; names that fail to import are printed:

python testreqs.py < reqs

Or use Docker for a clean environment, where only the standard library is importable:

docker run --rm -it -v "$PWD":/data python sh -c 'python /data/testreqs.py < /data/reqs'

You may still need to cross-reference the output with your requirements.txt lock file for packages whose names don't match their module name, e.g. git comes from GitPython.

Another approach would be to use pipdeptree, but that will report things that are installed but not used.
