@treethought
Last active May 22, 2018 14:20
pipfetch
import os
import sys
import re
import click
import subprocess

def get_py_files(dir):
    """Yield the path of every .py file under dir, skipping .git and tests."""
    for dirname, dirnames, filenames in os.walk(dir):
        # Editing the 'dirnames' list in place stops os.walk() from recursing
        # into the removed directories. os.walk() already handles recursion,
        # so no explicit recursive call is needed.
        for skipped in ('.git', 'tests'):
            if skipped in dirnames:
                dirnames.remove(skipped)
        for filename in filenames:
            if os.path.splitext(filename)[1] == '.py':
                yield os.path.join(dirname, filename)
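
def scan_file(path):
    """Yield the top-level package named by each import line in a file."""
    with open(path, 'r') as f:
        for line in f:
            # Strip both ends so indented imports are detected too.
            for pkg in parse_line(line.strip()):
                yield pkg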
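
def parse_simple_import(line):
    """Yield top-level package names from an 'import a, b.c as d' line."""
    names = line.split('import ', 1)[1]
    # Split on ',' alone so 'import os,sys' is handled as well.
    for pkg in names.split(','):
        # The first word is the top-level package; this also drops
        # dotted submodules, ' as alias' and a trailing ';'.
        yield re.findall(r'\w+', pkg)[0]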
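
def parse_from_statement(line):
    """Return the top-level package of a 'from x.y import z' line.

    Returns None for relative imports.
    """
    # str.rstrip(' import') strips a set of characters, not a suffix, and
    # would mangle names such as 'attr', so capture the module explicitly.
    match = re.match(r'from\s+(\S+)\s+import\b', line)
    if match is None:
        raise ValueError('not a from-import statement: {}'.format(line))
    package = match.group(1)
    if package.startswith('.'):
        # relative import
        return None
    return package.split('.')[0]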
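
def parse_line(line):
    """Yield the packages imported by one source line, skipping bad lines."""
    try:
        # The trailing space avoids matching names like 'importlib' or 'fromlist'.
        if line.startswith('import '):
            for p in parse_simple_import(line):
                yield p
        elif line.startswith('from '):
            p = parse_from_statement(line)
            if p:  # None means a relative import
                yield p
    except Exception:
        print('Skipping line: {}'.format(line))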
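
@click.command('scan')
@click.argument('dirpath')
def scan_dir(dirpath):
    """Scan DIRPATH for imports and write them to requirements-fetch.txt."""
    packages = set()
    for f in get_py_files(dirpath):
        for p in scan_file(f):
            packages.add(p)
    with open('requirements-fetch.txt', 'w') as f:
        # Drop empty entries and sort for a stable, diff-friendly file.
        for name in sorted(p for p in packages if p):
            f.write(name + '\n')


if __name__ == '__main__':
    scan_dir()
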
@slavakurilyak

Consider the following features to turn this into an open-source project:

Launch of pipenv-scan Package

As a developer, I want to convert the existing module into a Python package, so that other developers can clone any GitHub repository, run pipenv-scan to find the packages it needs, and then run pipenv to install them.
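
The scan-then-install flow could be sketched roughly as follows. This assumes the scanner has already written requirements-fetch.txt (the file name used in the gist) and that pipenv is installed and on PATH; `build_install_command` and `install_scanned` are hypothetical helper names, not part of the gist.

```python
import subprocess


def build_install_command(requirements_path='requirements-fetch.txt'):
    """Return the pipenv command that imports a requirements file."""
    return ['pipenv', 'install', '-r', requirements_path]


def install_scanned(requirements_path='requirements-fetch.txt', dry_run=True):
    """Run the install command, or only return it when dry_run is True."""
    cmd = build_install_command(requirements_path)
    if not dry_run:
        # Assumption: pipenv is available on PATH in the cloned repository.
        subprocess.check_call(cmd)
    return cmd
```

Defaulting to `dry_run=True` is a design choice for this sketch: scanned import names do not always match PyPI package names, so it seems safer to review the file before installing.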

As a developer, I want to run the command pipenv-scan, so that I can recursively scan a folder for every module import.

Rule-Based Parsing for Packages

As a developer, I want to create an ignore list, so that pipenv-scan does not try to install standard-library modules (os, logging, warnings, glob) or improper libraries (pdfminer).
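
One possible shape for such an ignore list is sketched below. `DEFAULT_IGNORE` and `filter_packages` are hypothetical names, and the set holds only a few illustrative entries rather than a complete list of standard-library modules.

```python
# Hypothetical ignore list: a handful of standard-library modules plus the
# package named in the suggestion above. Extend as needed.
DEFAULT_IGNORE = frozenset({
    'os', 'sys', 're', 'logging', 'warnings', 'glob', 'subprocess',
    'pdfminer',
})


def filter_packages(packages, ignore=DEFAULT_IGNORE):
    """Return the sorted, de-duplicated names that are not on the ignore list."""
    # Falsy entries (None, '') are dropped along with ignored names.
    return sorted(p for p in set(packages) if p and p not in ignore)
```

For example, `filter_packages(['os', 'click', 'requests', 'pdfminer'])` returns `['click', 'requests']`. In the gist, a filter like this could be applied to the collected set just before the requirements file is written.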
