#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
A script to extract installed package versions from a Buildout project.

Install
*******

This is a "one man army" script without specific requirements. Just drop the file in
your project and use it.

However, some environments don't have "setuptools" installed globally (either in the
system or in pyenv), in which case it has to be installed first. We recommend the last
compatible version: ::

    pip install 'setuptools==44.0.0'

Require
*******

* Python2.7 (not Python3 compatible but it should not affect working on a Python3
  buildout project);
* setuptools>=7.0,<45 (more recent setuptools may change Distribute behaviors);

Usage
*****

For help, just execute this script with a Python2 interpreter like this: ::

    python dr_eggs.py -h

Or export into a JSON file: ::

    python dr_eggs.py --format=json > eggs.json

Or with a specific filepath to the django script: ::

    python dr_eggs.py --format=json --script ../../foo/bin/django-instance > eggs.json

The long explanation you want to scroll out
*******************************************

Why
---

Because we have a lot of Buildout projects to maintain and since they stand on Python2
and old libraries, many package installs fail on updates because of incompatibilities.
This leads to long searches among libraries to find the compatible ones.

Also, we need a "drop & use" solution to avoid adding new configuration or dependencies
to these old projects. So other solutions like Buildout plugins or third party libraries
are not desired.

What
----

A simple script which relies on some assumptions about our Buildout projects
(configuration, structure, etc.) to take some shortcuts and go straight to our goal.

It parses the Python "binary" script built by Buildout to launch Django. This script
always includes the whole list of installed eggs and can be trusted almost blindly.

The script has been developed for Python2 to be used in our old environments without
any problem.

The parser is pretty naive and won't like syntax other than the forms we know; it also
assumes the first matching variable (at top level) is the right one, no smart selection
here.
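
As a rough illustration of the lookup described above, the sketch below parses a
made-up script sample and pulls out the list assigned to a slice of ``sys.path``. The
sample paths and the ``find_sys_path_list`` helper are hypothetical and not part of this
script; it relies on ``ast.literal_eval`` instead of the per-node representation
methods defined further down, so read it as a simplified sketch rather than the actual
implementation:

```python
import ast

# Hypothetical sample of a Buildout-generated script (paths are made up).
SAMPLE = '''
import sys
sys.path[0:0] = [
    "/home/foo/eggs/Django-1.8.19-py2.7.egg",
    "/home/foo/eggs/pytz-2019.3-py2.7.egg",
    "/home/foo/project",
]
'''


def find_sys_path_list(source):
    # Return the list literal assigned to a slice of ``sys.path`` at top
    # level, or None when no such assignment exists. Slice bounds are not
    # checked here, unlike the exact "sys.path[0:0]" pattern match.
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        target = node.targets[0]
        if (
            isinstance(target, ast.Subscript)
            and isinstance(target.slice, ast.Slice)
            and isinstance(target.value, ast.Attribute)
            and isinstance(target.value.value, ast.Name)
            and target.value.value.id == "sys"
            and target.value.attr == "path"
            and isinstance(node.value, ast.List)
        ):
            # Safe evaluation of the list literal, nothing is executed
            return ast.literal_eval(node.value)
    return None


paths = find_sys_path_list(SAMPLE)
# Only egg paths carry a name and version; other entries are ignored
eggs = [p for p in paths if p.endswith(".egg")]
```

The sample is only parsed, never executed, which is why the paths do not need to exist.
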

Todo
----

* Be safer when the given script filepath is invalid (not Python or whatever syntax)
  with a relevant output message;
* Be safer when the given script filepath does not exist with a relevant output
  message;
* Add a new filepath argument for the develop-dir path where to search for developed
  packages and store them with a special flag. I don't know if we may instead detect
  them from collected eggs;
"""
__version__ = "0.2.0"

import datetime
import json
import os
import ast

from pkg_resources import Distribution


class PythonScriptParser(object):
    """
    A naive Python script parser with ``ast`` module.

    Naive because only a very small set of the abstract syntax is implemented, only
    for the thing we are looking for.

    This builds some syntax representations from methods, since ``ast`` does not
    provide them. We use the representations to get name and value for assignments
    and find the one we are looking for.

    On syntax representation methods almost everything that is not implemented
    will raise a ``NotImplementedError`` exception so your code can catch it to ignore
    code that does not match the relevant subject.

    The subject is to find installed packages from the Python "binary" script of a
    buildout project. Usually, this script contains something like a
    ``sys.path[0:0]`` assignment which contains a list of package paths (to eggs).
    """
    def represent_attribute(self, node):
        """
        Return string content of an ``ast.Attribute``.

        Arguments:
            node (ast.Attribute): The node to inspect.

        Raises:
            NotImplementedError: For every syntax part we don't support.

        Returns:
            string: attribute name
        """
        # Only a simple dotted name like ``sys.path`` is supported
        if (
            not isinstance(node, ast.Attribute)
            or not isinstance(node.value, ast.Name)
        ):
            raise NotImplementedError(
                "Only ast.Attribute on an ast.Name is implemented"
            )

        content = node.value.id

        if hasattr(node, "attr"):
            content = content + "." + node.attr

        return content

    def represent_slice(self, node):
        """
        Return content of an ``ast.Slice``.

        Arguments:
            node (ast.Slice): The node to inspect.

        Raises:
            NotImplementedError: For every syntax part we don't support.

        Returns:
            string: slice content surrounded by brackets
        """
        content = ""

        if getattr(node, "lower") is not None:
            content = str(node.lower.n)

        content = content + ":"

        if getattr(node, "upper") is not None:
            content = content + str(node.upper.n)

        if getattr(node, "step") is not None:
            raise NotImplementedError(
                "Only 'lower' and 'upper' are implemented for ast.Slice, not 'step'"
            )

        return "[" + content + "]"

    def represent_substring(self, node):
        """
        Return content of an ``ast.Subscript``.

        Arguments:
            node (ast.Subscript): The node to inspect.

        Raises:
            NotImplementedError: For every syntax part we don't support.

        Returns:
            string: subscript content
        """
        # Plain names and other targets are not supported, let caller ignore them
        if not isinstance(node, ast.Subscript):
            raise NotImplementedError(
                "Only ast.Subscript is implemented as assignment target"
            )

        content = ""

        content = content + self.represent_attribute(node.value)

        if isinstance(node.slice, ast.Slice):
            content = content + self.represent_slice(node.slice)
        else:
            raise NotImplementedError(
                "Only ast.Slice is implemented for subscript, not ast.Index or other"
            )

        return content

    def represent_str(self, node):
        """
        Return string content of an ``ast.Str``.

        Arguments:
            node (ast.Str): The node to inspect.

        Raises:
            NotImplementedError: For every syntax part we don't support.

        Returns:
            string: string content
        """
        return node.s

    def represent_list(self, node):
        """
        Return items of an ``ast.List``.

        Arguments:
            node (ast.List): The node to inspect.

        Raises:
            NotImplementedError: For every syntax part we don't support.

        Returns:
            list: items
        """
        content = []

        for child in node.elts:
            if isinstance(child, ast.Str):
                content.append(
                    self.represent_str(child)
                )
            else:
                raise NotImplementedError(
                    "Only ast.Str is implemented for ast.List.elts items"
                )

        return content

    def represent_assign(self, node):
        """
        Return string content of an ``ast.Assign`` (variable assignment).

        Arguments:
            node (ast.Assign): The node to inspect.

        Raises:
            NotImplementedError: For every syntax part we don't support.

        Returns:
            tuple: The variable name and its value. The name is a string, the value
            may be anything, but actually only a list or a string is implemented.
        """
        variable_name = ""
        content = None

        if getattr(node, "targets") is not None:
            # Only the first target is inspected. Chained assignments like
            # ``a = b = value`` have several targets, the other ones are ignored.
            variable_name = variable_name + self.represent_substring(node.targets[0])
        else:
            raise NotImplementedError(
                "Only ast.Assign with targets attribute is implemented"
            )

        if getattr(node, "value") is not None:
            if isinstance(node.value, ast.Str):
                content = self.represent_str(node.value)
            elif isinstance(node.value, ast.List):
                content = self.represent_list(node.value)
            else:
                raise NotImplementedError(
                    "Only ast.Str and ast.List are implemented for ast.Assign.value"
                )
        else:
            raise NotImplementedError(
                "Only ast.Assign with value attribute is implemented"
            )

        return variable_name, content

    def seek_for_packages(self, tree, pattern):
        """
        Search top level assignments for the first variable whose name matches the
        given ``pattern`` argument.

        Arguments:
            tree (ast.Node): The node tree to inspect.
            pattern (string): The assignment variable to search.

        Returns:
            tuple: Variable name and value for the searched variable name pattern. Will
            return ``None`` if no variable has been matched for given name pattern.
        """
        for topnode in tree.body:
            if isinstance(topnode, ast.Assign):
                try:
                    name, content = self.represent_assign(topnode)
                except NotImplementedError:
                    # Unsupported syntax is not what we are looking for
                    pass
                else:
                    if name == pattern:
                        return name, content

        return None


class BuildoutPackagesCollector(object):
    """
    Scan a Buildout project and collect every installed package.

    Keyword Arguments:
        sort (boolean): Select if found packages are sorted by their path or not.
            Default is True.

    Attributes:
        SCRIPT_FILEPATH (string): Default script filepath to scan.
        SCRIPT_PACKAGESET_PATTERN (string): Default variable name pattern to search to
            get the package list.
    """
    SCRIPT_FILEPATH = "bin/django-instance"
    SCRIPT_PACKAGESET_PATTERN = "sys.path[0:0]"

    def __init__(self, sort=True):
        self.sort = sort
        self.registry = []

    def store_package(self, distrib):
        """
        Store collected package information from given distribution into the
        registry.

        Each registry item is a dict with package name, version, requirements, egg
        name and develop flag.

        Arguments:
            distrib (pkg_resources.Distribution): A Distribution object.
        """
        self.registry.append({
            "name": distrib.project_name,
            "version": distrib.version,
            "requires": [str(item) for item in distrib.requires()],
            "egg": distrib.egg_name(),
            "develop": False,
        })

    def parse_script_source(self, script_filepath, pattern):
        """
        Parse script to get package list

        Arguments:
            script_filepath (string): File path to the script to scan.
            pattern (string): Variable name pattern to search to get the package list.

        Returns:
            list: A list of packages found in scanned script. ``None`` if no variable
            matched the pattern.
        """
        # Source is given as read to the parser, no encoding conversion is done
        with open(script_filepath, "r") as fp:
            script_content = fp.read()

        parser = PythonScriptParser()
        tree = ast.parse(script_content)

        found = parser.seek_for_packages(tree, pattern)

        # When the parser did not match any variable
        if not found:
            return None

        varname, varcontent = found

        return varcontent

    def as_requirements_file(self, registry):
        """
        Return a requirements.txt file from exact installed egg versions.

        Arguments:
            registry (list): List of package information as collected by the scan
                method.

        Returns:
            string: Package exact versions.
        """
        today = datetime.date.today()
        lines = ["# Frozen versions on {}".format(today)]
        develop_mention = "# Versions from develop eggs or installed packages"
        develop_lines = []

        # First standard eggs
        for item in registry:
            if not item["develop"]:
                lines.append(
                    "{name}=={version}".format(
                        name=item["name"],
                        version=item["version"],
                    )
                )

        # Then develop eggs and libraries (from environment site-packages)
        for item in registry:
            if item["develop"]:
                develop_lines.append(
                    "{name}=={version}".format(
                        name=item["name"],
                        version=item["version"],
                    )
                )

        # Append develop mention with a divider white space
        if len(develop_lines) > 0:
            develop_lines = ["", develop_mention] + develop_lines

        return "\n".join(lines + develop_lines)

    def as_buildout_version(self, registry):
        """
        Return a version.cfg file from exact installed egg versions.

        This is just a wrapper around ``as_requirements_file`` method to replace
        requirements file syntax with the buildout version file.

        Arguments:
            registry (list): List of packages informations as returned from scan
                method.

        Returns:
            string: Package exact versions.
        """
        return self.as_requirements_file(registry).replace("==", " = ")

    def scan(self, script_filepath=None, pattern=None):
        """
        Main method to scan project requirements

        Keyword Arguments:
            script_filepath (string): File path to the script to scan. Default to
                ``BuildoutPackagesCollector.SCRIPT_FILEPATH`` value.
            pattern (string): Variable name pattern to search to get the package list.
                Default to ``BuildoutPackagesCollector.SCRIPT_PACKAGESET_PATTERN``
                value.

        Returns:
            list: Package paths found in the script, empty if nothing matched.
        """
        script_filepath = script_filepath or self.SCRIPT_FILEPATH
        pattern = pattern or self.SCRIPT_PACKAGESET_PATTERN

        # Get packages from parsed script source
        packages = self.parse_script_source(script_filepath, pattern)

        # Nothing to collect when the parser did not match any variable
        if not packages:
            return []

        if self.sort:
            packages = sorted(packages)

        # Collect information about packages
        for pkg in packages:
            if pkg.endswith(".egg"):
                distrib = Distribution.from_filename(pkg)
                self.store_package(distrib)
            else:
                # This can be either a developed package, the django project directory
                # or the Python site-packages dir. Only the develop package would
                # interest us but it needs to be correctly detected and flagged as
                # such when collected.
                pass

        return packages

    def output(self, format="json"):
        """
        Output registry to a convenient format for export.

        This directly uses content from ``BuildoutPackagesCollector.registry``
        attribute.

        Keyword Arguments:
            format (string): Either "json", "pip" or "buildout".

                * "json": Output the full registry as JSON.
                * "pip": A "requirements.txt" file.
                * "buildout": A Buildout versions file.

                Both pip and Buildout formats divide standard eggs from develop
                eggs.

        Returns:
            string: Output content varies depending on given format argument.
        """
        if format == "pip":
            return self.as_requirements_file(self.registry)
        elif format == "buildout":
            return self.as_buildout_version(self.registry)

        return json.dumps(self.registry, indent=4)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(
        description=(
            "Scan and output versions from a Buildout project installed packages."
        ),
    )
    parser.add_argument(
        "--script",
        default=None,
        help=(
            "Give a custom filepath for script source with package list. "
            "Default to '%s'."
        ) % BuildoutPackagesCollector.SCRIPT_FILEPATH
    )
    parser.add_argument(
        "--pattern",
        default=None,
        help=(
            "Variable name pattern to search for variable with the package list. "
            "(Remember to quote it and escape possible special characters) "
            "Default to '%s'."
        ) % BuildoutPackagesCollector.SCRIPT_PACKAGESET_PATTERN
    )
    parser.add_argument(
        "--format",
        choices=["json", "pip", "buildout"],
        default="json",
        help="Choose the format you want to export. Default to 'json'."
    )
    parser.add_argument(
        "--no-develop",
        action="store_true",
        help="Disable scanning develop eggs and libraries.",
    )

    args = parser.parse_args()

    b = BuildoutPackagesCollector()
    # Forward both optional arguments, defaults apply when they are not given
    found = b.scan(
        script_filepath=args.script,
        pattern=args.pattern,
    )

    # Check results
    if not found:
        print("/!\\ No match for given script.")
    else:
        print(b.output(args.format))