Analysis of PyPI package names and their use of dashes, underscores, and upper and lower case
try:
    import xmlrpclib
except ImportError:
    import xmlrpc.client as xmlrpclib
client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
packages = client.list_packages()
total = len(packages)
dashes = len([x for x in packages if '-' in x])
underscores = len([x for x in packages if '_' in x])
both = len([x for x in packages if '_' in x and '-' in x])
neither = len([x for x in packages if '_' not in x and '-' not in x])
alllower = len([x for x in packages if x == x.lower()])
allupper = len([x for x in packages if x == x.upper()])
mixed = len([x for x in packages if x != x.upper() and x != x.lower()])
print("Total packages : {}".format(total))
print("Packages with dashes : {} ({:.2%})".format(dashes, float(dashes) / total))
print("Packages with underscores : {} ({:.2%})".format(underscores, float(underscores) / total))
print("Packages with both dashes and underscores : {} ({:.2%})".format(both, float(both) / total))
print("Packages with neither dashes nor underscores : {} ({:.2%})".format(neither, float(neither) / total))
print("Packages with all lowercase characters : {} ({:.2%})".format(alllower, float(alllower) / total))
print("Packages with all uppercase characters : {} ({:.2%})".format(allupper, float(allupper) / total))
print("Packages with both lower and upper case characters : {} ({:.2%})".format(mixed, float(mixed) / total))
Total packages : 87680
Packages with dashes : 23370 (26.65%)
Packages with underscores : 7750 (8.84%)
Packages with both dashes and underscores : 133 (0.15%)
Packages with neither dashes nor underscores : 56693 (64.66%)
Packages with all lowercase characters : 76036 (86.72%)
Packages with all uppercase characters : 439 (0.50%)
Packages with both lower and upper case characters : 11213 (12.79%)
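The count of names with neither a dash nor an underscore can be cross-checked from the other figures by inclusion-exclusion. A small sketch that uses only the numbers reported above and assumes Python 3 division; the variable names are just for illustration:

```python
# Figures copied from the reported output.
total = 87680
dashes = 23370
underscores = 7750
both = 133

# Inclusion-exclusion: names containing at least one of the two characters.
either = dashes + underscores - both
# Every remaining name contains neither character.
neither = total - either

print("either  : {} ({:.2%})".format(either, either / total))
print("neither : {} ({:.2%})".format(neither, neither / total))
```

Names counted in both the dash and the underscore buckets are subtracted once so they are not double counted.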

gene1wood commented Aug 31, 2016

Package names should use dashes, not underscores.
Module names must not contain dashes (a dash is not valid in a Python identifier) but may contain underscores.
https://www.python.org/dev/peps/pep-0008/#package-and-module-names
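The module-name restriction follows from Python's identifier rules: a dash parses as the minus operator, so it cannot appear in a name used with `import`. A minimal sketch, assuming Python 3 (`str.isidentifier` does not exist in Python 2); `to_module_name` is a hypothetical helper, not part of any library:

```python
def to_module_name(package_name):
    """Hypothetical helper: derive an importable name from a dashed package name."""
    return package_name.replace("-", "_")


# A dashed name is not a valid identifier; the underscored form is.
for name in ["my-package", "my_package"]:
    print(name, name.isidentifier())

print(to_module_name("my-package"))
```

This is why a project can be published under a dashed name while the module it installs is imported with underscores.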


Dominik1123 commented Apr 27, 2017

By adding

print("No letters at all: ", set([x for x in packages if x == x.lower()]) & set([x for x in packages if x == x.upper()]))

you'll find the following beauties

{'1337', '42', '5', '3-1', '1', '2', '2112', '0-._.-._.-._.-._.-._.-._.-0'}

Of these, 0-._.-._.-._.-._.-._.-._.-0 is definitely my favorite :) The "2112" package is also worth a look! Strangely enough, the "2" package seems to be unavailable even though PyPI apparently listed it. Any ideas?
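For what it's worth, these names land in both the all-lowercase and the all-uppercase bucket because `lower()` and `upper()` leave a string without cased letters unchanged. A small offline sketch; `has_no_cased_letters` and the sample list are made up for illustration, and the totals are copied from the gist's output, which may come from a different run than the list above:

```python
def has_no_cased_letters(name):
    # Hypothetical helper: True when lower() and upper() both leave the name unchanged.
    return name == name.lower() and name == name.upper()


sample = ["1337", "3-1", "0-._.-._.-._.-._.-._.-._.-0", "Django", "requests", "PIL"]
no_letters = [n for n in sample if has_no_cased_letters(n)]
print(no_letters)

# Cross-check with the reported totals: the three case buckets cover every
# name, and only the lower/upper buckets can overlap, so the excess over
# the total is the number of names with no cased letters.
alllower, allupper, mixed, total = 76036, 439, 11213, 87680
overlap = alllower + allupper + mixed - total
print(overlap)
```

The overlap implied by the reported totals is 8, the same as the number of names in the set above.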
