@gene1wood
Last active June 12, 2023 02:24
Analysis of PyPI package names and the use of dashes, underscores, upper and lower case
try:
    import xmlrpclib  # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3

# Fetch the full list of package names from PyPI's XML-RPC interface.
client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
packages = client.list_packages()
total = len(packages)

# Count names by separator usage.
dashes = len([x for x in packages if '-' in x])
underscores = len([x for x in packages if '_' in x])
both = len([x for x in packages if '_' in x and '-' in x])
neither = len([x for x in packages if '_' not in x and '-' not in x])

# Count names by letter case. A name with no letters at all equals both its
# lower() and upper() form, so it is counted as all lowercase AND all uppercase.
alllower = len([x for x in packages if x == x.lower()])
allupper = len([x for x in packages if x == x.upper()])
mixed = len([x for x in packages if x != x.upper() and x != x.lower()])

# float() keeps the division exact under Python 2 as well.
print("Total packages : {}".format(total))
print("Packages with dashes : {} ({:.2%})".format(dashes, float(dashes) / total))
print("Packages with underscores : {} ({:.2%})".format(underscores, float(underscores) / total))
print("Packages with both dashes and underscores : {} ({:.2%})".format(both, float(both) / total))
print("Packages with neither dashes nor underscores : {} ({:.2%})".format(neither, float(neither) / total))
print("Packages with all lowercase characters : {} ({:.2%})".format(alllower, float(alllower) / total))
print("Packages with all uppercase characters : {} ({:.2%})".format(allupper, float(allupper) / total))
print("Packages with both lower and upper case characters : {} ({:.2%})".format(mixed, float(mixed) / total))

Total packages : 87680
Packages with dashes : 23370 (26.65%)
Packages with underscores : 7750 (8.84%)
Packages with both dashes and underscores : 133 (0.15%)
Packages with neither dashes nor underscores : 56693 (64.66%)
Packages with all lowercase characters : 76036 (86.72%)
Packages with all uppercase characters : 439 (0.50%)
Packages with both lower and upper case characters : 11213 (12.79%)
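
As a rough sanity check on the figures above, the counts can be cross-checked with inclusion-exclusion. This is only a sketch: the numbers are copied from the output, and the variable names `neither_derived` and `counted_twice` are illustrative, not part of the original script.

```python
# Figures copied from the output above.
total = 87680
dashes = 23370
underscores = 7750
both = 133
alllower = 76036
allupper = 439
mixed = 11213

# Inclusion-exclusion: names with a dash or an underscore (or both),
# subtracted from the total, gives the names with neither separator.
with_separator = dashes + underscores - both
neither_derived = total - with_separator

# The three case buckets overlap: a name without any letters equals both
# its lower() and upper() form, so it lands in alllower and allupper.
counted_twice = alllower + allupper + mixed - total

print("Neither dashes nor underscores : {}".format(neither_derived))
print("Names counted as both all-lower and all-upper : {}".format(counted_twice))
```

The second figure is the number of names that contain no letters at all, which is why the three case percentages add up to slightly more than 100%.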
@gene1wood
Author

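Package names should use dashes, not underscores.
Module names must not contain dashes (a dash is not valid in a Python identifier, so such a module cannot be imported with a plain import statement) but may contain underscores.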
https://www.python.org/dev/peps/pep-0008/#package-and-module-names
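
A minimal sketch of why dashes don't work in module names, assuming Python 3 (for `str.isidentifier()`). The helpers `is_importable_name` and `to_module_name` are hypothetical names made up for this example, and swapping dashes for underscores is just one common convention, not a rule taken from PEP 8.

```python
import keyword


def is_importable_name(name):
    """Return True if `name` could appear in a plain `import name` statement."""
    # A module name must be a valid identifier and must not be a reserved word.
    return name.isidentifier() and not keyword.iskeyword(name)


def to_module_name(package_name):
    """Hypothetical convention: derive a module name by swapping dashes for underscores."""
    return package_name.replace("-", "_")


for candidate in ("my-package", "my_package", to_module_name("my-package")):
    print("{!r:14} importable: {}".format(candidate, is_importable_name(candidate)))
```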

Dominik1123 commented Apr 27, 2017

By adding

print("No letters at all: ", set([x for x in packages if x == x.lower()]) & set([x for x in packages if x == x.upper()]))

you'll find the following beauties

{'1337', '42', '5', '3-1', '1', '2', '2112', '0-._.-._.-._.-._.-._.-._.-0'}

amongst which 0-._.-._.-._.-._.-._.-._.-0 is definitely my favorite :) The "2112" package is also worth a look! Strangely enough, the "2" package seems to be unavailable although it was apparently listed by PyPI. Any ideas?
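
The same set can be found in a single pass instead of intersecting two lists. This is a sketch only: `names_without_letters` is a made-up helper, and `sample` is a small hand-written list that stands in for the real `packages` list so the snippet runs without network access.

```python
def names_without_letters(names):
    """Return the names that equal both their lower() and upper() form."""
    # The chained comparison is the same test as the intersection above:
    # x == x.lower() and x == x.upper().
    return {n for n in names if n == n.lower() == n.upper()}


# Stand-in for `packages`; a few ordinary names plus some from the list above.
sample = ["requests", "Django", "PIL", "1337", "42", "3-1",
          "0-._.-._.-._.-._.-._.-._.-0"]

print("No letters at all:", sorted(names_without_letters(sample)))
```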
