Skip to content

Instantly share code, notes, and snippets.

@idlesign
Created February 16, 2018 04:29
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
Star You must be signed in to star a gist
Save idlesign/541802b3b3c0eff6cdf04ea08613ffcc to your computer and use it in GitHub Desktop.
This will produce an SQL to query Google BigQuery for a certain author packages downloads monthly statistics.
"""This will produce an SQL to query Google BigQuery
for a certain author packages downloads monthly statistics.
"""
try:
import xmlrpclib
except ImportError:
import xmlrpc.client as xmlrpclib
AUTHOR = 'idlesign'
DATE_FILTER = '2018*'
SQL = '''
SELECT file.project AS project, FORMAT_TIMESTAMP("%%Y-%%m", timestamp) AS month, COUNT(*) AS downloads
FROM `the-psf.pypi.downloads%(date)s`
WHERE file.project IN (%(projects)s)
GROUP BY project, month
ORDER BY project, month
'''
rpc = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
packages = sorted((pkg for role, pkg in rpc.user_packages(AUTHOR)))
sql = SQL % {
'projects': '"%s"' % '", "'.join(packages),
'date': DATE_FILTER,
}
print(sql.replace('\n', ' ').strip())
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment