@thewtex
Forked from juanpabloaj/README.md
Last active January 19, 2018 21:33
Total pip package downloads, separated by Python version

From September 26, 2017 to November 30, 2017.

Python versions from 2.6 to 3.6

[Figure: downloads by version]

-- Daily downloads per Python major version (BigQuery legacy SQL).
-- Dataset: https://bigquery.cloud.google.com/dataset/the-psf:pypi
SELECT
  CONCAT(
    DATE(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3])\.[0-9]\.')
  ) AS date_python,
  COUNT(details.python) AS downloads
FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
  TIMESTAMP('2016-06-26'),
  TIMESTAMP('2016-08-31')))
GROUP BY date_python
-- Daily downloads per Python minor version (BigQuery legacy SQL).
-- Dataset: https://bigquery.cloud.google.com/dataset/the-psf:pypi
-- Example table: https://bigquery.cloud.google.com/table/the-psf:pypi.downloads20160903
SELECT
  CONCAT(
    DATE(timestamp), '_', REGEXP_EXTRACT(details.python, r'^([2-3]\.[0-9])\.')
  ) AS date_python,
  COUNT(details.python) AS downloads
FROM (TABLE_DATE_RANGE([the-psf:pypi.downloads],
  TIMESTAMP('2017-09-26'),
  TIMESTAMP('2017-11-30')))
GROUP BY date_python
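
To see what the two queries above group on, here is a minimal Python sketch of the same key construction using the standard `re` module. The helper name `date_python_key` and the sample version strings are hypothetical, chosen only for illustration. The patterns here escape the dots so that `.` matches only a literal period. Treat this as an approximation of the SQL behaviour, not a replacement for running the query.

```python
import re

# Patterns modelled on the two queries, with the dots escaped.
MAJOR = re.compile(r'^([2-3])\.[0-9]\.')    # captures '2' or '3'
MINOR = re.compile(r'^([2-3]\.[0-9])\.')    # captures e.g. '2.7' or '3.6'


def date_python_key(date, python_version, pattern):
    """Build the 'date_python' grouping key (hypothetical helper).

    Returns None when the version string does not match, mirroring SQL,
    where a NULL argument makes the concatenation NULL.
    """
    match = pattern.search(python_version or '')
    if match is None:
        return None
    return '{}_{}'.format(date, match.group(1))


# Made-up sample values standing in for details.python
print(date_python_key('2017-09-26', '2.7.13', MAJOR))
print(date_python_key('2017-09-26', '3.6.3', MINOR))
print(date_python_key('2017-09-26', '1.17', MINOR))
```

Downloads whose version string does not match the pattern end up together under an empty key, so the exported CSV may contain one such row per query.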
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Plot daily downloads per Python minor version from the CSV exported by BigQuery.
import pandas as pd
import matplotlib.pyplot as plt

ts = pd.read_csv('download_python_version_by_day.csv')

# Split the 'date_python' key (e.g. '2017-09-26_2.7') into 'date' and 'python' columns.
dp = ts['date_python'].str.extract(r'(?P<date>.*?)_(?P<python>.*)', expand=True)
ts = ts.join(dp)
ts['date'] = pd.to_datetime(ts['date'])

# One row per date, one column per Python version.
df = ts.pivot(index='date', columns='python', values='downloads')

# [1:] skips the first row of the pivoted table.
df[['2.6', '2.7', '3.1', '3.2', '3.3', '3.4', '3.5', '3.6']][1:].plot()
plt.savefig('python_downloads_by_version_by_day.png')
plt.show()
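
The reshaping that `str.extract` and `pivot` perform in the script above can be sketched with the standard library alone, which may help when checking the CSV by hand. The function name `pivot_downloads` and every number in `SAMPLE_CSV` are made up for illustration. Only the column names `date_python` and `downloads` come from the query.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the shape the query exports: date_python,downloads
SAMPLE_CSV = """date_python,downloads
2017-09-26_2.7,100
2017-09-26_3.6,40
2017-09-27_2.7,110
2017-09-27_3.6,45
"""


def pivot_downloads(csv_text):
    """Return {date: {python_version: downloads}} (hypothetical helper)."""
    table = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row['date_python']:
            continue  # skip rows whose version did not match the regex
        # Split on the first underscore only: '2017-09-26_2.7' -> date, version
        date, python = row['date_python'].split('_', 1)
        table[date][python] = int(row['downloads'])
    return dict(table)


pivoted = pivot_downloads(SAMPLE_CSV)
print(pivoted['2017-09-27']['3.6'])
```

Each outer key corresponds to a row of the pandas pivot table and each inner key to one of its columns.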
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Plot daily downloads per Python major version on a log scale.
# Expects a CSV with 'date', 'python' and 'downloads' columns.
import pandas as pd
import matplotlib.pyplot as plt

ts = pd.read_csv('download_python_major_version_by_day.csv')
ts['date'] = pd.to_datetime(ts['date'])

# One row per date, one column per major version (read as the integers 2 and 3).
df = ts.pivot(index='date', columns='python', values='downloads')
ax = df[[2, 3]].plot(logy=True, figsize=(12, 9))
ax.set_ylabel('downloads (log scale)')
ax.set_title('Python package downloads')
plt.show()