@dmtucker
Last active April 25, 2020 04:07
What's up with Python 1.17 in the PyPI stats database?

dmtucker commented Mar 4, 2018

What is 1.17 used for?

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 365 --limit 50 --where 'details.python = "1.17"' '' project
Served from cache: False
Data processed: 138.32 GiB
Data billed: 138.32 GiB
Estimated cost: $0.68

| project              | download_count |
| -------------------- | -------------- |
| kudu-python          |         28,210 |
| kazoo                |         26,628 |
| pytest               |         24,272 |
| pytest-xdist         |         24,132 |
| py                   |         23,960 |
| cryptography         |         23,954 |
| cffi                 |         23,803 |
| impyla               |         23,698 |
| setuptools-scm       |         23,033 |
| pycparser            |         19,441 |
| scandir              |         19,173 |
| azure-datalake-store |         19,136 |
| virtualenv           |         16,072 |
| requests             |         11,126 |
| jinja2               |         10,509 |
| botocore             |          9,708 |
| paramiko             |          9,315 |
| fabric               |          8,901 |
| sh                   |          8,796 |
| ordereddict          |          8,696 |
| ecdsa                |          8,686 |
| pycrypto             |          8,658 |
| flask                |          8,639 |
| cython               |          8,620 |
| six                  |          8,543 |
| simplejson           |          8,498 |
| docutils             |          8,421 |
| boto3                |          8,398 |
| jmespath             |          8,266 |
| futures              |          8,260 |
| numpy                |          8,133 |
| pyparsing            |          8,112 |
| python-dateutil      |          8,045 |
| psutil               |          8,005 |
| markupsafe           |          7,943 |
| werkzeug             |          7,925 |
| itsdangerous         |          7,903 |
| pbr                  |          7,882 |
| ipython              |          7,881 |
| pytest-random        |          7,869 |
| pexpect              |          7,836 |
| argparse             |          7,742 |
| thrift               |          7,741 |
| sqlparse             |          7,734 |
| readline             |          7,719 |
| pg8000               |          7,713 |
| docopt               |          7,704 |
| hdfs                 |          7,701 |
| allpairs             |          7,698 |
| cm-api               |          7,697 |

dmtucker commented Mar 5, 2018

How long has 1.17 been showing up?

dtucker@dtucker-wkstn:~ $ pypinfo --days 1095 --limit 36 --order download_month --where 'details.python = "1.17"' '' month
Served from cache: False
Data processed: 186.07 GiB
Data billed: 186.07 GiB
Estimated cost: $0.91

| download_month | download_count |
| -------------- | -------------- |
| 2018-03        |         16,157 |
| 2018-02        |         67,154 |
| 2018-01        |      1,898,377 |
| 2017-12        |         70,865 |
| 2017-11        |         64,577 |
| 2017-10        |         66,968 |
| 2017-09        |         63,809 |
| 2017-08        |         93,022 |
| 2017-07        |        628,639 |
| 2017-06        |        200,335 |
| 2017-05        |        148,618 |
| 2017-04        |         50,842 |
| 2017-03        |         53,755 |
| 2017-02        |         47,414 |
| 2017-01        |         50,651 |
| 2016-12        |         53,138 |
| 2016-11        |        129,537 |
| 2016-10        |        477,819 |
| 2016-09        |      1,013,797 |
| 2016-08        |        116,291 |
| 2016-07        |         81,521 |
| 2016-06        |         59,695 |
| 2016-05        |        151,872 |
| 2016-03        |          4,618 |
| 2016-02        |        226,267 |
| 2016-01        |        184,133 |
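
The monthly counts above are easier to reason about once spikes and gaps are picked out mechanically. The following is a minimal sketch, not part of the original investigation: the counts are copied from the table, while the "5x the median" spike threshold and the helper names are arbitrary choices made for illustration.

```python
from statistics import median

# download_month -> download_count, copied from the table above.
COUNTS = {
    "2018-03": 16157, "2018-02": 67154, "2018-01": 1898377,
    "2017-12": 70865, "2017-11": 64577, "2017-10": 66968,
    "2017-09": 63809, "2017-08": 93022, "2017-07": 628639,
    "2017-06": 200335, "2017-05": 148618, "2017-04": 50842,
    "2017-03": 53755, "2017-02": 47414, "2017-01": 50651,
    "2016-12": 53138, "2016-11": 129537, "2016-10": 477819,
    "2016-09": 1013797, "2016-08": 116291, "2016-07": 81521,
    "2016-06": 59695, "2016-05": 151872, "2016-03": 4618,
    "2016-02": 226267, "2016-01": 184133,
}


def month_range(first, last):
    """Yield every YYYY-MM string from first to last, inclusive."""
    year, month = map(int, first.split("-"))
    end = tuple(map(int, last.split("-")))
    while (year, month) <= end:
        yield f"{year:04d}-{month:02d}"
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def find_gaps(counts):
    """Months inside the covered range that have no row at all."""
    months = sorted(counts)
    return [m for m in month_range(months[0], months[-1]) if m not in counts]


def find_spikes(counts, factor=5):
    """Months whose count exceeds factor times the median (assumed threshold)."""
    threshold = factor * median(counts.values())
    return sorted(m for m, n in counts.items() if n > threshold)


gaps = find_gaps(COUNTS)
spikes = find_spikes(COUNTS)
print("missing months:", gaps)
print("spike months:", spikes)
```

With this threshold the sketch flags 2016-09, 2016-10, 2017-07 and 2018-01 as spikes and reports 2016-04 as having no row. Note that 2018-03 is a partial month, since the query was run in early March.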

dmtucker commented Mar 5, 2018

Where does 1.17 come from?

Python 2's standard urllib seems like the most likely source, since its __version__ is 1.17 (the same value is present in at least one fork):

$ python -c 'import sys, urllib, urllib2;  print(sys.version[:3], urllib.__version__, urllib2.__version__)'; python3 -c 'import urllib.request; print(urllib.request.__version__)'
('2.7', '1.17', '2.7')
3.5

What agents may be using urllib?

dmtucker commented Mar 6, 2018

Where is 1.17 being used from?

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 365 --where 'details.python = "1.17"' '' country
Served from cache: False
Data processed: 81.80 GiB
Data billed: 81.80 GiB
Estimated cost: $0.40

| country | download_count |
| ------- | -------------- |
| US      |      3,216,292 |
| JP      |         55,160 |
| CN      |         34,118 |
| GB      |         24,925 |
| IN      |          8,064 |
| None    |          7,190 |
| FR      |          6,131 |
| DE      |          5,703 |
| IL      |          5,109 |
| ES      |          4,710 |

dmtucker commented Mar 6, 2018

Notes

PyPI's switch to HTTPS-only should have caused occurrences to drop in Oct/Nov 2017, but they didn't.

urllib.urlretrieve uses FancyURLopener, which inherits from URLopener, which sets the User-Agent header to Python-urllib/ followed by urllib.__version__ (so Python-urllib/1.17 on Python 2.7).
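
The same default-header mechanism can be observed on Python 3, where it lives in urllib.request. This is a minimal sketch for illustration rather than a reproduction of the Python 2.7 code path: on Python 3 the version suffix is the interpreter's major.minor version, whereas Python 2.7's urllib reports 1.17 as shown earlier.

```python
import urllib.request

# build_opener() returns an OpenerDirector whose default headers
# include a User-agent of the form "Python-urllib/<version>".
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
user_agent = default_headers["User-agent"]

# No request is sent; this only inspects the header that would be used.
print(user_agent)
```

Any script that downloads from PyPI through these defaults, without setting its own User-Agent, would be logged with only this string to identify it.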

easy_install's user-agent logic:
https://github.com/pypa/setuptools/blob/97ff22f31ace57f4eabb6f1e77c9c553de0d1c24/setuptools/package_index.py#L50

linehaul's (the log parser's) user-agent logic:
https://github.com/pypa/linehaul/blob/master/linehaul/user_agents.py
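
One plausible way a bare urllib User-Agent could end up recorded as "Python 1.17" is a parser that treats the number after Python-urllib/ as the Python version. The sketch below illustrates that idea only: the regular expression and the parse_user_agent helper are hypothetical and are not copied from linehaul.

```python
import re

# Hypothetical pattern for illustration only; NOT linehaul's actual code.
URLLIB_UA = re.compile(r"^Python-urllib/(?P<python>\d+\.\d+)$")


def parse_user_agent(user_agent):
    """Return a details dict such as {'python': '1.17'}, or None if unrecognized."""
    match = URLLIB_UA.match(user_agent)
    if match is None:
        return None
    # Only the version is recoverable; installer, system, distro stay unknown.
    return {"python": match.group("python")}


details = parse_user_agent("Python-urllib/1.17")
print(details)
```

A parser of this shape would yield a Python version and nothing else, which would be consistent with every other details field being None in the queries that follow.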


(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 365 --where 'details.python = "1.17"' '' system distro
Served from cache: False
Data processed: 146.60 GiB
Data billed: 146.60 GiB
Estimated cost: $0.72

| system_name | distro_name | download_count |
| ----------- | ----------- | -------------- |
| None        | None        |      3,405,944 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 365 --where 'details.python = "1.17"' '' installer installer-version
Served from cache: False
Data processed: 153.04 GiB
Data billed: 153.04 GiB
Estimated cost: $0.75

| installer_name | installer_version | download_count |
| -------------- | ----------------- | -------------- |
| None           | None              |      3,405,944 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 365 --where 'details.python = "1.17"' '' impl impl-version
Served from cache: False
Data processed: 155.91 GiB
Data billed: 155.91 GiB
Estimated cost: $0.77

| implementation | impl_version | download_count |
| -------------- | ------------ | -------------- |
| None           | None         |      3,405,944 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 365 --where 'details.python = "1.17"' '' setuptools-version
Served from cache: False
Data processed: 51.03 GiB
Data billed: 51.03 GiB
Estimated cost: $0.25

| setuptools_version | download_count |
| ------------------ | -------------- |
| None               |      3,405,944 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 365 --where 'details.python = "1.17"' '' openssl
Served from cache: False
Data processed: 173.99 GiB
Data billed: 173.99 GiB
Estimated cost: $0.85

| openssl_version | download_count |
| --------------- | -------------- |
| None            |      3,405,944 |

Python Version Trends

None of these per-version monthly trends seems to track the 1.17 trend:

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 1095 --limit 36 --order download_month --where 'details.python = "2.6"' '' month
Served from cache: False
Data processed: 186.30 GiB
Data billed: 186.30 GiB
Estimated cost: $0.91

| download_month | download_count |
| -------------- | -------------- |
| 2018-03        |        146,529 |
| 2018-02        |        828,609 |
| 2018-01        |      1,141,411 |
| 2017-12        |      1,614,495 |
| 2017-11        |      1,371,796 |
| 2017-10        |      2,023,681 |
| 2017-09        |      2,289,735 |
| 2017-08        |      2,704,477 |
| 2017-07        |      2,828,229 |
| 2017-06        |      2,889,239 |
| 2017-05        |      2,043,001 |
| 2017-04        |      2,457,703 |
| 2017-03        |      2,680,095 |
| 2017-02        |      2,576,469 |
| 2017-01        |      3,066,075 |
| 2016-12        |      3,053,485 |
| 2016-11        |      3,591,192 |
| 2016-10        |      3,270,814 |
| 2016-09        |      3,209,299 |
| 2016-08        |      4,005,600 |
| 2016-07        |      3,946,704 |
| 2016-06        |      4,171,922 |
| 2016-05        |      2,034,229 |
| 2016-03        |      1,815,018 |
| 2016-02        |      9,519,408 |
| 2016-01        |      2,959,021 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 1095 --limit 36 --order download_month --where 'details.python = "2.7"' '' month
Served from cache: False
Data processed: 186.30 GiB
Data billed: 186.30 GiB
Estimated cost: $0.91

| download_month | download_count |
| -------------- | -------------- |
| 2018-03        |      4,668,812 |
| 2018-02        |     22,073,881 |
| 2018-01        |     30,490,386 |
| 2017-12        |     31,818,101 |
| 2017-11        |     30,530,294 |
| 2017-10        |     42,417,489 |
| 2017-09        |     47,877,172 |
| 2017-08        |     61,447,951 |
| 2017-07        |     63,732,403 |
| 2017-06        |     66,186,931 |
| 2017-05        |     68,284,527 |
| 2017-04        |     71,634,346 |
| 2017-03        |     75,668,717 |
| 2017-02        |     72,209,408 |
| 2017-01        |     80,589,481 |
| 2016-12        |     71,611,590 |
| 2016-11        |     78,672,198 |
| 2016-10        |     68,610,169 |
| 2016-09        |     74,346,875 |
| 2016-08        |     72,597,214 |
| 2016-07        |     62,769,844 |
| 2016-06        |     50,753,857 |
| 2016-05        |     17,155,656 |
| 2016-03        |      9,538,368 |
| 2016-02        |     48,248,300 |
| 2016-01        |     15,114,515 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 1095 --limit 36 --order download_month --where 'details.python = "3.4"' '' month
Served from cache: False
Data processed: 186.30 GiB
Data billed: 186.30 GiB
Estimated cost: $0.91

| download_month | download_count |
| -------------- | -------------- |
| 2018-03        |        453,018 |
| 2018-02        |      1,966,827 |
| 2018-01        |      2,435,945 |
| 2017-12        |      2,517,948 |
| 2017-11        |      2,310,739 |
| 2017-10        |      2,491,198 |
| 2017-09        |      2,332,760 |
| 2017-08        |      2,697,636 |
| 2017-07        |      2,397,228 |
| 2017-06        |      1,834,341 |
| 2017-05        |      1,863,789 |
| 2017-04        |      1,848,761 |
| 2017-03        |      2,238,398 |
| 2017-02        |      1,794,172 |
| 2017-01        |      1,793,060 |
| 2016-12        |      1,519,826 |
| 2016-11        |      1,539,866 |
| 2016-10        |      1,660,522 |
| 2016-09        |      1,602,485 |
| 2016-08        |      1,999,833 |
| 2016-07        |      1,976,705 |
| 2016-06        |        792,186 |
| 2016-05        |        363,550 |
| 2016-03        |        188,615 |
| 2016-02        |        977,270 |
| 2016-01        |        258,988 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 1095 --limit 36 --order download_month --where 'details.python = "3.5"' '' month
Served from cache: False
Data processed: 186.30 GiB
Data billed: 186.30 GiB
Estimated cost: $0.91

| download_month | download_count |
| -------------- | -------------- |
| 2018-03        |        419,007 |
| 2018-02        |      2,124,505 |
| 2018-01        |      2,331,339 |
| 2017-12        |      2,649,399 |
| 2017-11        |      2,759,223 |
| 2017-10        |      2,312,185 |
| 2017-09        |      2,016,890 |
| 2017-08        |      2,438,073 |
| 2017-07        |      2,391,372 |
| 2017-06        |      3,384,902 |
| 2017-05        |      2,832,727 |
| 2017-04        |      3,037,491 |
| 2017-03        |      2,466,478 |
| 2017-02        |      2,367,826 |
| 2017-01        |      2,417,595 |
| 2016-12        |      2,464,993 |
| 2016-11        |      1,820,581 |
| 2016-10        |      2,006,853 |
| 2016-09        |      1,761,309 |
| 2016-08        |      1,242,226 |
| 2016-07        |      1,004,178 |
| 2016-06        |        603,266 |
| 2016-05        |        333,189 |
| 2016-03        |         89,432 |
| 2016-02        |        489,781 |
| 2016-01        |        105,938 |

(david-YYe3LIIa) david@kahuna:~ $ pypinfo --days 1095 --limit 36 --order download_month --where 'details.python = "3.6"' '' month
Served from cache: False
Data processed: 186.30 GiB
Data billed: 186.30 GiB
Estimated cost: $0.91

| download_month | download_count |
| -------------- | -------------- |
| 2018-03        |        460,217 |
| 2018-02        |      2,271,910 |
| 2018-01        |      2,236,416 |
| 2017-12        |      2,251,264 |
| 2017-11        |      5,606,128 |
| 2017-10        |      2,226,206 |
| 2017-09        |      2,054,285 |
| 2017-08        |      2,339,605 |
| 2017-07        |      1,661,571 |
| 2017-06        |      1,726,659 |
| 2017-05        |      1,152,307 |
| 2017-04        |        902,785 |
| 2017-03        |        792,730 |
| 2017-02        |        580,227 |
| 2017-01        |        391,527 |
| 2016-12        |         71,144 |
| 2016-11        |         26,199 |
| 2016-10        |         23,336 |
| 2016-09        |         23,020 |
| 2016-08        |         23,118 |
| 2016-07        |         20,472 |
| 2016-06        |         10,369 |
| 2016-05        |          3,076 |
| 2016-03        |          1,319 |
| 2016-02        |          7,425 |
| 2016-01        |          1,597 |
