@pmlandwehr
Created April 24, 2017 20:14
Get basic conda-forge stats from the anaconda_cloud
from itertools import chain

from bs4 import BeautifulSoup
import numpy as np
import pandas as pd
import requests
from tqdm import tqdm
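# Hard-coded number of listing pages at https://anaconda.org/conda-forge/repo.
TOTAL_CF_REPO_PAGES = 45

# Scrape each listing page and collect the <span class="packageName"> tags.
package_tags = list(chain.from_iterable([
    BeautifulSoup(
        requests.get('https://anaconda.org/conda-forge/repo', params={'page': x}).text,
        'lxml',
    ).find_all('span', class_='packageName')
    for x in tqdm(range(1, TOTAL_CF_REPO_PAGES + 1))  # pages are 1-indexed; include the last one
]))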
package_names = [x.contents[0] for x in package_tags]
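dl_counts = []
for pkg in tqdm(package_names):
    r = requests.get('https://api.anaconda.org/package/conda-forge/{}'.format(pkg))
    try:
        # Total downloads = sum over every file published for the package.
        dl_counts.append(sum(x['ndownloads'] for x in r.json()['files']))
    except (KeyError, TypeError, ValueError):
        # Non-JSON response or unexpected payload shape: record the count as unknown.
        dl_counts.append(np.nan)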
df = pd.DataFrame({'pkg': package_names, 'dl_count': dl_counts})
df.sort_values('dl_count', ascending=False).head(50)
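
The per-package summing step can be pulled into a small helper so it can be checked without hitting the network. This is only a sketch: the name `total_downloads` and the sample payloads are hypothetical, and the payload shape (a `files` list whose entries carry `ndownloads`) is assumed from the loop above rather than from any API documentation.

```python
import math


def total_downloads(payload):
    """Sum 'ndownloads' over payload['files']; return NaN if the shape is unexpected.

    `payload` is assumed to be the already-parsed JSON body for one package.
    """
    try:
        return sum(f['ndownloads'] for f in payload['files'])
    except (KeyError, TypeError):
        return math.nan


# Hypothetical payloads, made up purely for illustration.
sample_ok = {'files': [{'ndownloads': 10}, {'ndownloads': 32}]}
sample_bad = {'error': 'not found'}

print(total_downloads(sample_ok))
print(total_downloads(sample_bad))
```

With a helper like this, the loop body would reduce to `dl_counts.append(total_downloads(r.json()))`, leaving only the JSON decoding error to handle at the call site.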