@5teven1in
Last active June 8, 2021 06:55
chocolatey package list crawler
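import sys

import requests
from bs4 import BeautifulSoup as BS

# Package listing sorted by download count; {} is filled with the page number.
url_format = "https://chocolatey.org/packages?sortOrder=package-download-count&page={}&prerelease=False&moderatorQueue=False&moderationStatus=all-statuses"


def get_list(idx):
    """Print the value of each package entry's <input> on result page `idx`."""
    res = requests.get(url_format.format(idx))
    soup = BS(res.text, 'html.parser')
    lis = soup.find_all('div', class_="package-list-align")
    for item in lis:
        out = item.find("input")
        try:
            print(out.attrs["value"])
        except (AttributeError, KeyError):
            # Entry has no <input>, or the <input> has no value attribute: skip it.
            pass


# Usage: pass the number of pages to crawl as the first command-line argument.
for i in range(1, int(sys.argv[1]) + 1):
    get_list(i)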