@pont-us
Last active August 16, 2019 13:14
Extract and print technology tags from StackOverflow company page
#!/usr/bin/env python3
"""Extract and print technology tags from StackOverflow company page.

Given the final part of a StackOverflow company URL, extract the contents
of the technology stack links on the corresponding company page and write
them to stdout, separated by spaces.

e.g. supplying "suse" as an argument will print the technologies listed
at https://stackoverflow.com/jobs/companies/suse .

By Pontus Lurcock, 2019. Released to the public domain.
"""
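import argparse
import sys

from bs4 import BeautifulSoup
import urllib3


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("company_name")
    args = parser.parse_args()
    pool_manager = urllib3.PoolManager()
    response = pool_manager.request(
        "GET",
        "https://stackoverflow.com/jobs/companies/" + args.company_name)
    content = response.data
    soup = BeautifulSoup(content, "html.parser")
    # The technology tags are the links inside the first div of class "mb16".
    parent = soup.find("div", class_="mb16")
    if parent is None:
        # find() returns None when the page has no such div (e.g. unknown
        # company or changed page layout); exit instead of raising.
        sys.exit("No technology tags found for " + args.company_name)
    elements = parent.find_all("a")
    # get_text() is used rather than .string, which is None for links with
    # nested markup and would make join() raise a TypeError.
    print(" ".join(e.get_text(strip=True) for e in elements))


if __name__ == "__main__":
    main()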
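
The extraction step can be tried offline, without Beautiful Soup or a network connection. The following is a minimal sketch that swaps in the standard library's `html.parser` in place of Beautiful Soup; it is not part of the original script. The names `TagLinkParser`, `extract_tags`, and `SAMPLE_HTML` are hypothetical, and the HTML fragment is an invented stand-in for a company page, assuming only that the tags are links inside the first `div` of class `mb16`.

```python
from html.parser import HTMLParser


class TagLinkParser(HTMLParser):
    """Collect the text of <a> elements inside the first <div class="mb16">."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0    # nesting depth inside the target div (0 = outside)
        self.in_link = False  # True while inside an <a> within the target div
        self.done = False     # True once the target div has been closed
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.div_depth > 0:
                self.div_depth += 1      # nested div inside the target
            elif "mb16" in classes:
                self.div_depth = 1       # entering the target div
        elif tag == "a" and self.div_depth > 0:
            self.in_link = True
            self.tags.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
            if self.div_depth == 0:
                self.done = True         # only the first matching div counts
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.tags[-1] += data


def extract_tags(html):
    """Return the stripped link texts found in the first div of class mb16."""
    parser = TagLinkParser()
    parser.feed(html)
    parser.close()
    return [t.strip() for t in parser.tags if t.strip()]


# Hypothetical fragment standing in for a real company page.
SAMPLE_HTML = """
<div class="mb16">
  <a href="/jobs?tl=python">python</a>
  <a href="/jobs?tl=linux">linux</a>
</div>
<div class="other"><a href="/x">ignored</a></div>
"""

print(" ".join(extract_tags(SAMPLE_HTML)))
```

Unlike `soup.find(...)`, this sketch returns an empty list rather than `None` when no matching `div` exists, so the caller can join the result without a separate check.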