@gtfierro
Created February 25, 2014 23:17
Download PTAB
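import requests
from bs4 import BeautifulSoup

# Listing page of recent PTAB decisions on the USPTO e-FOIA site
url = 'http://e-foia.uspto.gov/Foia/DispatchBPAIServlet?RetrieveRecent=30'
html = requests.get(url).content
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, 'html.parser')
# The script treats every <a target="_self"> on the page as a download link
all_download_links = soup.find_all('a', {'target': '_self'})
for i, link in enumerate(all_download_links):
    download = 'http://e-foia.uspto.gov/Foia/' + link['href']
    pdf_bytes = requests.get(download).content
    # Open in binary mode: .content is bytes, and text mode would corrupt the PDF
    with open('ptab-{0}.pdf'.format(i), 'wb') as f:
        f.write(pdf_bytes)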