@AndreyAkinshin
Last active August 29, 2015 14:11
Template for downloading FIPS bulletins
# Main link: http://www1.fips.ru/wps/wcm/connect/content_ru/ru/ofic_pub/ofic_bul/ofic_bul_prevm
# pdf->txt converter: pdfminer, https://pypi.python.org/pypi/pdfminer/
import os
import urllib.request


def get_link(year, month, number):
    # Build the archive URL: the year and the document number are split into path segments.
    link = "http://www1.fips.ru/Archive/EVM/" + \
        str(year) + "/" + str(year) + "." + str(month).zfill(2) + ".20" + \
        "/DOC/RUNW/000/00" + str(year)[0:1] + "/" + str(year)[1:4].zfill(3) + "/" + \
        str(number)[0:3].zfill(3) + "/" + str(number)[3:6].zfill(3) + "/document.pdf"
    return link


def download(link, number):
    # Save the PDF as <number>.pdf, then convert it to <number>.txt with pdfminer's pdf2txt.py.
    urllib.request.urlretrieve(link, str(number) + ".pdf")
    os.system("pdf2txt.py " + str(number) + ".pdf > " + str(number) + ".txt")
    print("DONE: " + str(number))


def download_bulletin(year, month, from_number, to_number):
    # Download every document in the inclusive range [from_number, to_number].
    for number in range(from_number, to_number + 1):
        download(get_link(year, month, number), number)


# Download bulletin example: December 2014
download_bulletin(2014, 12, 661608, 662651)
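The URL layout used by get_link is easier to check when each path segment has a name. The sketch below is a hypothetical restatement of the same string arithmetic, not part of the original script. The helper name bulletin_link and the variable names are assumptions for illustration, and the hard-coded ".20" is copied from the script as-is. It makes no network requests and only prints the link for the first document of the December 2014 example.

```python
def bulletin_link(year, month, number):
    """Hypothetical restatement of get_link that names each URL segment."""
    year, number = str(year), str(number)
    # Issue folder, e.g. "2014.12.20"; the ".20" suffix is hard-coded in the script.
    issue = "{0}.{1}.20".format(year, str(month).zfill(2))
    # The year is split into its first digit and the remaining three: "002/014".
    year_dirs = "00{0}/{1}".format(year[0:1], year[1:4].zfill(3))
    # The document number is split into two groups of three digits: "661/608".
    number_dirs = "{0}/{1}".format(number[0:3].zfill(3), number[3:6].zfill(3))
    return "http://www1.fips.ru/Archive/EVM/{0}/{1}/DOC/RUNW/000/{2}/{3}/document.pdf".format(
        year, issue, year_dirs, number_dirs)


print(bulletin_link(2014, 12, 661608))
# http://www1.fips.ru/Archive/EVM/2014/2014.12.20/DOC/RUNW/000/002/014/661/608/document.pdf
```

Printing a link like this before starting a long download run is a cheap way to confirm that the year, month, and number range match the bulletin you expect.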