@scionoftech
Last active June 14, 2019 07:28
A Python script for reading PDF forms: it uses pdfminer to list the name and value of each AcroForm field and prints them as JSON.
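import json

from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdftypes import resolve1

# PDF_FILE_PATH must be set to the path of the PDF form before running.
# pdfminer reads objects lazily, so the file stays open while fields are resolved.
with open(PDF_FILE_PATH, 'rb') as fp:
    parser = PDFParser(fp)
    doc = PDFDocument(parser)
    # Raises KeyError if the document has no AcroForm (no form fields).
    fields = resolve1(resolve1(doc.catalog['AcroForm'])['Fields'])
    data = []
    for ref in fields:
        # Each entry is an indirect reference; resolve it to the field dictionary.
        field = resolve1(ref)
        # 'T' is the field name, 'V' its current value (None when unset).
        dd = {'field': str(field.get('T')), 'value': str(resolve1(field.get('V')))}
        data.append(dd)
        # name, value = field.get('T'), field.get('V')
        # print('{0}: {1}'.format(name, value))

print(json.dumps(data))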
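
The script passes field names and values through str(). Under Python 3, pdfminer typically returns PDF strings as bytes, so the JSON can end up containing text such as "b'Given Name'" and the string 'None' for unset fields. The following is a minimal sketch of one way to clean that up, not part of the original gist. The helper name pdf_text is hypothetical, and decoding with Latin-1 is an assumption: it stands in for PDFDocEncoding and only agrees with it for ASCII text.

```python
import codecs


def pdf_text(value):
    """Turn a raw PDF field name or value into a plain string (None stays None)."""
    if value is None:
        return None
    if isinstance(value, bytes):
        # PDF text strings that start with a UTF-16BE byte-order mark are UTF-16.
        if value.startswith(codecs.BOM_UTF16_BE):
            return value[2:].decode('utf-16-be', errors='replace')
        # Assumption: Latin-1 as an approximation of PDFDocEncoding.
        return value.decode('latin-1')
    # Anything else (numbers, name objects such as checkbox states) falls back to str().
    return str(value)


print(pdf_text(b'Given Name'))            # Given Name
print(pdf_text(b'\xfe\xff\x00J\x00o'))    # Jo
```

In the loop above, this could replace the two str() calls, for example pdf_text(field.get('T')). Because None is preserved, json.dumps would then emit null for unset fields rather than the string "None".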