Vinayak Mehta vinayak-mehta
import os
import sys
import pandas as pd
import pdfplumber
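# Open the PDF given as the first command-line argument
pdf = pdfplumber.open(sys.argv[1])
p0 = pdf.pages[0]
# extract_table() returns the largest table on the page as a list of rows
table = p0.extract_table()
print(table)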
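The table extracted above is a plain list of rows, each row a list of cell strings (or None for empty cells). As a minimal sketch of saving such a table as a CSV file, the helper below uses only the standard library and assumes Python 3; the name `write_table_csv` and the output filename in the usage note are hypothetical and not part of the original gist.

```python
import csv


def write_table_csv(table, path):
    """Write a list-of-rows table to a CSV file and return the row count.

    Assumes `table` has the shape returned by extract_table():
    a list of lists whose cells are strings or None.
    """
    # extract_table() can return None when no table is found on the page.
    if not table:
        return 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in table:
            # Replace missing cells with empty strings so columns stay aligned.
            writer.writerow(["" if cell is None else cell for cell in row])
    return len(table)
```

With the snippet above, calling `write_table_csv(table, "table_0.csv")` would write the first page's table into the current working directory.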
vinayak-mehta / pdftables_extract.py
Last active September 22, 2018 11:30
A Python 2 script that extracts tables from a PDF file using pdftables and saves them as CSV files in the current working directory.
#!/usr/bin/env python
"""
Usage: python pdftables_extract.py <filename>
"""
import os
import sys
import pandas as pd
from pdftables.pdf_document import PDFDocument