Vinayak Mehta vinayak-mehta
vinayak-mehta / pdftables_extract.py
Last active Sep 22, 2018
A Python 2 script that extracts tables from a PDF file using pdftables and saves them as CSV files in the current working directory.
View pdftables_extract.py
#!/usr/bin/env python
"""
Usage: python pdftables_extract.py <filename>
"""
import os
import sys
import pandas as pd
from pdftables.pdf_document import PDFDocument
vinayak-mehta / pdf_table_extract.py
Created Sep 22, 2018
A Python 2 script that extracts tables from a PDF file using pdf-table-extract and saves them as CSV files in the current working directory.
View pdf_table_extract.py
#!/usr/bin/env python
"""
Usage: python pdf_table_extract.py <filename>
"""
import os
import sys
import pandas as pd
import pdftableextract as pdf
vinayak-mehta / disease_outbreaks_camelot.ipynb
Last active May 20, 2020
A Jupyter notebook showing how Camelot can be used to extract tables from PDFs scraped from the IDSP website.
View disease_outbreaks_camelot.ipynb
View hn-comments.ipynb
View pdfplumber_extract.py
import os
import sys
import pandas as pd
import pdfplumber
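# Open the PDF named on the command line and take its first page.
pdf = pdfplumber.open(sys.argv[1])
p0 = pdf.pages[0]
# Extract a table from that page as a list of rows.
table = p0.extract_table()
# print() with a single argument works on both Python 2 and Python 3.
print(table)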
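The pdfplumber preview above stops at printing the table, while the other scripts in this list are described as saving tables as CSV files in the current working directory. A minimal sketch of that last step follows, assuming the table is a list of rows whose empty cells may be `None`. It is written for Python 3 and uses the standard library `csv` module instead of pandas so it has no third-party dependency. The helper names `csv_name` and `save_table` and the file naming scheme are hypothetical and do not come from the original scripts.

```python
import csv
import os


def csv_name(pdf_path, page_number, table_number):
    """Build a CSV file name from the PDF name (hypothetical naming scheme)."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return "{}-page-{}-table-{}.csv".format(stem, page_number, table_number)


def save_table(table, csv_path):
    """Write a table (a list of rows, each a list of cell values) to csv_path."""
    # newline="" lets the csv module control line endings (Python 3 only).
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in table:
            # Assumption: empty cells may be None; write them as empty strings.
            writer.writerow(["" if cell is None else cell for cell in row])
```

With the `table` from the script above, calling `save_table(table, csv_name(sys.argv[1], 1, 1))` would write the file relative to the current working directory, since `csv_name` returns a bare file name.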