@Serpens
Created October 20, 2019 20:38
Get table from a PDF and save it as CSV
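#!/usr/bin/env python3
"""Extract tables from a PDF with tabula-py and save them as CSV."""
import re
import sys

import pandas as pd
from tabula import read_pdf

if __name__ == '__main__':
    if len(sys.argv) < 2:
        sys.exit('usage: {} input.pdf [output.csv]'.format(sys.argv[0]))
    pdf_name = sys.argv[1]
    if len(sys.argv) > 2:
        csv_name = sys.argv[2]
    elif pdf_name.endswith('.pdf'):
        # Escape the dot so the pattern matches a literal ".pdf" suffix.
        csv_name = re.sub(r'\.pdf$', '.csv', pdf_name)
    else:
        csv_name = pdf_name + '.csv'
    data = read_pdf(pdf_name)
    # tabula-py 2.0 and later return a list of DataFrames (one per table);
    # earlier releases returned a single DataFrame. Handle both cases.
    if isinstance(data, list):
        if not data:
            sys.exit('no tables found in {}'.format(pdf_name))
        data = pd.concat(data, ignore_index=True)
    data.to_csv(csv_name)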
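
When no output name is given, the script falls back to a name derived from the input path. A minimal sketch of that rule as a standalone function is shown here so it can be checked without a PDF or tabula installed. The name `default_csv_name` is hypothetical and is not part of the gist.

```python
import re


def default_csv_name(pdf_name):
    """Hypothetical helper: derive the fallback CSV name from the PDF name."""
    if pdf_name.endswith('.pdf'):
        # Replace only a literal ".pdf" suffix.
        return re.sub(r'\.pdf$', '.csv', pdf_name)
    # Any other name just gets ".csv" appended.
    return pdf_name + '.csv'


print(default_csv_name('report.pdf'))  # report.csv
print(default_csv_name('scan'))        # scan.csv
```

Note that an uppercase extension such as `REPORT.PDF` is not treated as a PDF suffix under this rule, so it would become `REPORT.PDF.csv`.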