@ferrygun
Created July 14, 2020 10:01
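import camelot

# NOTE: img, pdf_page, pdf_file, pg, bboxes_pdf and the initial x1, y1, x2, y2
# are not defined in this snippet; they must be set up by earlier code.
interesting_areas = []
output = [[x1, y1, x2, y2]]  # boxes to convert (presumably detected on the page image)
for x in output:
    # Convert the box to PDF coordinates
    x1, y1, x2, y2 = bboxes_pdf(img, pdf_page, x)
    # "x1,y1,x2,y2" where (x1, y1) -> left-top and (x2, y2) -> right-bottom in PDF coordinate space
    bbox_camelot = ",".join([str(x1), str(y1), str(x2), str(y2)])
    # print(bbox_camelot)
    interesting_areas.append(bbox_camelot)
print(interesting_areas)

# Restrict Camelot's stream parser to the listed areas of page pg
output_camelot = camelot.read_pdf(
    filepath=pdf_file, pages=str(pg), flavor="stream", table_areas=interesting_areas
)
output_camelot[0].df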
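
The snippet calls bboxes_pdf without defining it. What follows is only a minimal sketch of what such a conversion could look like, not the author's function. It assumes the input box is in image pixels with the origin at the top-left, and that the target is PDF points with the origin at the bottom-left. The name bbox_image_to_pdf, its parameters, and the sample sizes are all hypothetical.

```python
def bbox_image_to_pdf(bbox, img_size, page_size):
    """Map a pixel box (origin top-left) to PDF points (origin bottom-left).

    Hypothetical helper: bbox is [x1, y1, x2, y2] in pixels,
    img_size is (width, height) in pixels,
    page_size is (width, height) in PDF points.
    """
    x1, y1, x2, y2 = bbox
    img_w, img_h = img_size
    page_w, page_h = page_size
    # Scale x directly; scale y and flip it, because the PDF y axis points up.
    return [
        x1 * page_w / img_w,
        page_h - y1 * page_h / img_h,
        x2 * page_w / img_w,
        page_h - y2 * page_h / img_h,
    ]


# Assumed example: a 1700x2200 px render of a US Letter page (612x792 pt).
area = bbox_image_to_pdf([100, 200, 500, 600], (1700, 2200), (612, 792))
table_area = ",".join(str(round(v, 2)) for v in area)
print(table_area)
```

The resulting string has the same "x1,y1,x2,y2" shape (left-top, then right-bottom) that the snippet appends to interesting_areas.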
@cy576013581

bboxes_pdf is not defined. Where does it come from?

@ubaid08

ubaid08 commented Sep 20, 2020

Can you please tell me where the function bboxes_pdf comes from? Also, about its input parameters: what would the values of pdf_page and x be?
