@benosteen
Created March 10, 2017 13:06
# 'doc' is ALTO XML, parsed into an etree XML object.
def get_illustration_coords(doc, component="PrintSpace"):
    """Return ((page_width, page_height), [[x, y, w, h], ...]) for each illustration."""
    page = doc.find("Layout/Page")
    illustrations = doc.findall(
        'Layout/Page/{0}/ComposedBlock[@TYPE="Illustration"]/GraphicalElement'.format(component)
    )
    pageh, pagew = int(page.attrib['HEIGHT']), int(page.attrib['WIDTH'])
    images = []
    for img in illustrations:
        # ALTO attributes are strings; convert position and size to ints.
        x, y = map(int, [img.attrib['HPOS'], img.attrib['VPOS']])
        h, w = map(int, [img.attrib['HEIGHT'], img.attrib['WIDTH']])
        images.append([x, y, w, h])
    return (pagew, pageh), images
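
The paths used above ("Layout/Page" and so on) carry no namespace, so they only match when the parsed tree has un-namespaced tags. Many ALTO files declare a default namespace, in which case `doc.find("Layout/Page")` would return None. A minimal sketch of one workaround follows, assuming the standard-library `xml.etree.ElementTree` parser. The helper `strip_namespaces` and the `SAMPLE` document are hypothetical illustrations, not part of the original gist.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove '{uri}' prefixes from every element tag, in place (hypothetical helper)."""
    for el in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if isinstance(el.tag, str) and el.tag.startswith("{"):
            el.tag = el.tag.split("}", 1)[1]
    return root


# Made-up sample page for illustration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1" HEIGHT="3000" WIDTH="2000">
      <PrintSpace>
        <ComposedBlock ID="CB1" TYPE="Illustration">
          <GraphicalElement ID="GE1" HPOS="100" VPOS="200" WIDTH="800" HEIGHT="600"/>
        </ComposedBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

doc = strip_namespaces(ET.fromstring(SAMPLE))

# After stripping, the un-namespaced paths from the function above match again.
page = doc.find("Layout/Page")
graphics = doc.findall(
    'Layout/Page/PrintSpace/ComposedBlock[@TYPE="Illustration"]/GraphicalElement'
)
```

With the tags stripped, the resulting `doc` can be passed to get_illustration_coords unchanged. Stripping namespaces discards information, so this is only reasonable when the document contains a single vocabulary.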