@saulshanabrook
Last active August 29, 2015 13:57
Artwork Names and Descriptions from PP

For my art history class, my teacher gave me a large PowerPoint (PP) with a slide for each artwork we needed to learn, so that I could upload the images, along with each artwork's identification and classification, to Memrise as virtual study cards.

I grew tired of copying the two lines from each slide, so I decided to parse the PP in Python and export a list of artwork IDs and classifications. For example, for a slide showing an El Greco, I wanted the output to be El Greco: Mannerism.

I found out that a .pptx file is really just a .zip archive, so once I extracted the PP it was easy to find the slide files, which were all .xml.
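Because the file is an ordinary zip archive, the extraction step can also be done from Python with the standard-library zipfile module. This is a minimal sketch, not part of the original script: extract_slides is a hypothetical helper name, and it assumes the slides are stored under ppt/slides/ inside the archive, as in the extracted directory used below.

```python
import zipfile


def extract_slides(pptx_path, dest_dir):
    """Extract only the slide XML files from a .pptx into dest_dir.

    extract_slides is a hypothetical helper, not part of the original script.
    """
    extracted = []
    with zipfile.ZipFile(pptx_path) as archive:
        for name in archive.namelist():
            # Slides are assumed to be stored as ppt/slides/slideN.xml
            if name.startswith('ppt/slides/') and name.endswith('.xml'):
                archive.extract(name, dest_dir)
                extracted.append(name)
    return extracted
```

The extracted files keep their archive paths, so the slides end up in dest_dir + '/ppt/slides', which matches the layout the script expects.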

https://asciinema.org/a/8098

import glob
import xml.etree.ElementTree as ET

EXTRACTED_PP_DIR = '/Users/saul/Desktop/Final Exam Ids'
EXTRACTED_PP_SLIDE_DIR = EXTRACTED_PP_DIR + '/ppt/slides'
# DrawingML namespace used by the text elements in each slide
DRAWINGML = '{http://schemas.openxmlformats.org/drawingml/2006/main}'


def slide_paths(slide_dir):
    # Sort numerically so that slide10.xml comes after slide9.xml
    paths = glob.glob(slide_dir + '/*.xml')
    paths.sort(key=lambda path: int(path.split('/')[-1].split('slide')[1].split('.')[0]))
    return paths


def artwork_name(root):
    # The artwork name is the text set at 28pt (sz is in hundredths of a point)
    name_tags = root.findall(".//" + DRAWINGML + "rPr[@sz='2800']/../" + DRAWINGML + "t")
    name_text = ''.join(tag.text or '' for tag in name_tags)
    return name_text.replace(',', ' ').replace('-', ' ')


def period(root):
    # The period is the first bold run set at 20pt
    period_tag = root.find(".//" + DRAWINGML + "rPr[@sz='2000'][@b='1']/../" + DRAWINGML + "t")
    return period_tag.text


for path in slide_paths(EXTRACTED_PP_SLIDE_DIR):
    tree = ET.parse(path)
    root = tree.getroot()
    print(artwork_name(root) + ': ' + period(root))
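
To show what the two XPath queries select, here is a small self-contained sketch that runs them against a hand-written slide fragment. The fragment is an assumption: a heavily simplified stand-in for real slide XML, in which the name run is set at 28pt and the period run is bold 20pt. Real slides contain far more markup.

```python
import xml.etree.ElementTree as ET

A = '{http://schemas.openxmlformats.org/drawingml/2006/main}'

# Hypothetical, simplified slide fragment; not taken from the real file.
SAMPLE_SLIDE = """
<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
       xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <a:p><a:r><a:rPr sz="2800"/><a:t>El Greco</a:t></a:r></a:p>
  <a:p><a:r><a:rPr sz="2000" b="1"/><a:t>Mannerism</a:t></a:r></a:p>
</p:sld>
"""

root = ET.fromstring(SAMPLE_SLIDE)
# Match the run properties (rPr) by size, step up to the enclosing run,
# then take that run's text element (t).
name = ''.join(t.text for t in root.findall(".//" + A + "rPr[@sz='2800']/../" + A + "t"))
period = root.find(".//" + A + "rPr[@sz='2000'][@b='1']/../" + A + "t").text
sample_line = name + ': ' + period
print(sample_line)
```

The queries depend entirely on font size and weight, so they only work for a deck that is formatted consistently from slide to slide.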