@joshkitt
Created March 14, 2013 20:18
PDFBox - extract images
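// Assumes PDFBox 1.x. PDDocument.load stands in for the author's PdfTestUtils helper.
import java.io.File;
import java.util.Map;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;

File file = new File("/tmp/pdf.pdf");
PDDocument doc = PDDocument.load(file);
try {
    // Only the first page is inspected.
    PDPage page = (PDPage) doc.getDocumentCatalog().getAllPages().get(0);
    PDResources resources = page.getResources();
    Map<String, PDXObjectImage> images = resources.getImages();
    int i = 0;
    for (PDXObjectImage image : images.values()) {
        // write2file(File) writes to exactly this path, so the suffix is appended here.
        image.write2file(new File("/tmp", i++ + "." + image.getSuffix()));
    }
} finally {
    // Release the document even if extraction fails.
    doc.close();
}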