@goog
Created October 25, 2012 06:09
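import tika
tika.initVM()
from tika import parser

# Parse an in-memory HTML string.
print(parser.from_buffer("<html><body>Hello World</body></html>"))
# Parse a file on disk; test.pdf must exist in the working directory.
print(parser.from_file("test.pdf"))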
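
The snippet above prints whatever the parser returns. As a minimal sketch, assuming the result is a dict with "content" and "metadata" keys (the shape documented by the tika-python package, which may not match the build this gist was written against), the fields can be pulled out separately. The `summarize` helper and the `sample` dict are hypothetical, introduced here only so the sketch runs without Tika or Java installed.

```python
def summarize(parsed, width=40):
    """Return a one-line summary of a parse result (hypothetical helper).

    Assumes `parsed` is a dict with "content" and "metadata" keys.
    """
    content = parsed.get("content") or ""    # content may be None
    metadata = parsed.get("metadata") or {}  # metadata may be missing
    text = " ".join(content.split())         # collapse runs of whitespace
    content_type = metadata.get("Content-Type", "unknown")
    return "%s | %s" % (content_type, text[:width])


# Stand-in for a real parse result; the values are illustrative only.
sample = {
    "metadata": {"Content-Type": "text/html; charset=ISO-8859-1"},
    "content": "\n\n\nHello World\n",
}
print(summarize(sample))
```

With a working Tika setup, `summarize(parser.from_file("test.pdf"))` would be the equivalent call on a real result, under the same assumption about the dict shape.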