Created June 18, 2013 12:59
How to use tika...
1) Download tika-app-1.3.jar from http://www.apache.org/dyn/closer.cgi/tika/tika-app-1.3.jar
2) Move it from Downloads to somewhere more useful. I created a folder called work.
3) The Tika jar is actually an application, so you can now either:
   a) Double-click it to start it up. You'll get a small UI; drag a file into the window.
      Use the View menu to switch the type of output: text, XML, etc.
   b) Use it from the command line, for example as part of a program. Open a terminal (Applications, Utilities, Terminal),
      cd to the folder where you put the jar, and
      type java -jar tika-app-1.3.jar -x [filename], where [filename] is the file you want to read.
      -x gives you XML (XHTML), -t plain text, etc.
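
To make option b) concrete, here is a minimal sketch of calling the jar from a small Python program. It only wraps the same java -jar command shown above. The function names build_tika_command and extract are made up for illustration, and the sketch assumes java is on your PATH.

```python
import subprocess

# Output flags taken from the steps above: -x for XML, -t for plain text.
OUTPUT_FLAGS = {"xml": "-x", "text": "-t"}


def build_tika_command(jar_path, filename, output="text"):
    """Return the argument list for `java -jar <jar> <flag> <filename>`.

    Hypothetical helper, not part of Tika itself.
    """
    if output not in OUTPUT_FLAGS:
        raise ValueError(f"unsupported output type: {output!r}")
    return ["java", "-jar", str(jar_path), OUTPUT_FLAGS[output], str(filename)]


def extract(jar_path, filename, output="text"):
    """Run Tika on one file and return whatever it writes to standard output."""
    command = build_tika_command(jar_path, filename, output)
    # check=True raises CalledProcessError if java exits with an error code.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For instance, extract("work/tika-app-1.3.jar", "report.pdf", "xml") would run the jar on a file called report.pdf and hand back the XML as a string; both paths are placeholders for wherever you put the jar and your own file.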