Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Use Apache Tika to extract PDF content in SearchWP
<?php
// Use Apache Tika to extract PDF content in SearchWP.
add_filter( 'searchwp\parser\pdf', function( $content, $args ) {
// Ensure this path is updated to match your Tika installation path!
$path_to_tika = '/srv/bin/tika-app-1.18.jar';
// Execute the command.
$cmd = "java -jar {$path_to_tika} -t {$args['file']}";
@exec( $cmd, $output, $exitCode );
// If there was a problem, send the output to the debug log.
if ( $exitCode ) {
do_action( 'searchwp\debug\log', 'Error running Tika, exit code: ' . $exitCode );
}
return $output;
}, 20, 2 );
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.