@debloper
Last active December 19, 2015 00:59
Resolving CSV River 1.0.1 to work with ElasticSearch 0.90.1

The Challenge:

Set up a river that indexes CSV files in a running ElasticSearch 0.90.1 instance.

Low Hanging Fruit:

ES-CSV-River, listed on the ES plugins page, seemed able to do the job easily.

Blockers:

  • To Site, or Not To Site: CSV-River could not be installed the usual way ($ bin/plugin -install xxBedy/elasticsearch-river-csv). The ES plugin system detected it as a site plugin, but because the repository also contains Java files, it got confused and aborted the installation.
  • Call 911 (Maven, that is): I built the JAR from source with Maven and installed it, and that seemed to work!
  • Processing, I see: When I create the river, using a sample JSON configuration matched to the sample CSV I generated, it starts indexing the file but breaks right after. It appends .processing to the file's extension and then fails with a traceback reporting that opencsv's CSVReader class cannot be found:
    Exception in thread "elasticsearch[Recorder][CSV processor][T#1]" java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader
        at org.elasticsearch.river.csv.CSVRiver$CSVConnector.processFile(CSVRiver.java:232)
        at org.elasticsearch.river.csv.CSVRiver$CSVConnector.run(CSVRiver.java:193)
        at java.lang.Thread.run(Thread.java:724)
    Caused by: java.lang.ClassNotFoundException: au.com.bytecode.opencsv.CSVReader
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 3 more

Possible Reasons:

Next Up:

  • Set up my own Maven environment to build the plugin from source.
  • Replace OpenCSV's CSV parser class with something custom.
  • Write a whole new CSV-River plugin, to scratch my own itch.

Solution:

David Pilato stepped in with a solution that fixed the issue.

Apparently, installing the plugin from the Maven-generated JAR file was the mistake. The ZIP builds in releases have the opencsv.jar dependency bundled into them. The plugin requires OpenCSV at runtime, and indexing was failing because the JAR-only install left it off the classpath.
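
A quick way to tell the two kinds of install apart is to look for the OpenCSV JAR next to the plugin JAR. The sketch below is only an illustration: the helper name has_opencsv_jar is made up, and both the plugin directory (plugins/river-csv) and the file name pattern (opencsv*.jar) are assumptions about how the ZIP unpacks, not something confirmed by the plugin's documentation.

```shell
# Hypothetical helper: succeeds if the given plugin directory contains
# a bundled OpenCSV JAR (assumed to be named opencsv*.jar).
has_opencsv_jar() {
    plugin_dir=$1
    # A missing directory counts as "not bundled".
    [ -d "$plugin_dir" ] || return 1
    # List matching JARs at the top level; grep -q . succeeds on any output.
    find "$plugin_dir" -maxdepth 1 -type f -name 'opencsv*.jar' | grep -q .
}

# Example call (the path is a placeholder for your own install):
#   has_opencsv_jar /path-to/elasticsearch-0.90.1/plugins/river-csv \
#       && echo "opencsv is bundled" \
#       || echo "opencsv is missing"
```

If the check fails on an install made from the bare JAR, that would be consistent with the NoClassDefFoundError above.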

@brightcode

This also worked for me:

git clone https://github.com/xxBedy/elasticsearch-river-csv.git
cd elasticsearch-river-csv
mvn package
/path-to/elasticsearch-0.90.10/bin/plugin --install river-csv --url file:///path-to/elasticsearch-river-csv/target/releases/elasticsearch-river-csv-1.0.1.zip
