@karussell
Last active October 30, 2023 16:14
ElasticSearch from SQL DB
Why is there no DataImportHandler (DIH) equivalent in ElasticSearch? Well, because:
1. You should really consider writing your own scripts
(be it jvm based, perl, ruby, php, nodejs/javascript)
to feed ElasticSearch via bulk indexing:
http://www.elasticsearch.org/guide/reference/java-api/bulk.html
2. There are two projects doing it already:
* http://code.google.com/p/sql-to-nosql-importer/
* https://github.com/Aconex/scrutineer (keeps the DB in sync with ES or Solr!)
3. In theory you could use DIH in ElasticSearch as well ;)
http://www.mattweber.org/2011/12/14/elasticsearch-mock-solr-plugin/
4. In MySQL you could add an HTTP trigger:
http://code.google.com/p/mysql-udf-http
5. You could use hydra and start with the 'correct' & scalable
solution: https://github.com/Findwise/Hydra
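
For point 1, a minimal Python sketch of such a script could look like the following. It is only an illustration under assumptions: the SQL side uses an in-memory SQLite table as a stand-in for your real database, the table name `articles`, the `id` column, and the `http://localhost:9200` address are all hypothetical, and the helper names are made up for this example. The only ElasticSearch-specific part is the newline-delimited JSON body that the `_bulk` endpoint expects (older ElasticSearch versions may also want a `_type` in the action line).

```python
import json
import sqlite3
import urllib.request


def fetch_rows(conn, table):
    """Read all rows of a table as sqlite3.Row objects (hypothetical helper)."""
    conn.row_factory = sqlite3.Row
    # The table name is interpolated directly, so only pass trusted names.
    return conn.execute(f"SELECT * FROM {table}").fetchall()


def rows_to_bulk_body(rows, index_name, id_column="id"):
    """Build a newline-delimited JSON body for the _bulk endpoint.

    Every document becomes two lines: an action line and the source line.
    """
    lines = []
    for row in rows:
        doc = dict(row)
        action = {"index": {"_index": index_name, "_id": str(doc[id_column])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(body, base_url="http://localhost:9200"):
    """POST the body to _bulk; base_url is an assumed local node."""
    request = urllib.request.Request(
        base_url + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Stand-in for the real SQL DB: a tiny in-memory table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO articles VALUES (1, 'hello'), (2, 'world')")

    body = rows_to_bulk_body(fetch_rows(conn, "articles"), "articles")
    print(body)
    # send_bulk(body)  # uncomment once an ElasticSearch node is reachable
```

For a large table you would probably fetch and send the rows in batches instead of building one big body.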
@karussell
Author

Someone on Twitter (@joelwes) suggested the ManifoldCF project for the SQL part. Have a look:

http://incubator.apache.org/connectors/en_US/index.html

@thg303

thg303 commented Aug 8, 2017

@karussell They are out of the incubator now: https://manifoldcf.apache.org/
And as the documentation says:

Apache ManifoldCF is an effort to provide an open source framework for connecting source content repositories like Microsoft Sharepoint and EMC Documentum, to target repositories or indexes, such as Apache Solr, Open Search Server, or ElasticSearch
