Created
January 6, 2011 01:12
14:22 <sragu> I have a question about Riak Search
14:22 <sragu> The default indexing of XML documents uses the element name with parent path
14:22 <sragu> A list of elements in an XML document is indexed under a single name. How could I index them separately?
14:23 <sragu> How could I customize the index created for an XML document with Riak?
14:32 <rustyk> sragu: You have two choices… one is that you can do some preprocessing on the XML document before sending it to search, renaming fields to keep them distinct
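[Editor's sketch] The preprocessing option rustyk describes could look roughly like the following. This is a minimal Python illustration, not part of Riak Search: the function name `rename_repeated_elements` and the `_1`, `_2` suffix scheme are assumptions chosen for the example. It renames repeated sibling elements so that each one ends up under its own field name before the document is sent for indexing.

```python
import xml.etree.ElementTree as ET


def rename_repeated_elements(xml_text):
    """Suffix repeated sibling tags with a 1-based position.

    Hypothetical helper: the "_<n>" naming scheme is an assumption,
    not something Riak Search requires or provides.
    """
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        # Count how often each tag occurs among this parent's children.
        counts = {}
        for child in parent:
            counts[child.tag] = counts.get(child.tag, 0) + 1
        # Rename only the tags that occur more than once.
        seen = {}
        for child in parent:
            tag = child.tag
            if counts[tag] > 1:
                seen[tag] = seen.get(tag, 0) + 1
                child.tag = f"{tag}_{seen[tag]}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = "<doc><author>A</author><author>B</author><title>T</title></doc>"
    print(rename_repeated_elements(doc))
```

Elements that occur only once keep their original name, so existing queries against those fields would be unaffected.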
14:33 <rustyk> sragu: The other option is to create a custom extractor for your documents, but if you go down this route I would wait for the next release (0.14), as the current release has a tricky bug around this support
14:33 <rustyk> sragu: An extractor inspects your document and generates Field/Value pairs, so it can return whatever field names you would like.
14:43 <sragu> rustyk: Will the Lucene Analyzer do the same as the extractor?
14:46 <rustyk> sragu: no, the extractor takes a document and extracts Field/Value pairs, the analyzers take a Field/Value pair and convert the Value into tokens. They are different things, a two-stage process.
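[Editor's sketch] The two stages rustyk distinguishes can be illustrated with a small Python example. It is a conceptual sketch only and does not use Riak Search's extractor or analyzer interfaces: the names `extract`, `analyze`, and `index_terms`, the `_` path separator, and the lowercase word tokenizer are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET


def extract(xml_text):
    """Stage 1 (extractor): document -> list of (field, value) pairs.

    Field names are built from the parent path plus the element name;
    the "_" separator is an assumption for this sketch.
    """
    pairs = []

    def walk(element, path):
        name = f"{path}_{element.tag}" if path else element.tag
        text = (element.text or "").strip()
        if text:
            pairs.append((name, text))
        for child in element:
            walk(child, name)

    walk(ET.fromstring(xml_text), "")
    return pairs


def analyze(value):
    """Stage 2 (analyzer): one value -> list of lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())


def index_terms(xml_text):
    """Run both stages and return (field, token) pairs."""
    return [
        (field, token)
        for field, value in extract(xml_text)
        for token in analyze(value)
    ]


if __name__ == "__main__":
    print(index_terms("<doc><title>Riak Search</title></doc>"))
```

The extractor decides which field a value lands in; the analyzer only sees one value at a time and never changes field names, which is why customizing field names belongs in the first stage.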
14:48 <sragu> gotcha. thanks.
14:51 <sragu> rustyk: Can I write the extractor in Java? I see that the current extractor code is in Erlang.
14:52 <rustyk> sragu: There is no support for that yet. Currently your options are Erlang and JavaScript
14:56 <sragu> rustyk: Is writing a custom extractor the standard way of tackling this issue? Will future releases of Riak continue supporting this custom extractor feature?
14:59 <rustyk> sragu: Yes, the extractors were made extensible for exactly this reason. Future versions of Search will continue to support extractors, though there's always a chance that interfaces will change as we learn more about how people use them
15:52 <sragu> rustyk: Thanks