Created
October 29, 2010 23:33
09:11 <reiddraper> morning all
09:14 <reiddraper> I've got a riak search question, if anyone's around
09:22 <chids> I'm around but I'm not sure I can help - haven't really dug into search yet. But feel
free to fire away :)
09:24 <reiddraper> ok -- with solr/lucene, you have to call `commit` for added/updated documents
to be retrievable via a query, this doesn't seem to be the case with riak search, but
i wanted to verify
09:24 <chids> I'm pretty sure that's not the case
09:24 <chids> Riak Search plugs into the Riak K/V store using the post-commit hook
09:25 <chids> As to at what stage the "search engine" itself commits data to the index, I have no idea.
09:25 <reiddraper> right, ok
09:25 <chids> Info on pre/post commit hooks can be found here: https://wiki.basho.com/display/RIAK/Pre-+and+Post-Commit+Hooks
09:27 <reiddraper> thanks
09:27 <chids> I would simply assume that when you've stored a document it "immediately" becomes
available in the index.
09:27 <chids> Immediately being as soon as the analyzer/indexer has done its work
09:28 <reiddraper> that's how it appears in the riak search tutorial, but because i know
it uses lucene under the hood, I was curious about some of those inner mechanisms
09:28 <chids> Then I'm afraid I'm not able to help you
09:30 <reiddraper> no problem
09:30 <jlouis> I would expect eventual indexing. That is, after a store and some time elapsed, the new
document will be in the index.
09:31 <chids> jlouis: Absolutely. First it has to be processed. Then there's *probably* the
possibility of a delay for distribution between nodes in the cluster.
09:36 <pharkmillups> reiddraper: rlophaus would be the person to handle those types of questions.
He's not around at the moment - neck deep in some search code :) Your best bet is to use the Riak Mailing List.
09:37 <pharkmillups> s/rlophaus/rklophaus
09:37 <reiddraper> thanks
10:03 <reiddraper> is it possible to use riak for map-reduce processing, rather than queries?
for example, instead of returning the results of map, i'd like to just store the result in another key?
10:12 <justinsheehy> reiddraper: re your earlier question, the indexing is near-real-time. data
is indexed incrementally, not in batch commits.
10:12 <justinsheehy> and thus should be searchable very shortly after the KV storage operation
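The behaviour described so far (a post-commit hook feeding an index that is updated one document at a time, with no separate `commit` step) can be sketched in Python. This is a toy illustration under stated assumptions, not how Riak Search is implemented: `TinyStore`, `TinyIndex`, and the whitespace tokenizer are hypothetical names invented for the example, and the hook here runs synchronously, whereas the log only says indexing is near-real-time.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index updated one document at a time (hypothetical)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> keys containing that term
        self.terms_by_key = {}            # key -> terms currently indexed for it

    def index(self, key, text):
        # Drop postings left over from any previous version of this key.
        for term in self.terms_by_key.get(key, set()):
            self.postings[term].discard(key)
        # Naive analyzer: lowercase and split on whitespace.
        terms = set(text.lower().split())
        for term in terms:
            self.postings[term].add(key)
        self.terms_by_key[key] = terms

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


class TinyStore:
    """Toy key/value store that runs post-commit hooks after each put."""

    def __init__(self):
        self.data = {}
        self.postcommit_hooks = []

    def put(self, key, value):
        self.data[key] = value
        for hook in self.postcommit_hooks:
            hook(key, value)


index = TinyIndex()
store = TinyStore()
store.postcommit_hooks.append(index.index)

store.put("doc1", "riak search indexes incrementally")
print(index.search("incrementally"))  # ['doc1'], no commit call needed

store.put("doc1", "updated text")
print(index.search("incrementally"))  # [], the old terms were replaced
```

Because each `put` updates only the postings for the affected key, there is no batch step to wait for in this sketch.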
10:13 <reiddraper> justinsheehy: thanks. and I guess you never need to call `optimize` on the index?
10:13 <justinsheehy> nope!
10:13 <reiddraper> that's awesome
10:14 <justinsheehy> it doesn't use lucene, by the way. it can use lucene's analyzer classes,
but lucene does not power the indexing or retrieval.
10:14 <justinsheehy> C-t
10:15 <reiddraper> ah, ok. I misunderstood that.
10:15 <justinsheehy> hard to get away from batch commits with lucene
10:15 <justinsheehy> on your map/reduce question, there is no built-in functionality to store
results instead of streaming them out.
10:15 <justinsheehy> could be done, but doesn't currently exist.
10:16 <reiddraper> ok, could be cool to be able to use riak like a hadoop cluster
10:19 <justinsheehy> reiddraper: could be cool indeed. wasn't part of the original idea,
and in fact bulk-processing at hadoop-like throughput requires different compromises than many of Riak's goals.
10:19 <justinsheehy> and so Hadoop and Riak are generally complementary tech, even if each
can do some of what the other is capable of.
10:20 <reiddraper> right, makes sense
10:23 <_sri> would be really cool if java was optional for riak-search
10:23 <reiddraper> justinsheehy: thanks for answering my questions
10:24 <justinsheehy> reiddraper: happy to help
10:25 <justinsheehy> _sri: it's almost optional now. there are some built-in native erlang
analyzers, but some other issues like the logic of whether or not to try to start the JVM
need some care before it can be fully optional.
10:26 <bingeldac> reiddraper: https://gist.github.com/1cfec81c2425e9d99d0a
10:26 <bingeldac> that is something I did with a customer, an erlang reduce phase to save the data
10:26 <bingeldac> not sure if it meets your needs, but I thought I would toss it out
10:28 <reiddraper> bingeldac: thanks. it's enough to know that you can create and save keys
with the erlang m/r api
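The idea bingeldac describes, a reduce phase that writes its output to a key instead of streaming it back, can be sketched in Python. This is a conceptual illustration only: the linked gist is written in Erlang, and nothing below is Riak's API. `run_map_reduce`, `count_words`, `save_totals`, and the dict standing in for the store are all hypothetical.

```python
from collections import Counter


def count_words(key, value):
    """Map phase: emit one Counter of word frequencies per stored object."""
    return [Counter(value.split())]


def save_totals(store, result_key):
    """Build a reduce phase that stores its result instead of returning it."""
    def reduce_phase(mapped):
        total = Counter()
        for counts in mapped:
            total.update(counts)
        store[result_key] = dict(total)  # side effect: write to another key
        return [result_key]              # hand back only the key that was written
    return reduce_phase


def run_map_reduce(store, keys, map_fn, reduce_fn):
    """Run the map function over each key, then reduce the combined output."""
    mapped = []
    for key in keys:
        mapped.extend(map_fn(key, store[key]))
    return reduce_fn(mapped)


# A plain dict stands in for the key/value store (assumption for the sketch).
store = {"a": "riak riak search", "b": "search index"}
written = run_map_reduce(store, ["a", "b"], count_words,
                         save_totals(store, "word_totals"))
print(written, store["word_totals"])
```

The caller receives only the name of the key that was written, and the aggregated data stays in the store for later reads.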
10:28 <_sri> justinsheehy: looking forward to it :)
10:29 <_sri> thanks for riak btw. very impressed so far
10:29 <justinsheehy> _sri: I should be clear, no one that I know of is focusing on making java optional
right now. it's not fundamentally hard but also not super high priority unless someone jumps on it.
10:30 <_sri> :/
10:30 <justinsheehy> _sri: glad you're enjoying it