Created
August 23, 2010 16:15
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
11:49 <sfalcon> Using the Erlang pb client and a local dev cluster, | |
I'm having a problem where a write (seems to) succeed, but then I get | |
a notfound error when trying to fetch the doc. | |
11:50 <sfalcon> I tried passing [{w, 3}] in my call to riakc_pb_socket:put, | |
but I still get the notfounds ~7/10 tries. | |
11:51 <drev1> same connection for both requests/ | |
11:51 <drev1> ? | |
11:51 <sfalcon> drev1: yes. | |
11:52 <drev1> what version of the pb client? | |
11:53 <benblack> how long after writing are you reading and with what r? | |
11:55 <sfalcon> This is pb client 0.2.0 but pulled from tip via rebar dep. The | |
read request directly follows the write request. Using the default r value for the pb client. | |
11:56 <sfalcon> drev1: pb client at 75:c728782d2790 | |
11:58 <benblack> sfalcon: if you insert a 1 second delay before the read do you see | |
the same behavior? | |
12:01 <sfalcon> benblack: timer:sleep(1000) does not eliminate the problem. I must be doing | |
something dumb and am just not seeing it. There is a bit more relevant context... | |
12:02 <drev1> pastie of the put request? | |
12:02 <sfalcon> Running an integration test via eunit. I have two test functions. | |
Each creates its own pb client using riakc_pb_socket:start(). | |
12:03 <sfalcon> Each test writes to same bucket, but different keys and deletes the doc | |
at start of test. | |
12:03 <sfalcon> If I comment out one test, the other passes. Only see the failures | |
when they both run? WTF? | |
12:06 <sfalcon> drev1: http://gist.github.com/540832 | |
12:11 <drev1> sfalcon: anyway you could provide code that reproduces the issue? | |
12:17 <sfalcon> drev1: let me see what I can do. I'll ping you in a bit. | |
12:19 <sfalcon> In the meantime, is there a way to get more verbose logging info? | |
I'm looking at dev/dev1/log/erlang.log.X but not seeing much of interest. | |
12:20 <seancribbs> sfalcon: sasl-error.log will have more detail about crashes | |
12:21 <seancribbs> but often the standard error log will be informative enough | |
12:21 <seancribbs> run_erl does some weird stuff with its log rotation | |
12:22 <sfalcon> seancribbs: is there something more along the lines of access | |
logging? Was hoping to see the PUTs and GETs go by. | |
12:22 <sfalcon> (even though I'm using the pb interface) | |
12:23 <seancribbs> sfalcon: you can get stats, that's the best we have right now | |
12:50 <sfalcon> I see that I drev1 has left, but if someone else is available to take a look, | |
I have a small repro case. Perhaps someone can see where I'm going wrong in my assumptions... | |
12:50 <sfalcon> http://github.com/seth/testriak | |
12:51 <sfalcon> If you clone that repo and do "make && make test" you should see it. | |
Assumes a running riak cluster w/ pb at 8081 | |
12:55 <seancribbs> try closing the client at the end of each test, or keeping | |
it in the process dictionary. | |
12:56 <seancribbs> bunch of use are leaving now for the airport | |
12:56 <seancribbs> s/use/us/ | |
12:56 <seancribbs> l8r | |
12:56 <sfalcon> safe travels! | |
13:12 <dizzyd> sfalcon: ping | |
13:14 <dizzyd> gah, brb | |
13:17 <sfalcon> dizzyd: k, I'm around... | |
13:25 <dizzyd> sfalcon: investigating your issue | |
13:28 <dizzyd> sfalcon: so make test should break? | |
13:28 <dizzyd> it runs fine for me | |
13:29 <sfalcon> dizzyd: even when you run it a few times? I get working/not-working | |
13:30 <dizzyd> while (true); do make test;done | |
13:30 <dizzyd> that ran a couple of hundred iterations before I ctl-c | |
13:30 <sfalcon> dizzyd: I see more failures than successes, so something else | |
must be different about my environment. Does the test code seem reasonable? | |
13:31 <dizzyd> insofar as i can tell | |
13:31 <sfalcon> dizzyd: you can tell me, I'm not a doctor :-P | |
13:31 <dizzyd> ? | |
13:31 <sfalcon> dizzyd: sorry my sense of humor doesn't always translate to irc | |
13:32 <dizzyd> the only thing i would note | |
13:32 <dizzyd> is that you're spinning up a lot of sockets | |
13:32 <dizzyd> and never cleaning up | |
13:32 <dizzyd> so maybe with a low enough ulimit | |
13:32 <dizzyd> you might see some issues | |
13:33 <sfalcon> Per a suggestion from seancribbs, I modified to reuse the client by | |
stuffing it in the process dictionary. That didn't make a difference for me. | |
13:33 <sfalcon> How would I cleanup a client? | |
13:34 <dizzyd> i would do something like | |
13:34 <dizzyd> stop(Pid) -> | |
13:34 <dizzyd> Mref = erlang:monitor(process, Pid), | |
13:34 <dizzyd> receive | |
13:34 <dizzyd> {'DOWN', Mref, _, _, _} -> ok | |
13:34 <dizzyd> end. | |
13:34 <dizzyd> er | |
13:34 <dizzyd> after Mref... | |
13:34 <dizzyd> kill(Pid, normal). | |
13:35 <* dizzyd> fails at coding apparently | |
13:36 <sfalcon> ok, that's useful. Once the test ends and the vm running the | |
test shuts down, those sockets are cleaned of course... | |
13:36 <sfalcon> (I think) | |
13:36 <dizzyd> sure | |
13:37 <sfalcon> So the works for you and not for me seems to point to my local | |
riak cluster install? | |
13:37 <dizzyd> yup | |
13:37 <dizzyd> i'm just running a single riak node | |
13:37 <dizzyd> fwiw | |
13:38 <sfalcon> I'm running on OS X SL, cloned riak via git, did make and make devlrel. | |
Started dev1-3 and joined them. Client is always talking to dev1 node. | |
13:38 <sfalcon> Can you try with a 3-node cluster? | |
13:38 <dizzyd> sure | |
13:39 <dizzyd> wait a sec | |
13:39 <dizzyd> your riak_ip is set to 127.0.0.1 | |
13:39 <dizzyd> are you running dev1 on loopback? | |
13:39 <dizzyd> n/m | |
13:39 <dizzyd> i'm just confused | |
13:40 acts_as_ joined | |
13:41 <dizzyd> hah | |
13:41 <dizzyd> tr: rt_3_test...*failed* | |
13:41 <dizzyd> ::error:{assertMatch_failed,[{module,tr}, | |
13:41 <dizzyd> {line,66}, | |
13:41 <dizzyd> {expression,"get_stuff ( Pid , Id )"}, | |
13:41 <dizzyd> {expected,"{ stuff , [ [ x , y , z ] ] }"}, | |
13:41 <dizzyd> {value,{error,notfound}}]} | |
13:41 <dizzyd> in function tr:'-rt_3_test/0-fun-0-'/1 | |
13:41 <dizzyd> ======================================================= | |
13:41 <dizzyd> Failed: 1. Skipped: 0. Passed: 2. | |
13:41 <dizzyd> is that the error you see? | |
13:41 <sfalcon> dizzyd: yup. I get that failure ~70% or more of the time, but if you run a | |
few times you will see successes too. | |
13:43 <dizzyd> hmmm | |
13:43 <dizzyd> ok, i'm on it | |
13:51 <sfalcon> dizzyd: Thanks for looking. Any ideas or things I can help investigate? | |
13:55 <dizzyd> not at the moment | |
13:59 <dizzyd> ah | |
13:59 <dizzyd> i think i know what's going on | |
13:59 <dizzyd> you specify a {w, 3} on puts | |
14:00 <dizzyd> but not the same on delete | |
14:00 <dizzyd> so, theoretically | |
14:00 <dizzyd> the delete returns before all the nodes have acknowledge | |
14:02 <dizzyd> yeah | |
14:02 <dizzyd> welcome the land of inconsistency | |
14:02 <dizzyd> the quorom on your deletes is less than that of your writes | |
14:02 <dizzyd> if you set [{rw, 3}] | |
14:02 <dizzyd> on your delete | |
14:02 <dizzyd> things will work as expected | |
14:03 <sfalcon> dizzyd: but if the delete didn't go through then wouldn't I get back data | |
instead of notfound? | |
14:04 <dizzyd> bah | |
14:04 <* dizzyd> discards his theory | |
14:05 <dizzyd> setting it helps | |
14:05 <dizzyd> but not completely | |
14:05 <dizzyd> something wonky is going on here | |
14:05 <sfalcon> I started using just the defaults in the pb client. Then added the {w, 3} | |
while trying to understand what was happening. | |
14:05 <dizzyd> i'm afraid you've got the Friday Afternoon Dizzy™ | |
14:06 <sfalcon> 8-) | |
14:08 <sfalcon> Wonder if I'm running into N=3, but not always onto 3 separate real nodes | |
14:12 <sfalcon> be back in a few... going to grab some lunch | |
15:07 <dizzyd> sfalcon: ping | |
15:07 <sfalcon> dizzyd: pong | |
15:07 <dizzyd> so... | |
15:08 <dizzyd> what's happening is that the deletion is mildly async | |
15:08 <dizzyd> thus, the return from delete CAN occur before the deletion is complete | |
15:08 <dizzyd> and the put MAY run before the completion | |
15:08 <dizzyd> so you wind up with | |
15:08 <sfalcon> oooh, that's a good one :-P | |
15:08 <dizzyd> delete (pt 1)/put/delete (pt2)/get | |
15:09 <dizzyd> it is intentional though | |
15:09 <dizzyd> but that doesn't make it Right | |
15:09 <sfalcon> certainly... subtle | |
15:10 <benblack> i think crucially hit the same issue | |
15:12 <seancribbs> benblack: he did | |
15:12 <seancribbs> we have it documented | |
15:13 <dizzyd> seancribbs: do tell, where? | |
15:14 <seancribbs> i would tell you if my 3G weren't dog slow in the airport. one min | |
15:14 <dizzyd> heh, thx | |
15:17 allen_ joined | |
15:20 <sfalcon> dizzyd: thanks for helping to track that down. I can easily work | |
around that as I was only doing deletes to have a clean test environment. Will be interested | |
to read up on what's been documented and to understand more. | |
15:20 seancribbs joined | |
15:21 <seancribbs> dizzyd: https://issues.basho.com/show_bug.cgi?id=260 | |
15:21 <dizzyd> thx | |
15:30 <* dizzyd> ponders | |
15:30 <dizzyd> i would argue that a client should always submit a vclock when doing a delete | |
15:30 <dizzyd> otherwise you get weird behaviour like this | |
15:31 <seancribbs> dizzyd: +1 | |
15:31 <seancribbs> that is, if you want to care about it | |
15:31 <seancribbs> right now the delete resource does it on behalf of the client | |
15:31 <seancribbs> it has lots of race-condition potential | |
15:32 <dizzyd> i think if you _don't_ care about it you lose data | |
15:32 <dizzyd> and that's not acceptable | |
15:32 <dizzyd> consider | |
15:32 <dizzyd> network split | |
15:32 <dizzyd> one side does a delete | |
15:32 <dizzyd> other doesn't | |
15:32 <dizzyd> with our current setup | |
15:32 <dizzyd> hum | |
15:32 <dizzyd> sticky | |
15:32 <seancribbs> dizzyd: right but that wouldn't read-repair-reap | |
15:32 <seancribbs> well, depending on when the split happens | |
15:32 <dizzyd> yes | |
15:33 <dizzyd> cardinal rule of race conditions: if they _can_ occur, they will occur in production. | |
15:33 <seancribbs> indeed | |
15:34 <seancribbs> sounds like we need some state machine QC on taht | |
15:34 <* seancribbs> waves his QC flag | |
15:34 <dizzyd> you'd really need to use QC and Pulse together | |
15:34 <dizzyd> to reproduce that one | |
15:34 <seancribbs> right | |
15:34 <dizzyd> since you'll have to monkey with scheduling | |
15:35 <seancribbs> i suppose it depends on how you mock the vnode master | |
15:35 <seancribbs> we got away with no statem on the get_fsm | |
15:35 <seancribbs> but this is multiple fsms i guess | |
15:35 <dizzyd> yup | |
15:35 <dizzyd> it'd be a tricky test…my fav kind :) | |
15:35 <seancribbs> yay |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment