@PharkMillups
Created August 23, 2010 16:15
11:49 <sfalcon> Using the Erlang pb client and a local dev cluster,
I'm having a problem where a write (seems to) succeed, but then I get
a notfound error when trying to fetch the doc.
11:50 <sfalcon> I tried passing [{w, 3}] in my call to riakc_pb_socket:put,
but I still get the notfounds ~7/10 tries.
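
For reference, the calls being discussed look roughly like this (a minimal sketch; host, port, bucket, and key are illustrative, assuming the PB interface of a local dev node):

    {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
    Obj = riakc_obj:new(<<"bucket">>, <<"key">>, <<"value">>),
    ok = riakc_pb_socket:put(Pid, Obj, [{w, 3}]),
    %% The immediate read-back; this is the call that intermittently
    %% came back as {error, notfound} in the scenario above.
    riakc_pb_socket:get(Pid, <<"bucket">>, <<"key">>).
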
11:51 <drev1> same connection for both requests?
11:51 <sfalcon> drev1: yes.
11:52 <drev1> what version of the pb client?
11:53 <benblack> how long after writing are you reading and with what r?
11:55 <sfalcon> This is pb client 0.2.0 but pulled from tip via rebar dep. The
read request directly follows the write request. Using the default r value for the pb client.
11:56 <sfalcon> drev1: pb client at 75:c728782d2790
11:58 <benblack> sfalcon: if you insert a 1 second delay before the read do you see
the same behavior?
12:01 <sfalcon> benblack: timer:sleep(1000) does not eliminate the problem. I must be doing
something dumb and am just not seeing it. There is a bit more relevant context...
12:02 <drev1> pastie of the put request?
12:02 <sfalcon> Running an integration test via eunit. I have two test functions.
Each creates its own pb client using riakc_pb_socket:start().
12:03 <sfalcon> Each test writes to same bucket, but different keys and deletes the doc
at start of test.
12:03 <sfalcon> If I comment out one test, the other passes. Only see the failures
when they both run? WTF?
12:06 <sfalcon> drev1: http://gist.github.com/540832
12:11 <drev1> sfalcon: any way you could provide code that reproduces the issue?
12:17 <sfalcon> drev1: let me see what I can do. I'll ping you in a bit.
12:19 <sfalcon> In the meantime, is there a way to get more verbose logging info?
I'm looking at dev/dev1/log/erlang.log.X but not seeing much of interest.
12:20 <seancribbs> sfalcon: sasl-error.log will have more detail about crashes
12:21 <seancribbs> but often the standard error log will be informative enough
12:21 <seancribbs> run_erl does some weird stuff with its log rotation
12:22 <sfalcon> seancribbs: is there something more along the lines of access
logging? Was hoping to see the PUTs and GETs go by.
12:22 <sfalcon> (even though I'm using the pb interface)
12:23 <seancribbs> sfalcon: you can get stats, that's the best we have right now
12:50 <sfalcon> I see that drev1 has left, but if someone else is available to take a look,
I have a small repro case. Perhaps someone can see where I'm going wrong in my assumptions...
12:50 <sfalcon> http://github.com/seth/testriak
12:51 <sfalcon> If you clone that repo and do "make && make test" you should see it.
Assumes a running riak cluster w/ pb at 8081
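
The rough shape of that repro, as a sketch (not the actual testriak code; module name, bucket, and keys are made up). Each test starts its own PB client, deletes its key for a clean slate, then writes with {w, 3} and reads back:

    -module(tr_sketch).
    -include_lib("eunit/include/eunit.hrl").

    one_test() ->
        {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
        _ = riakc_pb_socket:delete(Pid, <<"b">>, <<"k1">>),
        ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k1">>, <<"v1">>), [{w, 3}]),
        ?assertMatch({ok, _}, riakc_pb_socket:get(Pid, <<"b">>, <<"k1">>)).

    two_test() ->
        {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
        _ = riakc_pb_socket:delete(Pid, <<"b">>, <<"k2">>),
        ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k2">>, <<"v2">>), [{w, 3}]),
        ?assertMatch({ok, _}, riakc_pb_socket:get(Pid, <<"b">>, <<"k2">>)).
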
12:55 <seancribbs> try closing the client at the end of each test, or keeping
it in the process dictionary.
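
One way to do the process-dictionary variant suggested there (a sketch; the helper name and dictionary key are made up):

    client() ->
        case get(riak_client) of
            undefined ->
                {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
                put(riak_client, Pid),
                Pid;
            Pid ->
                Pid
        end.
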
12:56 <seancribbs> bunch of us are leaving now for the airport
12:56 <seancribbs> l8r
12:56 <sfalcon> safe travels!
13:12 <dizzyd> sfalcon: ping
13:14 <dizzyd> gah, brb
13:17 <sfalcon> dizzyd: k, I'm around...
13:25 <dizzyd> sfalcon: investigating your issue
13:28 <dizzyd> sfalcon: so make test should break?
13:28 <dizzyd> it runs fine for me
13:29 <sfalcon> dizzyd: even when you run it a few times? I get working/not-working
13:30 <dizzyd> while (true); do make test;done
13:30 <dizzyd> that ran a couple of hundred iterations before I ctl-c
13:30 <sfalcon> dizzyd: I see more failures than successes, so something else
must be different about my environment. Does the test code seem reasonable?
13:31 <dizzyd> insofar as i can tell
13:31 <sfalcon> dizzyd: you can tell me, I'm not a doctor :-P
13:31 <dizzyd> ?
13:31 <sfalcon> dizzyd: sorry my sense of humor doesn't always translate to irc
13:32 <dizzyd> the only thing i would note
13:32 <dizzyd> is that you're spinning up a lot of sockets
13:32 <dizzyd> and never cleaning up
13:32 <dizzyd> so maybe with a low enough ulimit
13:32 <dizzyd> you might see some issues
13:33 <sfalcon> Per a suggestion from seancribbs, I modified to reuse the client by
stuffing it in the process dictionary. That didn't make a difference for me.
13:33 <sfalcon> How would I cleanup a client?
13:34 <dizzyd> i would do something like
13:34 <dizzyd> stop(Pid) ->
13:34 <dizzyd> Mref = erlang:monitor(process, Pid),
13:34 <dizzyd> receive
13:34 <dizzyd> {'DOWN', Mref, _, _, _} -> ok
13:34 <dizzyd> end.
13:34 <dizzyd> er
13:34 <dizzyd> after Mref...
13:34 <dizzyd> kill(Pid, normal).
13:35 <* dizzyd> fails at coding apparently
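
Roughly what that snippet was aiming for, reordered so the monitor is set up before the client is told to stop (a sketch; exit/2 is used with a non-normal reason because a plain 'normal' exit signal is ignored by a process that isn't trapping exits, and later client releases also expose riakc_pb_socket:stop/1, which is cleaner where available):

    stop(Pid) ->
        Mref = erlang:monitor(process, Pid),
        exit(Pid, shutdown),
        receive
            {'DOWN', Mref, _, _, _} -> ok
        end.
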
13:36 <sfalcon> ok, that's useful. Once the test ends and the vm running the
test shuts down, those sockets are cleaned of course...
13:36 <sfalcon> (I think)
13:36 <dizzyd> sure
13:37 <sfalcon> So the "works for you, not for me" difference seems to point to my local
riak cluster install?
13:37 <dizzyd> yup
13:37 <dizzyd> i'm just running a single riak node
13:37 <dizzyd> fwiw
13:38 <sfalcon> I'm running on OS X SL, cloned riak via git, did make and make devrel.
Started dev1-3 and joined them. Client is always talking to dev1 node.
13:38 <sfalcon> Can you try with a 3-node cluster?
13:38 <dizzyd> sure
13:39 <dizzyd> wait a sec
13:39 <dizzyd> your riak_ip is set to 127.0.0.1
13:39 <dizzyd> are you running dev1 on loopback?
13:39 <dizzyd> n/m
13:39 <dizzyd> i'm just confused
13:40 acts_as_ joined
13:41 <dizzyd> hah
13:41 <dizzyd> tr: rt_3_test...*failed*
13:41 <dizzyd> ::error:{assertMatch_failed,[{module,tr},
13:41 <dizzyd> {line,66},
13:41 <dizzyd> {expression,"get_stuff ( Pid , Id )"},
13:41 <dizzyd> {expected,"{ stuff , [ [ x , y , z ] ] }"},
13:41 <dizzyd> {value,{error,notfound}}]}
13:41 <dizzyd> in function tr:'-rt_3_test/0-fun-0-'/1
13:41 <dizzyd> =======================================================
13:41 <dizzyd> Failed: 1. Skipped: 0. Passed: 2.
13:41 <dizzyd> is that the error you see?
13:41 <sfalcon> dizzyd: yup. I get that failure ~70% or more of the time, but if you run a
few times you will see successes too.
13:43 <dizzyd> hmmm
13:43 <dizzyd> ok, i'm on it
13:51 <sfalcon> dizzyd: Thanks for looking. Any ideas or things I can help investigate?
13:55 <dizzyd> not at the moment
13:59 <dizzyd> ah
13:59 <dizzyd> i think i know what's going on
13:59 <dizzyd> you specify a {w, 3} on puts
14:00 <dizzyd> but not the same on delete
14:00 <dizzyd> so, theoretically
14:00 <dizzyd> the delete returns before all the nodes have acknowledged it
14:02 <dizzyd> yeah
14:02 <dizzyd> welcome to the land of inconsistency
14:02 <dizzyd> the quorum on your deletes is less than that of your writes
14:02 <dizzyd> if you set [{rw, 3}]
14:02 <dizzyd> on your delete
14:02 <dizzyd> things will work as expected
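
In client terms, the suggestion is to give the delete the same quorum strictness as the put (a sketch; bucket and key are illustrative):

    {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
    ok = riakc_pb_socket:delete(Pid, <<"b">>, <<"k1">>, [{rw, 3}]),
    ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k1">>, <<"v1">>), [{w, 3}]).
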
14:03 <sfalcon> dizzyd: but if the delete didn't go through then wouldn't I get back data
instead of notfound?
14:04 <dizzyd> bah
14:04 <* dizzyd> discards his theory
14:05 <dizzyd> setting it helps
14:05 <dizzyd> but not completely
14:05 <dizzyd> something wonky is going on here
14:05 <sfalcon> I started using just the defaults in the pb client. Then added the {w, 3}
while trying to understand what was happening.
14:05 <dizzyd> i'm afraid you've got the Friday Afternoon Dizzy™
14:06 <sfalcon> 8-)
14:08 <sfalcon> Wonder if I'm running into N=3, but not always onto 3 separate real nodes
14:12 <sfalcon> be back in a few... going to grab some lunch
15:07 <dizzyd> sfalcon: ping
15:07 <sfalcon> dizzyd: pong
15:07 <dizzyd> so...
15:08 <dizzyd> what's happening is that the deletion is mildly async
15:08 <dizzyd> thus, the return from delete CAN occur before the deletion is complete
15:08 <dizzyd> and the put MAY run before the completion
15:08 <dizzyd> so you wind up with
15:08 <sfalcon> oooh, that's a good one :-P
15:08 <dizzyd> delete (pt 1) / put / delete (pt 2) / get
15:09 <dizzyd> it is intentional though
15:09 <dizzyd> but that doesn't make it Right
15:09 <sfalcon> certainly... subtle
15:10 <benblack> i think crucially hit the same issue
15:12 <seancribbs> benblack: he did
15:12 <seancribbs> we have it documented
15:13 <dizzyd> seancribbs: do tell, where?
15:14 <seancribbs> i would tell you if my 3G weren't dog slow in the airport. one min
15:14 <dizzyd> heh, thx
15:17 allen_ joined
15:20 <sfalcon> dizzyd: thanks for helping to track that down. I can easily work
around that as I was only doing deletes to have a clean test environment. Will be interested
to read up on what's been documented and to understand more.
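
The workaround mentioned above, sketched out: rather than deleting to get a clean slate, use a key that is unique per test run so the delete/put interleaving never comes into play (the key scheme here is made up):

    unique_key(Base) ->
        {Mega, Sec, Micro} = erlang:now(),
        iolist_to_binary([Base, "-",
                          integer_to_list(Mega),
                          integer_to_list(Sec),
                          integer_to_list(Micro)]).
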
15:20 seancribbs joined
15:21 <seancribbs> dizzyd: https://issues.basho.com/show_bug.cgi?id=260
15:21 <dizzyd> thx
15:30 <* dizzyd> ponders
15:30 <dizzyd> i would argue that a client should always submit a vclock when doing a delete
15:30 <dizzyd> otherwise you get weird behaviour like this
15:31 <seancribbs> dizzyd: +1
15:31 <seancribbs> that is, if you want to care about it
15:31 <seancribbs> right now the delete resource does it on behalf of the client
15:31 <seancribbs> it has lots of race-condition potential
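
The idea being floated, as a sketch: fetch first, take the object's vector clock, and hand it to the delete so only the version the client has actually seen is removed. Note the client version discussed here may not expose a vclock-carrying delete; later riak-erlang-client releases do (delete_vclock/4), which is what this assumes:

    delete_seen(Pid, Bucket, Key) ->
        case riakc_pb_socket:get(Pid, Bucket, Key) of
            {ok, Obj} ->
                VClock = riakc_obj:vclock(Obj),
                riakc_pb_socket:delete_vclock(Pid, Bucket, Key, VClock);
            {error, notfound} ->
                ok
        end.
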
15:32 <dizzyd> i think if you _don't_ care about it you lose data
15:32 <dizzyd> and that's not acceptable
15:32 <dizzyd> consider
15:32 <dizzyd> network split
15:32 <dizzyd> one side does a delete
15:32 <dizzyd> other doesn't
15:32 <dizzyd> with our current setup
15:32 <dizzyd> hum
15:32 <dizzyd> sticky
15:32 <seancribbs> dizzyd: right but that wouldn't read-repair-reap
15:32 <seancribbs> well, depending on when the split happens
15:32 <dizzyd> yes
15:33 <dizzyd> cardinal rule of race conditions: if they _can_ occur, they will occur in production.
15:33 <seancribbs> indeed
15:34 <seancribbs> sounds like we need some state machine QC on that
15:34 <* seancribbs> waves his QC flag
15:34 <dizzyd> you'd really need to use QC and Pulse together
15:34 <dizzyd> to reproduce that one
15:34 <seancribbs> right
15:34 <dizzyd> since you'll have to monkey with scheduling
15:35 <seancribbs> i suppose it depends on how you mock the vnode master
15:35 <seancribbs> we got away with no statem on the get_fsm
15:35 <seancribbs> but this is multiple fsms i guess
15:35 <dizzyd> yup
15:35 <dizzyd> it'd be a tricky test…my fav kind :)
15:35 <seancribbs> yay