@PharkMillups
Created August 23, 2010 16:15
11:49 <sfalcon> Using the Erlang pb client and a local dev cluster,
I'm having a problem where a write (seems to) succeed, but then I get
a notfound error when trying to fetch the doc.
11:50 <sfalcon> I tried passing [{w, 3}] in my call to riakc_pb_socket:put,
but I still get the notfounds ~7/10 tries.
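
For reference, the calls being discussed look roughly like this (a minimal sketch; host, port, bucket, and key are illustrative, assuming the PB interface of a local dev node):

    {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
    Obj = riakc_obj:new(<<"bucket">>, <<"key">>, <<"value">>),
    ok = riakc_pb_socket:put(Pid, Obj, [{w, 3}]),
    %% The immediate read-back; this is the call that intermittently
    %% came back as {error, notfound} in the scenario above.
    riakc_pb_socket:get(Pid, <<"bucket">>, <<"key">>).
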
11:51 <drev1> same connection for both requests?
11:51 <sfalcon> drev1: yes.
11:52 <drev1> what version of the pb client?
11:53 <benblack> how long after writing are you reading and with what r?
11:55 <sfalcon> This is pb client 0.2.0 but pulled from tip via rebar dep. The
read request directly follows the write request. Using the default r value for the pb client.
11:56 <sfalcon> drev1: pb client at 75:c728782d2790
11:58 <benblack> sfalcon: if you insert a 1 second delay before the read do you see
the same behavior?
12:01 <sfalcon> benblack: timer:sleep(1000) does not eliminate the problem. I must be doing
something dumb and am just not seeing it. There is a bit more relevant context...
12:02 <drev1> pastie of the put request?
12:02 <sfalcon> Running an integration test via eunit. I have two test functions.
Each creates its own pb client using riakc_pb_socket:start().
12:03 <sfalcon> Each test writes to same bucket, but different keys and deletes the doc
at start of test.
12:03 <sfalcon> If I comment out one test, the other passes. Only see the failures
when they both run? WTF?
12:06 <sfalcon> drev1: http://gist.github.com/540832
12:11 <drev1> sfalcon: any way you could provide code that reproduces the issue?
12:17 <sfalcon> drev1: let me see what I can do. I'll ping you in a bit.
12:19 <sfalcon> In the meantime, is there a way to get more verbose logging info?
I'm looking at dev/dev1/log/erlang.log.X but not seeing much of interest.
12:20 <seancribbs> sfalcon: sasl-error.log will have more detail about crashes
12:21 <seancribbs> but often the standard error log will be informative enough
12:21 <seancribbs> run_erl does some weird stuff with its log rotation
12:22 <sfalcon> seancribbs: is there something more along the lines of access
logging? Was hoping to see the PUTs and GETs go by.
12:22 <sfalcon> (even though I'm using the pb interface)
12:23 <seancribbs> sfalcon: you can get stats, that's the best we have right now
12:50 <sfalcon> I see that drev1 has left, but if someone else is available to take a look,
I have a small repro case. Perhaps someone can see where I'm going wrong in my assumptions...
12:50 <sfalcon> http://github.com/seth/testriak
12:51 <sfalcon> If you clone that repo and do "make && make test" you should see it.
Assumes a running riak cluster w/ pb at 8081
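
The rough shape of that repro, as a sketch (not the actual testriak code; module name, bucket, and keys are made up). Each test starts its own PB client, deletes its key for a clean slate, then writes with {w, 3} and reads back:

    -module(tr_sketch).
    -include_lib("eunit/include/eunit.hrl").

    one_test() ->
        {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
        _ = riakc_pb_socket:delete(Pid, <<"b">>, <<"k1">>),
        ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k1">>, <<"v1">>), [{w, 3}]),
        ?assertMatch({ok, _}, riakc_pb_socket:get(Pid, <<"b">>, <<"k1">>)).

    two_test() ->
        {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
        _ = riakc_pb_socket:delete(Pid, <<"b">>, <<"k2">>),
        ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k2">>, <<"v2">>), [{w, 3}]),
        ?assertMatch({ok, _}, riakc_pb_socket:get(Pid, <<"b">>, <<"k2">>)).
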
12:55 <seancribbs> try closing the client at the end of each test, or keeping
it in the process dictionary.
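
One way to do the process-dictionary variant suggested there (a sketch; the helper name and dictionary key are made up):

    client() ->
        case get(riak_client) of
            undefined ->
                {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
                put(riak_client, Pid),
                Pid;
            Pid ->
                Pid
        end.
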
12:56 <seancribbs> bunch of us are leaving now for the airport
12:56 <seancribbs> l8r
12:56 <sfalcon> safe travels!
13:12 <dizzyd> sfalcon: ping
13:14 <dizzyd> gah, brb
13:17 <sfalcon> dizzyd: k, I'm around...
13:25 <dizzyd> sfalcon: investigating your issue
13:28 <dizzyd> sfalcon: so make test should break?
13:28 <dizzyd> it runs fine for me
13:29 <sfalcon> dizzyd: even when you run it a few times? I get working/not-working
13:30 <dizzyd> while (true); do make test;done
13:30 <dizzyd> that ran a couple of hundred iterations before I ctl-c
13:30 <sfalcon> dizzyd: I see more failures than successes, so something else
must be different about my environment. Does the test code seem reasonable?
13:31 <dizzyd> insofar as i can tell
13:31 <sfalcon> dizzyd: you can tell me, I'm not a doctor :-P
13:31 <dizzyd> ?
13:31 <sfalcon> dizzyd: sorry my sense of humor doesn't always translate to irc
13:32 <dizzyd> the only thing i would note
13:32 <dizzyd> is that you're spinning up a lot of sockets
13:32 <dizzyd> and never cleaning up
13:32 <dizzyd> so maybe with a low enough ulimit
13:32 <dizzyd> you might see some issues
13:33 <sfalcon> Per a suggestion from seancribbs, I modified to reuse the client by
stuffing it in the process dictionary. That didn't make a difference for me.
13:33 <sfalcon> How would I cleanup a client?
13:34 <dizzyd> i would do something like
13:34 <dizzyd> stop(Pid) ->
13:34 <dizzyd> Mref = erlang:monitor(process, Pid),
13:34 <dizzyd> receive
13:34 <dizzyd> {'DOWN', Mref, _, _, _} -> ok
13:34 <dizzyd> end.
13:34 <dizzyd> er
13:34 <dizzyd> after Mref...
13:34 <dizzyd> kill(Pid, normal).
13:35 <* dizzyd> fails at coding apparently
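
Roughly what that snippet was aiming for, reordered so the monitor is set up before the client is told to stop (a sketch; exit/2 is used with a non-normal reason because a plain 'normal' exit signal is ignored by a process that isn't trapping exits, and later client releases also expose riakc_pb_socket:stop/1, which is cleaner where available):

    stop(Pid) ->
        Mref = erlang:monitor(process, Pid),
        exit(Pid, shutdown),
        receive
            {'DOWN', Mref, _, _, _} -> ok
        end.
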
13:36 <sfalcon> ok, that's useful. Once the test ends and the vm running the
test shuts down, those sockets are cleaned of course...
13:36 <sfalcon> (I think)
13:36 <dizzyd> sure
13:37 <sfalcon> So the "works for you, not for me" difference seems to point to my local
riak cluster install?
13:37 <dizzyd> yup
13:37 <dizzyd> i'm just running a single riak node
13:37 <dizzyd> fwiw
13:38 <sfalcon> I'm running on OS X SL, cloned riak via git, did make and make devrel.
Started dev1-3 and joined them. Client is always talking to dev1 node.
13:38 <sfalcon> Can you try with a 3-node cluster?
13:38 <dizzyd> sure
13:39 <dizzyd> wait a sec
13:39 <dizzyd> your riak_ip is set to 127.0.0.1
13:39 <dizzyd> are you running dev1 on loopback?
13:39 <dizzyd> n/m
13:39 <dizzyd> i'm just confused
13:40 acts_as_ joined
13:41 <dizzyd> hah
13:41 <dizzyd> tr: rt_3_test...*failed*
13:41 <dizzyd> ::error:{assertMatch_failed,[{module,tr},
13:41 <dizzyd> {line,66},
13:41 <dizzyd> {expression,"get_stuff ( Pid , Id )"},
13:41 <dizzyd> {expected,"{ stuff , [ [ x , y , z ] ] }"},
13:41 <dizzyd> {value,{error,notfound}}]}
13:41 <dizzyd> in function tr:'-rt_3_test/0-fun-0-'/1
13:41 <dizzyd> =======================================================
13:41 <dizzyd> Failed: 1. Skipped: 0. Passed: 2.
13:41 <dizzyd> is that the error you see?
13:41 <sfalcon> dizzyd: yup. I get that failure ~70% or more of the time, but if you run a
few times you will see successes too.
13:43 <dizzyd> hmmm
13:43 <dizzyd> ok, i'm on it
13:51 <sfalcon> dizzyd: Thanks for looking. Any ideas or things I can help investigate?
13:55 <dizzyd> not at the moment
13:59 <dizzyd> ah
13:59 <dizzyd> i think i know what's going on
13:59 <dizzyd> you specify a {w, 3} on puts
14:00 <dizzyd> but not the same on delete
14:00 <dizzyd> so, theoretically
14:00 <dizzyd> the delete returns before all the nodes have acknowledged it
14:02 <dizzyd> yeah
14:02 <dizzyd> welcome to the land of inconsistency
14:02 <dizzyd> the quorum on your deletes is less than that of your writes
14:02 <dizzyd> if you set [{rw, 3}]
14:02 <dizzyd> on your delete
14:02 <dizzyd> things will work as expected
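
In client terms, the suggestion is to give the delete the same quorum strictness as the put (a sketch; bucket and key are illustrative):

    {ok, Pid} = riakc_pb_socket:start("127.0.0.1", 8081),
    ok = riakc_pb_socket:delete(Pid, <<"b">>, <<"k1">>, [{rw, 3}]),
    ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k1">>, <<"v1">>), [{w, 3}]).
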
14:03 <sfalcon> dizzyd: but if the delete didn't go through then wouldn't I get back data
instead of notfound?
14:04 <dizzyd> bah
14:04 <* dizzyd> discards his theory
14:05 <dizzyd> setting it helps
14:05 <dizzyd> but not completely
14:05 <dizzyd> something wonky is going on here
14:05 <sfalcon> I started using just the defaults in the pb client. Then added the {w, 3}
while trying to understand what was happening.
14:05 <dizzyd> i'm afraid you've got the Friday Afternoon Dizzy™
14:06 <sfalcon> 8-)
14:08 <sfalcon> Wonder if I'm running into N=3, but not always onto 3 separate real nodes
14:12 <sfalcon> be back in a few... going to grab some lunch
15:07 <dizzyd> sfalcon: ping
15:07 <sfalcon> dizzyd: pong
15:07 <dizzyd> so...
15:08 <dizzyd> what's happening is that the deletion is mildly async
15:08 <dizzyd> thus, the return from delete CAN occur before the deletion is complete
15:08 <dizzyd> and the put MAY run before the completion
15:08 <dizzyd> so you wind up with
15:08 <sfalcon> oooh, that's a good one :-P
15:08 <dizzyd> delete (pt 1) / put / delete (pt 2) / get
15:09 <dizzyd> it is intentional though
15:09 <dizzyd> but that doesn't make it Right
15:09 <sfalcon> certainly... subtle
15:10 <benblack> i think crucially hit the same issue
15:12 <seancribbs> benblack: he did
15:12 <seancribbs> we have it documented
15:13 <dizzyd> seancribbs: do tell, where?
15:14 <seancribbs> i would tell you if my 3G weren't dog slow in the airport. one min
15:14 <dizzyd> heh, thx
15:17 allen_ joined
15:20 <sfalcon> dizzyd: thanks for helping to track that down. I can easily work
around that as I was only doing deletes to have a clean test environment. Will be interested
to read up on what's been documented and to understand more.
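
The workaround mentioned above, sketched out: rather than deleting to get a clean slate, use a key that is unique per test run so the delete/put interleaving never comes into play (the key scheme here is made up):

    unique_key(Base) ->
        {Mega, Sec, Micro} = erlang:now(),
        iolist_to_binary([Base, "-",
                          integer_to_list(Mega),
                          integer_to_list(Sec),
                          integer_to_list(Micro)]).
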
15:20 seancribbs joined
15:21 <seancribbs> dizzyd: https://issues.basho.com/show_bug.cgi?id=260
15:21 <dizzyd> thx
15:30 <* dizzyd> ponders
15:30 <dizzyd> i would argue that a client should always submit a vclock when doing a delete
15:30 <dizzyd> otherwise you get weird behaviour like this
15:31 <seancribbs> dizzyd: +1
15:31 <seancribbs> that is, if you want to care about it
15:31 <seancribbs> right now the delete resource does it on behalf of the client
15:31 <seancribbs> it has lots of race-condition potential
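
The idea being floated, as a sketch: fetch first, take the object's vector clock, and hand it to the delete so only the version the client has actually seen is removed. Note the client version discussed here may not expose a vclock-carrying delete; later riak-erlang-client releases do (delete_vclock/4), which is what this assumes:

    delete_seen(Pid, Bucket, Key) ->
        case riakc_pb_socket:get(Pid, Bucket, Key) of
            {ok, Obj} ->
                VClock = riakc_obj:vclock(Obj),
                riakc_pb_socket:delete_vclock(Pid, Bucket, Key, VClock);
            {error, notfound} ->
                ok
        end.
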
15:32 <dizzyd> i think if you _don't_ care about it you lose data
15:32 <dizzyd> and that's not acceptable
15:32 <dizzyd> consider
15:32 <dizzyd> network split
15:32 <dizzyd> one side does a delete
15:32 <dizzyd> other doesn't
15:32 <dizzyd> with our current setup
15:32 <dizzyd> hum
15:32 <dizzyd> sticky
15:32 <seancribbs> dizzyd: right but that wouldn't read-repair-reap
15:32 <seancribbs> well, depending on when the split happens
15:32 <dizzyd> yes
15:33 <dizzyd> cardinal rule of race conditions: if they _can_ occur, they will occur in production.
15:33 <seancribbs> indeed
15:34 <seancribbs> sounds like we need some state machine QC on that
15:34 <* seancribbs> waves his QC flag
15:34 <dizzyd> you'd really need to use QC and Pulse together
15:34 <dizzyd> to reproduce that one
15:34 <seancribbs> right
15:34 <dizzyd> since you'll have to monkey with scheduling
15:35 <seancribbs> i suppose it depends on how you mock the vnode master
15:35 <seancribbs> we got away with no statem on the get_fsm
15:35 <seancribbs> but this is multiple fsms i guess
15:35 <dizzyd> yup
15:35 <dizzyd> it'd be a tricky test…my fav kind :)
15:35 <seancribbs> yay