Fig. 3 in the UPaxos paper is not the general situation, since it sets {leader, a1} = qIe = qIe+1 and {leader, a2} = qIIe = qIIe+1.
Let's split them up and say N1=6, qIe={1, 2, 3, 4, 5}, qIIe={5, 6}, Leader=5.
(Diagram: qIe covers nodes 1, 2, 3, 4 and 5; qIIe covers nodes 5 and 6; node 5 is the leader (L).)
Then we move from e to e+1, say with N2=3, qIe+1={5, 7}, qIIe+1={5, 8}, so that qIe and qIIe+1 intersect.
(Diagram: qIe covers nodes 1, 2, 3, 4 and 5; qIIe covers nodes 5 and 6; qIe+1 covers nodes 5 and 7; qIIe+1 covers nodes 5 and 8.)
What happens when the cluster enters split mode? According to the UPaxos paper:
"While a change from era e to e + 1 is in progress some instances may use the interim configuration <QIe, QIIe+1>"
Leader 5 should write client commands to the qIIe+1 quorums, which would be {8}, even though the e(b) of node 8 is not known yet.
After the reconfiguration is fixed at slot s+1, leader 5 crashes before it sends a b` prepare to qIe+1.
Then {1, 2, 3, 4, 6} would elect a leader, let's say 6. The new leader 6 would then join {7, 8} and would find all the client commands.
What if the nodes in q(I, e+1) try to elect a leader when the leader is down?
Let's say N2=4, qIe+1={5, 7, 8}, qIIe+1={5, 9}.
(Diagram: qIe covers nodes 1, 2, 3, 4 and 5; qIIe covers nodes 5 and 6; qIe+1 covers nodes 5, 7 and 8; qIIe+1 covers nodes 5 and 9.)
After the reconfiguration is fixed at slot s+1, leader 5 crashes before it sends a b` prepare request to qIe+1.
Nodes {7, 8, 9} could elect a new leader because the number of nodes in e+1 meets the size required for q(I, e+1), which is 3.
On the other side, {1, 2, 3, 4, 6} would elect 6 as leader, and 6 would see the reconfiguration in slot s+1 and join {7, 8, 9}. In the end we get two leaders.
And you say:
"Once the leader knows the new era is fixed at slot s+1 it needs to upgrade to a fresh ballot number in the new era. It does this by having a majority of nodes promise to a new ballot number in the new era."
and
"It does this by acting last within the prepare quorum. It defers from making a promise to its own new ballot number until it knows that sufficient other nodes have also made promises to give it the 'casting vote.'"
In this situation a majority does not help, so how is the "casting vote" constructed?
Hi there, thanks for taking an interest in UPaxos, and thanks so much for the improved formatting here. I'll see what I can do to help.
Your first question doesn't really typecheck:
These QIe things are sets of sets of nodes, but you have declared them to be sets of nodes. The resilience of Paxos comes from having the freedom to use lots of different subsets of the nodes for each phase. I can think of a few different interpretations, but the meaning of your question depends a great deal on which fix is the one you intended. Could you adjust this to remove the ambiguity? For instance, you may mean
However this isn't valid because it does not satisfy QIe ⌢ QIIe since e.g. {1,2,3} ∩ {5,6} = ∅.
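The intersection requirement can be checked mechanically once each Q is written as a set of sets. The following is a minimal sketch, not code from the paper: the function names and the two example families `q1_e` and `q2_e` are hypothetical, chosen only so that they contain the pair {1,2,3} and {5,6} mentioned above.

```python
from itertools import product


def non_intersecting_pairs(phase1_quorums, phase2_quorums):
    """List every (phase-I quorum, phase-II quorum) pair with no node in common."""
    return [
        (q1, q2)
        for q1, q2 in product(phase1_quorums, phase2_quorums)
        if not (q1 & q2)
    ]


def quorums_intersect(phase1_quorums, phase2_quorums):
    """True only if every phase-I quorum shares a node with every phase-II quorum."""
    return not non_intersecting_pairs(phase1_quorums, phase2_quorums)


# Hypothetical quorum families (assumptions for illustration only).
q1_e = [frozenset({1, 2, 3}), frozenset({1, 2, 3, 4, 5})]
q2_e = [frozenset({5, 6})]

print(quorums_intersect(q1_e, q2_e))
print(non_intersecting_pairs(q1_e, q2_e))
```

Listing the offending pairs, rather than only returning a boolean, makes it easier to see which quorum needs to change.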
For your second question:
Again the same ambiguity, but there's another misunderstanding here too. It's not the number of nodes that matters; the identities of the nodes are what's important. Some sets of 3 nodes may be a quorum while other sets of 3 nodes are not. I think I'll be able to answer more clearly if you adjust your question so that all the Qs are sets of sets of nodes.
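To illustrate why identities matter more than counts, here is a small sketch under one possible reading of the question, namely that the phase-I quorum system of era e+1 contains the single quorum {5, 7, 8}. That reading is an assumption, as are the name `is_quorum` and the convention that any superset of a listed quorum also counts as a quorum.

```python
def is_quorum(nodes, quorum_system):
    """True if `nodes` contains at least one quorum of `quorum_system`.

    Assumption: supersets of a listed quorum are also treated as quorums.
    """
    node_set = set(nodes)
    return any(q <= node_set for q in quorum_system)


# Hypothetical reading of the question: one phase-I quorum in era e+1.
q1_next = [frozenset({5, 7, 8})]

print(is_quorum({5, 7, 8}, q1_next))  # three nodes, and they form a quorum
print(is_quorum({7, 8, 9}, q1_next))  # also three nodes, but node 5 is missing
```

Under this reading, {7, 8, 9} has the right size but is not a phase-I quorum, so those three nodes alone could not complete phase I for a new leader.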
This is described in detail on page 6 of the paper. In particular note that a node with a casting vote does not always exist, especially if some nodes have failed or you have chosen the quorums badly. If there's no node with a casting vote then the algorithm still makes progress, but it will have to stop accepting proposals for the short period of time until the reconfiguration is complete.