@lbradstreet
Created March 29, 2016 11:23
ABS Peer subscriber # rationale
Three tasks:
A -> B -> C
Two peers on B.
A publishes to stream subscribed by both B peers:
Barrier 1, m1, m2, m3, m4, m5, Barrier 2
B P1 reads:
Barrier 1, m1, m2, m3
B P2 reads:
Barrier 1, m4, m5, Barrier 2
B P1 reads:
Barrier 2
If only one barrier is going to be emitted per peer, then we need to synchronise when the new barrier is emitted to C.
i.e. we can’t have P1 process
Barrier 1, m1, m2, m3
then P2 emit
m4
then P1 emit Barrier 2 while P2 still has m5 to process.
We need to make sure that both peers have processed everything between the barriers first. If we synchronise the outputs, then the peer on C only needs one subscriber per B task / host combination. It will still need a subscriber for every other host that has a B task, which means we still have a lot of subscribers if the B peers are spread over multiple hosts.
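A minimal sketch of the synchronised option, assuming a hypothetical per-task coordinator (the name BarrierSync and its methods are illustrative, not an existing API). Messages are forwarded as they are processed, but a barrier is only emitted to C once every B peer has reached it:

```python
class BarrierSync:
    """Hypothetical coordinator for the peers of one task (here: B)."""

    def __init__(self, peers):
        self.peers = set(peers)
        self.arrived = {}  # barrier id -> peers that have reached it
        self.output = []   # what the downstream task (C) would see

    def on_message(self, peer, msg):
        # Messages are forwarded as soon as a peer has processed them.
        self.output.append(msg)

    def on_barrier(self, peer, barrier_id):
        # Record the arrival; emit the barrier once, when all peers are there.
        reached = self.arrived.setdefault(barrier_id, set())
        reached.add(peer)
        if reached == self.peers:
            self.output.append(f"Barrier {barrier_id}")
            del self.arrived[barrier_id]


sync = BarrierSync(["P1", "P2"])
sync.on_barrier("P1", 1)
sync.on_barrier("P2", 1)
for m in ["m1", "m2", "m3"]:
    sync.on_message("P1", m)
sync.on_message("P2", "m4")
sync.on_barrier("P1", 2)   # P1 is done, but P2 is not: nothing is emitted yet
sync.on_message("P2", "m5")
sync.on_barrier("P2", 2)   # last peer arrives: Barrier 2 goes out once
print(sync.output)
```

The sketch does not stop a peer from forwarding messages that belong after a barrier it has already reached; a real implementation would have to hold those back until the barrier is emitted.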
If we don’t synchronise, then P1 and P2 will each emit their own Barrier 2 once they reach it. With this solution, you will need a subscription for each upstream peer.
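For the unsynchronised option, the peer on C has to do the alignment itself. A rough sketch, assuming one subscription per upstream B peer and a hypothetical BarrierAligner helper (the name and peer ids are illustrative): a barrier only counts as complete once it has arrived on every subscription.

```python
class BarrierAligner:
    """Hypothetical downstream-side tracker with one entry per upstream peer."""

    def __init__(self, upstream_peers):
        self.upstream = set(upstream_peers)
        self.seen = {}  # barrier id -> upstream peers it has arrived from

    def on_barrier(self, peer, barrier_id):
        """Return True once barrier_id has arrived from every upstream peer."""
        seen = self.seen.setdefault(barrier_id, set())
        seen.add(peer)
        if seen == self.upstream:
            del self.seen[barrier_id]
            return True
        return False


aligner = BarrierAligner(["B-P1", "B-P2"])
first = aligner.on_barrier("B-P2", 2)   # only P2's Barrier 2 so far
second = aligner.on_barrier("B-P1", 2)  # P1's Barrier 2 completes it
```

The number of subscriptions on C grows with the number of upstream peers, which is the cost this option trades for not coordinating the B peers.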