Configuration parameters for a topic:
// P1
TimeInMeshWeight float64
TimeInMeshQuantum time.Duration
TimeInMeshCap float64
// P2
FirstMessageDeliveriesWeight, FirstMessageDeliveriesDecay float64
FirstMessageDeliveriesCap float64
// P3
MeshMessageDeliveriesWeight, MeshMessageDeliveriesDecay float64
MeshMessageDeliveriesCap, MeshMessageDeliveriesThreshold float64
MeshMessageDeliveriesWindow, MeshMessageDeliveriesActivation time.Duration
// P3b
MeshFailurePenaltyWeight, MeshFailurePenaltyDecay float64
// P4
InvalidMessageDeliveriesWeight, InvalidMessageDeliveriesDecay float64
Counters:
meshTime time.Duration
firstMessageDeliveries float64
meshMessageDeliveries float64
meshFailurePenalty float64
invalidMessageDeliveries float64
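To make the roles of these parameters and counters concrete, here is a minimal sketch of how they could combine into a per-topic score. It is an illustration, not a reference implementation: `TopicParams`, `Counters`, `TopicScore`, `exampleParams` and `exampleCounters` are hypothetical names. The sketch assumes the peer is in the mesh, ignores any overall topic weight, and takes the P3 and P4 terms as squared values. The numbers in `exampleParams` are placeholders, not recommended settings.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// TopicParams holds the subset of the parameters above that the score reads
// at a single instant (decay factors and the delivery window are left out).
type TopicParams struct {
	TimeInMeshWeight, TimeInMeshCap                          float64
	TimeInMeshQuantum                                        time.Duration
	FirstMessageDeliveriesWeight, FirstMessageDeliveriesCap  float64
	MeshMessageDeliveriesWeight                              float64
	MeshMessageDeliveriesThreshold                           float64
	MeshMessageDeliveriesActivation                          time.Duration
	MeshFailurePenaltyWeight, InvalidMessageDeliveriesWeight float64
}

// Counters mirrors the per-peer, per-topic counters listed above.
type Counters struct {
	meshTime                                      time.Duration
	firstMessageDeliveries, meshMessageDeliveries float64
	meshFailurePenalty, invalidMessageDeliveries  float64
}

// TopicScore is a hypothetical helper: it assumes the peer is in the mesh
// and ignores any overall topic weight.
func TopicScore(p TopicParams, c Counters) float64 {
	// P1: time in mesh, measured in quanta and capped.
	p1 := math.Min(float64(c.meshTime)/float64(p.TimeInMeshQuantum), p.TimeInMeshCap)
	// P2: first message deliveries, capped.
	p2 := math.Min(c.firstMessageDeliveries, p.FirstMessageDeliveriesCap)
	// P3: squared deficit below the threshold, once the activation time has passed.
	p3 := 0.0
	if c.meshTime > p.MeshMessageDeliveriesActivation &&
		c.meshMessageDeliveries < p.MeshMessageDeliveriesThreshold {
		d := p.MeshMessageDeliveriesThreshold - c.meshMessageDeliveries
		p3 = d * d
	}
	// P3b: sticky mesh failure penalty. P4: squared invalid message counter (assumption).
	p3b := c.meshFailurePenalty
	p4 := c.invalidMessageDeliveries * c.invalidMessageDeliveries
	return p.TimeInMeshWeight*p1 +
		p.FirstMessageDeliveriesWeight*p2 +
		p.MeshMessageDeliveriesWeight*p3 +
		p.MeshFailurePenaltyWeight*p3b +
		p.InvalidMessageDeliveriesWeight*p4
}

// exampleParams returns placeholder values chosen only to exercise TopicScore.
func exampleParams() TopicParams {
	return TopicParams{
		TimeInMeshWeight: 0.01, TimeInMeshCap: 10, TimeInMeshQuantum: time.Second,
		FirstMessageDeliveriesWeight: 1, FirstMessageDeliveriesCap: 100,
		MeshMessageDeliveriesWeight: -1, MeshMessageDeliveriesThreshold: 20,
		MeshMessageDeliveriesActivation: 30 * time.Second,
		MeshFailurePenaltyWeight:        -1, InvalidMessageDeliveriesWeight: -10,
	}
}

// exampleCounters describes a peer one minute into the mesh, slightly under threshold.
func exampleCounters() Counters {
	return Counters{meshTime: time.Minute, firstMessageDeliveries: 50, meshMessageDeliveries: 15}
}

func main() {
	fmt.Printf("topic score: %.2f\n", TopicScore(exampleParams(), exampleCounters()))
}
```

With the placeholder values, the capped time in mesh contributes 0.1, the first message deliveries contribute 50, and the deficit of 5 contributes -25.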
We consider a DecayInterval of 1s and a DecayToZero of 0.01 (1 part in 100).
If you use a longer decay interval, you'll have to adjust the values accordingly.
Let's consider an aggregate message rate of R = 120m/s.
Peers in a mesh with D=6 members have an expected first message delivery rate of Rₚ = 20m/s.
If the mesh has D_hi=12 members, we have an expected first message delivery rate of Rₚ' = 10m/s.
We'll be conservative and consider 10m/s for the expected mesh message delivery rate.
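The per-peer rates are just the aggregate rate spread evenly over the mesh members. A small sketch of that arithmetic follows; `expectedPerPeerRate` is a hypothetical helper, and the even split is the simplifying assumption used in this example.

```go
package main

import "fmt"

// expectedPerPeerRate spreads an aggregate topic rate (messages per second)
// evenly over meshSize peers. The even split is an assumption of this example.
func expectedPerPeerRate(aggregateRate float64, meshSize int) float64 {
	return aggregateRate / float64(meshSize)
}

func main() {
	const r = 120.0                         // aggregate rate R
	fmt.Println(expectedPerPeerRate(r, 6))  // D = 6     -> 20
	fmt.Println(expectedPerPeerRate(r, 12)) // D_hi = 12 -> 10
}
```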
For the timeInMesh counter we use a quantum of 1s and cap it at 1hr of mesh time. So:
TimeInMeshQuantum = 1s
TimeInMeshCap = 3600
The expected value of the firstMessageDeliveries counter, once in steady state, will be Σₙ FirstMessageDeliveriesDecayⁿ · Rₚ.
Let's say we want a first message delivery to decay in 2min; that's 120 DecayIntervals, so the firstMessageDeliveries counter will have to decay by 1 part in 120 per interval; that's .0083..., so we set
FirstMessageDeliveriesDecay = 0.9916
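The rule of thumb used here (lose 1 part in n per interval, where n is the number of decay intervals in the chosen window) can be sketched as follows. `decayForIntervals` is a hypothetical helper, not part of any library, and the result is unrounded, so it may differ from the values quoted in the text in the last digit.

```go
package main

import (
	"fmt"
	"time"
)

// decayForIntervals returns the per-interval decay factor that removes
// 1 part in n of the counter at every decay interval.
func decayForIntervals(n float64) float64 {
	return 1 - 1/n
}

func main() {
	const decayInterval = time.Second
	// Number of decay intervals in a 2 minute window.
	n := float64(2*time.Minute) / float64(decayInterval)
	fmt.Printf("intervals: %.0f, decay: %.6f\n", n, decayForIntervals(n))
}
```

The same helper applies to any other decay window by changing the duration passed in.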
At steady state, the expected value of firstMessageDeliveries is ~75 * Rₚ, which gives us a range of 750 to 1500. We set a cap of 1500:
FirstMessageDeliveriesCap = 1500
The expected value of meshMessageDeliveries is similarly Σₙ MeshMessageDeliveriesDecayⁿ · Rₚ.
We want the counter to decay within the activation window so that we get an EWMA of mesh message deliveries. For a responsive mesh, an activation of 30s sounds reasonable, which necessitates a decay of 1 part in 30 per interval; that's .03..., so:
MeshMessageDeliveriesActivation = 30s
MeshMessageDeliveriesDecay = 0.97
At steady state, the expected value of meshMessageDeliveries is ~20 * Rₚ, which gives us a range of 200 to 400. We'll set the threshold to 200 with a cap of 400:
MeshMessageDeliveriesThreshold = 200
MeshMessageDeliveriesCap = 400
At this point we are in a position to select weights.
We calibrate the weights so that falling to (say) 50% of the mesh delivery threshold cancels out the first message deliveries and time in mesh, dropping the score to 0, so that the peer is pruned from the mesh if it drops any lower.
So at steady state, we would have a deficit of 100, with a squared value of 10000, and a firstMessageDeliveries counter at 75 * 5 = 375 (half the conservative expected rate).
This prescribes:
10000 * MeshMessageDeliveriesWeight + 375 * FirstMessageDeliveriesWeight + TimeInMeshWeight * 3600 = 0
Let's normalize the time in mesh to 1, which gives us
TimeInMeshWeight = 1/3600 = 0.00028
10000 * MeshMessageDeliveriesWeight + 375 * FirstMessageDeliveriesWeight + 1 = 0
Now let's also consider the maximum value of the squared mesh message delivery deficit: that's 200² = 40000. If we bound the negative score from mesh failure to, say, -1000, that gives us
MeshMessageDeliveriesWeight = -1000/40000 = -0.025
FirstMessageDeliveriesWeight = 249/375 = 0.664
We'll keep the same weight for the sticky P3b penalty, and decay that in 5min; so:
MeshFailurePenaltyWeight = -0.025
MeshFailurePenaltyDecay = 0.997
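The constraints in this step form a small linear system. A minimal sketch that solves it from the raw inputs (threshold, calibration point, counter value, cap, and penalty bound), so that the squared deficits are computed by the program rather than by hand, could look like this. `Weights` and `solveWeights` are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Weights holds the three weights solved for in this step.
type Weights struct {
	TimeInMesh, FirstMessageDeliveries, MeshMessageDeliveries float64
}

// solveWeights is a hypothetical helper.
//
//	threshold:     MeshMessageDeliveriesThreshold
//	fraction:      share of the threshold at which the score should reach 0
//	fmdCounter:    firstMessageDeliveries counter at that calibration point
//	timeInMeshCap: TimeInMeshCap
//	maxPenalty:    most negative P3 score allowed (a negative number)
func solveWeights(threshold, fraction, fmdCounter, timeInMeshCap, maxPenalty float64) Weights {
	var w Weights
	// Normalize the capped time-in-mesh contribution to 1.
	w.TimeInMesh = 1 / timeInMeshCap
	// The largest possible deficit equals the threshold; its square times
	// the weight must equal maxPenalty.
	w.MeshMessageDeliveries = maxPenalty / (threshold * threshold)
	// At the calibration point:
	//   deficit² * meshWeight + fmdCounter * firstWeight + 1 = 0
	deficit := threshold * (1 - fraction)
	w.FirstMessageDeliveries = -(deficit*deficit*w.MeshMessageDeliveries + 1) / fmdCounter
	return w
}

func main() {
	w := solveWeights(200, 0.5, 375, 3600, -1000)
	fmt.Printf("TimeInMeshWeight             = %.5f\n", w.TimeInMesh)
	fmt.Printf("MeshMessageDeliveriesWeight  = %.4f\n", w.MeshMessageDeliveries)
	fmt.Printf("FirstMessageDeliveriesWeight = %.4f\n", w.FirstMessageDeliveries)
}
```

Keeping the inputs explicit makes it easy to re-run the calibration for a different threshold or penalty bound.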
At this point we have a good constraint for the gossipThreshold: it has to be <= -1000 so that a peer booted from the mesh and stuck with the sticky penalty still receives gossip in the absence of other misbehaviour.
We also need to pick a weight and decay for invalid messages. Suppose an invalid message decays in 30min and we want (say) 10 invalid messages to cancel out the cap of first message deliveries. That gives us:
InvalidMessageDeliveriesDecay = .9994
10 * InvalidMessageDeliveriesWeight + 1500 * 0.664 = 0 =>
InvalidMessageDeliveriesWeight = -99.6
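The same cancellation can be written as a one-line helper. This is a sketch under the assumption made in the equation above, namely that the P4 term at the calibration point has the value 10. `invalidWeight` and its parameter names are hypothetical; if the P4 term is taken as the square of the counter, pass the squared count as `penaltyUnits` instead.

```go
package main

import "fmt"

// invalidWeight returns the weight at which penaltyUnits of the P4 term
// cancel a first-message-deliveries counter sitting at its cap.
func invalidWeight(fmdCap, fmdWeight, penaltyUnits float64) float64 {
	return -(fmdCap * fmdWeight) / penaltyUnits
}

func main() {
	// Cap of 1500, FirstMessageDeliveriesWeight of 0.664, P4 term of 10.
	fmt.Printf("%.1f\n", invalidWeight(1500, 0.664, 10)) // prints -99.6
}
```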
Here we summarize the weights we have configured for our topic with an expected rate of 120 messages/s:
// P1
TimeInMeshWeight = 0.00028
TimeInMeshQuantum = time.Second
TimeInMeshCap = 3600
// P2
FirstMessageDeliveriesWeight = 0.664
FirstMessageDeliveriesDecay = 0.9916
FirstMessageDeliveriesCap = 1500
// P3
MeshMessageDeliveriesWeight = -0.025
MeshMessageDeliveriesDecay = 0.97
MeshMessageDeliveriesCap = 400
MeshMessageDeliveriesThreshold = 200
MeshMessageDeliveriesActivation = 30 * time.Second
// P3b
MeshFailurePenaltyWeight = -0.025
MeshFailurePenaltyDecay = 0.997
// P4
InvalidMessageDeliveriesWeight = -99.6
InvalidMessageDeliveriesDecay = .9994