Skip to content

Instantly share code, notes, and snippets.

@jeancroy
Last active December 12, 2023 20:12
Show Gist options
  • Save jeancroy/020c0a7990d636368241d9f077ce0d87 to your computer and use it in GitHub Desktop.
Save jeancroy/020c0a7990d636368241d9f077ce0d87 to your computer and use it in GitHub Desktop.
Basic Statistic
P(Success) = 1/N
Add a corrector, Make it difference between theorical probability and observation
P(Success) = 1/N + Corrector
P(Success) = 1/N + (1/N - SucessCount / TotalCount )
Adjust for initial phase:
Prevent Division per zero & reduce the strength for small count
P(Success) = 1/N + ( 1/N - ( SucessCount + 1) / ( TotalCount + N) )
= 2/N - ( SucessCount + 1) / ( TotalCount + N)
Verify behavior when TotalCount = 0 :
( 1/N - (0 + 1) / ( 0 + N) ) = 1/N - 1/N, no corrector active.
Verify behavior when SucessCount/TotalCount is a multiple of 1/N, SucessCount/TotalCount = k/(k*N)
( SucessCount + 1) / ( TotalCount + N) =
( k + 1) / ( k * N + N) =
1 * (k + 1) / ( N * (k + 1) ) =
1/N, => no corrector active.
---------------------------------------------
Alternative, using exponential window.
P(S) = 1/N + Corrector
P(S) = 1/N + (1/N - SucessCount / TotalCount )
P(S) = 1/N + (1/N - WindowedSuccessRate )
Note that SucessCount / TotalCount = SucessRate
Init WindowedSuccessRate = 1/N
Update WindowedSuccessRate += K*( I(S) - WindowedSuccessRate), Where I(S) is 1 on success, 0 on fail
Choose K so after 10*N samples, influence of initial value is 1% ?
@jeancroy
Copy link
Author

jeancroy commented Oct 15, 2020

Generate success with target probability = 1/N.
Use a modified target so it can tend to 0 or 1 as the observed result get far.
(reduce variance)

Only do positive correction:
p = target+ I(error>0) * (1 - target) * tanh( tau*error )

Both side correction:
p = target + ( I(error>0) - target) * tanh( tau*abs(error) )

// Init
state. avg = target 
// Loop
avg = state.avg


// Compute error in term of relative N off, where N~1/target.
error = 1.0 - target/avg

// adjust range
range =  ( error > 0 ? 1.0 - target : target )

// Update adjustment
adjusted  = target + range * tanh(3*error)

// Is a success ?
success = (Math.random()<=adjusted )

// Moving average of 1 and 0
indicator =  success ? 1.0 : 0.0
K = 1.0 - pow( 0.01, 0.1*target ) // could be cached, depend only on target.
avg += K*( indicator - avg) 

state.avg = avg;

return success 

tanh(x) = ( 1.0 - 2.0 / ( 1.0 + exp( 2.0*x) ) )

choose K:
How many failure need to be observed so the inputs
1 0 0 0 0 0 ... 0
become small enough ?

repetition => beta*N, N =1/target.
small => alpha

 (1-K)^(beta*N) == alpha
 K = 1-alpha^(1/(beta*N))
 K = 1-alpha^(target/beta)
 K = 1 - 0.01^(target/10)

choose tau a number [1,6]
tau = 1 or 3 ?

// Compute error in term of relative N off, where N~1/target.
// ( 1/target - 1/avg ) / (1/target) = target* ( 1/target - 1/avg ) = 1 - target/avg
// as avg get to 0, error become large and P(sucess) get close to 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment