Skip to content

Instantly share code, notes, and snippets.

@teju85
Last active April 5, 2020 22:27
Show Gist options
  • Save teju85/0c9a16b8ccf31d6155211d0e276672a3 to your computer and use it in GitHub Desktop.
Save teju85/0c9a16b8ccf31d6155211d0e276672a3 to your computer and use it in GitHub Desktop.
Simulate the number of tests needed if we decide to "pool" the samples
// Compilation and running:
// g++ -std=c++11 covid-pooled-testing.cpp && ./a.out
#include <cstdio>
#include <stdint.h>
#include <cmath>
#include <cstdlib>
#include <algorithm>
/**
* @brief Computes number of ones in the input number
* @param num the input number
* @return the number of ones
*/
int __popcnt64(uint64_t num) {
int cnt = 0;
while (num > 0) {
cnt += num & 0x1;
num >>= 1;
}
return cnt;
}
/**
* @brief Computes the probability of the occurence of a particular combination of
* bits in the input number
* @param num input number
* @param prob probability of a bit having a value 1
* @param poolSize max number of bits to consider in the input number
* @return the probability
*
* @note this assumes IID across the bits!
*/
double probability(uint64_t num, double prob, int poolSize) {
auto numPositive = __popcnt64(num);
return std::pow(prob, numPositive) *
std::pow(1.0 - prob, poolSize - numPositive);
}
/**
* @brief total number of tests needed to ascertain the results of each
* individual for a given infection pattern. ie. i-th bit in the input
* number having a value of 1 means that i-th individual is positive.
* @param num input number containing the infection pattern
* @param poolSize number of individuals in the pool to be tested for
* @return total number of tests needed
*/
int numTestsNeeded(uint64_t num, int poolSize) {
if (num == 0 || (num == 1 && poolSize == 1)) return 1;
int ps = poolSize >> 1;
uint64_t psMask = (1ULL << ps) - 1ULL;
auto left = (num >> ps) & psMask;
auto right = num & psMask;
return numTestsNeeded(left, ps) + numTestsNeeded(right, ps) + 1;
}
/**
* @brief average number of tests needed to ascertain the results of a pool
* where averaging is done across the whole population and also across
* whole possibility of infection patterns in a pool. For a pool of size
* 4, there will be 2^4 patterns.
* @param prob probability of a person getting infected
* @param poolSize number of individuals in the pool to be tested for
* @return average number of tests needed per pool
*/
double averageTestsNeeded(double prob, int poolSize) {
auto totalTests = 0.0;
auto cnt = 1ULL << poolSize;
for (uint64_t i = 0LL; i < cnt; ++i) {
auto p = probability(i, prob, poolSize);
auto nTests = numTestsNeeded(i, poolSize);
totalTests += nTests * p;
}
return totalTests;
}
/**
* @brief average number of tests needed to ascertain the results of a pool
* when it is given that atleast one of the member in the pool is
* positive.
* @param prob probability of a person getting infected
* @param poolSize number of individuals in the pool to be tested for
* @return average number of tests needed per pool
*/
double averageTestsNeededIfAtleastOnePositive(double prob, int poolSize) {
auto totalTests = 0.0, totalProb = 0.0;
auto cnt = 1ULL << poolSize;
for (uint64_t i = 1LL; i < cnt; ++i) {
auto p = probability(i, prob, poolSize);
auto nTests = numTestsNeeded(i, poolSize);
totalProb += p;
totalTests += nTests * p;
}
return totalTests / totalProb;
}
/**
* @brief total number of *sequence* of tests needed to ascertain the results
* of each individual for a given infection pattern, when it is given
* that atleast one of them is positive.
* @param num input number containing the infection pattern
* @param poolSize number of individuals in the pool to be tested for
* @return total number of steps needed
*/
int numStepsNeeded(uint64_t num, int poolSize) {
if (num == 0 || (num == 1 && poolSize == 1)) return 1;
int ps = poolSize >> 1;
uint64_t psMask = (1ULL << ps) - 1ULL;
auto left = (num >> ps) & psMask;
auto right = num & psMask;
return std::max(numStepsNeeded(left, ps), numStepsNeeded(right, ps)) + 1;
}
/**
* @brief average number of *sequence* of tests needed to ascertain the
* results of a pool when it is given that atleast one of the member
* in the pool is positive.
* @param prob probability of a person getting infected
* @param poolSize number of individuals in the pool to be tested for
* @return average number of steps needed per pool
*/
double averageStepsNeededIfAtleastOnePositive(double prob, int poolSize) {
auto totalSteps = 0.0, totalProb = 0.0;
auto cnt = 1ULL << poolSize;
for (uint64_t i = 1LL; i < cnt; ++i) {
auto p = probability(i, prob, poolSize);
auto nTests = numStepsNeeded(i, poolSize);
totalProb += p;
totalSteps += nTests * p;
}
return totalSteps / totalProb;
}
constexpr auto TotalPopulation = 8000000000ULL;
void checkForOneCase(uint64_t totalInfected) {
const auto infectionProbability = totalInfected * 1.0 / TotalPopulation;
std::printf("Infection probability = %lf\n", infectionProbability);
for (int ps = 2; ps <= 16; ps <<= 1) {
auto nPosTests = averageTestsNeededIfAtleastOnePositive(
infectionProbability, ps);
auto nPosSteps = averageStepsNeededIfAtleastOnePositive(
infectionProbability, ps);
std::printf(" PoolSize=%2d avgTestsNeeded=%le avgPositiveTests=%le"
" avgStepsForPositives=%le\n", ps,
averageTestsNeeded(infectionProbability, ps), nPosTests,
nPosSteps);
}
}
int main(int argc, char** argv) {
for (uint64_t infected = 1000000ULL; infected <= 100000000ULL; infected += 10000000ULL)
checkForOneCase(infected);
return 0;
}
@chaithyagr
Copy link

@teju85 ,
As an extension to above, here is a python code for cases if we split the population to J subsegments: https://github.com/chaithyagr/sample_pool_testing/blob/master/split_to_j_subsets.ipynb

However, it's quite slow for now given all the loop and lack of vectorization. It follows quite similar principles as your code except, one major difference is that due to the speed, I rely on monte carlo simulation, which isnt exactly a good way to get accurate results.
I have added ways to add infection probablity per individual and also a way to kind off sort samples based on these probabilities.

Thoughts:

  1. Currently the code still just does hierachial method, or direct testing, based on how high J is, basically we could extend this to an array of J's at each level
  2. In real world we would never know the so called infection probablity of each indivtual, so the best way forward is to group spatially with respect to families and communities. For this I think https://github.com/3b1b/manim/blob/shaders/from_3b1b/active/sir.py is a reallly good simulation of how covid spreads, and maybe we can add some way to pool test into this based on coordinates. I would really recomment you to watch his visualizations, he is one of my favorite math simulator guys if thats a thing :P : https://www.youtube.com/watch?v=gxAaO2rsdIs&t=1185s . He tests literally all kinds of possibilies, with/out social distancing, if there are common points in communities, borders and barriers etc..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment