@richardgill
Last active September 13, 2020 20:42
Micro Covid

Hi Micro Covid team!

First of all - let me say that I think Micro Covid is amazing and your work is fantastic!

I helped build a similar tool https://shouldigooutnow.com/, which is a little bit simpler than Micro Covid. We built it over the last couple of weeks without any knowledge that Micro Covid existed. So, reading your whitepaper and looking at the code has been really interesting. You might find it interesting to read about our methodology - we've done a few things differently.

Our code is open source, here is our core risk model, we've tried to keep it as short and clear as possible.

I love what you're doing (which is why we did something similar!), I think your project has a greater chance of success than ours because it's easier to use, has more features and is friendlier, so I wanted to share some constructive feedback on areas to think about / that could be improved!

Feedback is hard. So, I'll begin by listing the things I love about Micro Covid!

Things we love

  • It's relatable and the work you've done to help normal people understand is brilliant. (Particularly the results section at the bottom!).
  • Your Whitepaper and the Q&A is the best one-stop, easily understandable resource about Covid spread I've seen.
  • Scenarios are fantastic - we thought about doing this, cool to see it done!
  • Having all the global prevalence data is excellent!
  • Inferring prevalence from the crappy stats which exist for the US is amazing.

Feedback

Caveat: I'm not a statistician. Thankfully Irfan, who worked on this with me, bailed me out! But we're both software engineers, not epidemiologists, so keep that in mind.

Each Person Risk * Person Count

The way you calculate person risk involves multiplying the risk that a person has covid by the number of people nearby. But this can (and does) create probabilities > 1.

e.g. if the average person risk is 1000 microcovids (0.001 probability) and you spend time near 1001 people, then 0.001 * 1001 = 1.001.

But probabilities can never be > 1.

I'm not sure if this matters in practice, but I think it's worth considering.

When we built our model we had a lot of Math.min(prob, 1) to cap probabilities at 1 - so we're very familiar with this problem!

To solve this we've used a binomial probability mass function, which we explain in our methodology. A good client-side JavaScript implementation that deals with large numbers was hard to find, so if you need this function - check out our code.
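As a rough illustration of the idea (not the code from either project), the simplest binomial case is "at least one of the people nearby is infected", which is 1 minus the binomial probability of zero infected people. The function name is made up for this sketch, and it assumes each person is infected independently with the same probability:

```javascript
// Probability that at least one of `count` people is infected, assuming each
// person is independently infected with probability `personRisk`.
// This equals 1 - binomialPmf(0, count, personRisk) = 1 - (1 - p)^n.
// Hypothetical helper for illustration only.
function probabilityAnyInfected(personRisk, count) {
  if (personRisk < 0 || personRisk > 1 || count < 0) {
    throw new RangeError("personRisk must be in [0, 1] and count must be >= 0");
  }
  if (count === 0) return 0;
  // log1p/expm1 keep precision when personRisk is tiny and count is large.
  return -Math.expm1(count * Math.log1p(-personRisk));
}

const naive = 0.001 * 1001; // about 1.001, not a valid probability
const bounded = probabilityAnyInfected(0.001, 1001); // roughly 0.63

console.log(naive, bounded);
```

Unlike the plain product, this stays between 0 and 1 however many people are involved.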

Activity Risk * Duration

The way you calculate activity risk multiplies a transmission risk by a duration, which could also result in a probability > 1, but you cap it at MAX_ACTIVITY_RISK=0.48.

We used exponential decay for this and explain it in our methodology.

This might allow you to remove the cap if you wanted to. Although I'm not sure if you want to.

In practice I'm not sure how much difference this would make.
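To get a feel for the size of the difference, here is a minimal sketch comparing the capped linear model with an exponential one. The hourly rate of 0.06 is an illustrative assumption, not a value taken from either project, and both function names are hypothetical:

```javascript
const MAX_ACTIVITY_RISK = 0.48; // the cap mentioned above
const HOURLY_RATE = 0.06; // illustrative assumption only

// Linear model with a hard cap: risk grows in proportion to time, then stops.
function cappedLinearRisk(ratePerHour, hours) {
  return Math.min(ratePerHour * hours, MAX_ACTIVITY_RISK);
}

// Exponential model: treats ratePerHour as a constant hazard, so the chance of
// staying uninfected decays as exp(-rate * hours).
function exponentialRisk(ratePerHour, hours) {
  return -Math.expm1(-ratePerHour * hours);
}

for (const hours of [1, 4, 8, 24]) {
  console.log(
    `${hours}h`,
    cappedLinearRisk(HOURLY_RATE, hours).toFixed(3),
    exponentialRisk(HOURLY_RATE, hours).toFixed(3)
  );
}
```

For short durations the two models are nearly identical. They only diverge for long exposures, where the exponential curve keeps rising toward 1 instead of stopping at MAX_ACTIVITY_RISK.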

Use randomized sample prevalence data if available

Randomly sampling the population seems like the best way to figure out the average probability someone in the population has Covid-19. Many people who get tested, get tested because they have reason to believe they might have Covid-19, which biases things.

You've brilliantly adjusted for this! However, some countries / regions (sadly not the US) have randomised sample testing. Here's that data collected in England by the UK government.

You could use this data directly instead if it's available. If you haven't already, you could also use this data to calibrate your adjustments from the raw testing data. For the UK your adjustments come really close to the randomly sampled data - which is fantastic!

Micro Covids and units

A pretty big difference is that you created a unit: 1 microCOVID = 1/1000000 = 0.000001. This makes reviewing your code a bit harder because you use Micro Covids in the intermediate steps rather than just converting at the end.

I think overall I'm pro Micro Covid units. Our model produces very small percentages a lot of the time, and tends to scare people when the output is large.

In other places you use units like: 123-in-a-million. The UK government uses a format we personally found easier to understand: 1 in 8130 or, rounding a bit, 1 in 8000. The code for that is here.
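As a sketch of what "convert only at the end" could look like, the snippet below keeps the risk as a plain probability and formats it in either style at display time. The helper names are invented for this example and are not taken from either codebase:

```javascript
const MICROCOVIDS_PER_UNIT_PROBABILITY = 1e6; // 1 microCOVID = 0.000001

// Convert a plain probability to microCOVIDs (display-time only).
function toMicroCovids(probability) {
  return probability * MICROCOVIDS_PER_UNIT_PROBABILITY;
}

// Format a probability as "1 in N", rounding N to the given number of
// significant figures (1 by default, so 8130 becomes 8000).
function formatOneInN(probability, significantFigures = 1) {
  if (!(probability > 0 && probability <= 1)) {
    throw new RangeError("probability must be in (0, 1]");
  }
  const n = Math.round(1 / probability);
  const exponent = Math.max(0, Math.floor(Math.log10(n)) - significantFigures + 1);
  const step = 10 ** exponent;
  return `1 in ${Math.round(n / step) * step}`;
}

const risk = 1 / 8130;
console.log(formatOneInN(risk, 4)); // "1 in 8130"
console.log(formatOneInN(risk)); // "1 in 8000"
console.log(`${Math.round(toMicroCovids(risk))}-in-a-million`); // "123-in-a-million"
```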

Thanks for building this, I'm open to helping with some Pull Requests to make some of these changes or other changes if you could use some extra help.

This kind of tool makes a lot of sense right now and I'm surprised there isn't more interest in it yet.

Thanks,

Richard

@beshaya
beshaya commented Sep 13, 2020

Thanks for the detailed feedback!

Some thoughts on your thoughts:

Each Person Risk * Person Count

Person risk is an expected value as opposed to a probability. If each person has a 1/1000 chance of having covid and there are 2000 people, the expected number of people with covid is 2. We believe that your chances of catching covid scale with the number of people who have covid - having 2 people in the room with covid doubles the amount of e.g. aerosols and droplets released into the air.

In other words, you could think of microCovids as a 1-in-a-million chance of having covid per person; the cap on microCovids in a social situation is the number of people.

We use a small-number approximation that allows us to add probabilities (a brilliant realization by @catherio and @oremanj). This breaks down around 10% risk, which is when switching to properly multiplying probabilities is necessary. However, we are fine not reporting numbers higher than 10% because that is already an incredible amount of risk to take in one go.
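A quick way to see where the approximation starts to drift is to compare adding risks with combining them exactly. This sketch assumes the individual risks are independent (an assumption of the example, not something stated above), and the function names are hypothetical:

```javascript
// Small-number approximation: just add the individual risks.
function addedRisk(risks) {
  return risks.reduce((total, risk) => total + risk, 0);
}

// Exact combination for independent risks: 1 minus the chance of avoiding all.
function combinedRisk(risks) {
  const chanceOfAvoidingAll = risks.reduce((acc, risk) => acc * (1 - risk), 1);
  return 1 - chanceOfAvoidingAll;
}

// How much the sum overstates the exact value, as a fraction of the exact value.
function relativeError(risks) {
  const exact = combinedRisk(risks);
  return (addedRisk(risks) - exact) / exact;
}

const small = Array(10).fill(0.0001); // ten activities of 100 microCOVIDs each
const large = Array(10).fill(0.01); // ten activities that sum to 10%

console.log(relativeError(small).toFixed(5)); // a small fraction of a percent
console.log(relativeError(large).toFixed(5)); // a few percent
```

Adding always errs on the high side, so the approximation overstates risk rather than understating it.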

Activity Risk * Duration

This is a really good point; transmission risk probably doesn't scale linearly with time. We have no data to suggest whether it is more or less than linear, though! It's possible that short interactions are actually safer than a linear model suggests (you simply don't have enough time to get an infectious dose in a short interaction). On the other hand, it clearly can't scale linearly forever (you'd get probabilities above 1, which is, as you note, nonsensical), but even before that there is strong research showing that transmission rates for prolonged close contact are below 1. We use 48% as the cap because this is the published transmission rate between spouses (who we figure are sleeping together and sharing space in close proximity for many hours a day). In our own community, we cap non-intimate interactions at 30% (the published transmission rate for household members).

"For the UK your adjustments come really close to the randomly sampled data - which is fantastic!"

Thanks for this sanity check! Prevalence rates are the biggest uncertainty we have in our data. We're confident to about a factor of 3x in all our other numbers, but poor testing in the States makes us nervous. Seeing that our correction works in the UK is fantastic.

I think you're right in that using data from random trials would improve our data quality. However, as our team is based in the United States, I think pulling in data from the UK is unlikely to be a priority for us. If you'd like to open a PR to pull this data we'd be happy to review it and bring it in 😄

Micro Covids and units

Glad you like the units! We're really happy with how that worked out. Your point of using factors of 1e6 all over the codebase for intermediate steps is super valid; definitely a place for a potential refactor (or at least additional documentation) as we'd love to make it easy for new people to hop into the project.

Again, thanks so much for the detailed thoughts on our project. We're super flattered that you've taken such a deep look at our project, and it's an absolute privilege to swap notes with other folks who have dug deep into the numerics of the pandemic.
