@mike529
Last active June 17, 2020 20:54
Superspreader Model for COVID-19 Transmission

Questions arising from COVID Data:

Over the course of the COVID-19 outbreak, many features of the data have not seemed to make sense.

In this article I will explain how the standard SEIR model used as the basis for most of the analysis fails to account for these features of the data. I will then propose a new model for understanding the disease's spread which more adequately explains these strange features.

Highly Different Spread:

Spain and Portugal started out with very different outbreaks, with Spain significantly more affected than Portugal. However, after the lockdown they converged to very similar growth rates. If some factor makes Portugal less susceptible to the disease, why has it not led to lower spread while under lockdown?

Image

Divergent Spread:

Early in the outbreak, New York State, Louisiana and Michigan had extremely similar death rates and growth. However, starting near the beginning of April, they began diverging.

Image

Most accounts have struggled to explain this, either asserting that the difference is due to bad data quality, or to some difference in the lockdown and social distancing policies.

Sweden's Semi-Failed Experiment:

Famously, Sweden decided not to embark on the lockdown that many of its neighbors did. Supporters of this decision point to the fact that its death rate has leveled off, while critics point out that it has stabilized at a rate nearly 10 times that of Norway.

Image

Based on the death rate, Sweden should be nowhere near herd immunity, so its outbreak should still be growing exponentially. If the lack of lockdown led to the higher death rates, why has it stabilized? If the lack of lockdown didn't matter (because people would social distance on their own), why did it stabilize at a rate 10x that of its neighbor?

Lockdown Timing:

An enduring mystery has also been that in many places the timing of the decline in infection rate does not line up particularly well with hard cutoffs for lockdown orders.

Image

In these three example states (which loosened lockdown on May 1) we can see that the slope of the deaths starts to curve before lockdown should have had any impact, and continues to curve or stay flat even after some of the lockdown restrictions have been lifted.

The Standard Model:

Most of the discussion and analysis around COVID-19 has used the standard model as a basis for understanding the spread of the disease and of methods for combatting it.

In the standard model there is essentially only one parameter that controls the course of the outbreak, R0. R0 represents how many other people we expect a single infected person to infect. As the disease spreads we can also track the RT which takes into account the fact that once you are infected you are immune to further infection.

As an example if the R0=2.0 that means the average person who is infected will infect two other people. This means that every round of transmission will double the number of people infected. Once 50% of the population has been infected, the RT will drop to 1.0 (two people are exposed to the disease but only one gets infected) and the number of people infected stabilizes.

We can see this behavior here.

Image
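As a rough illustration of the behavior described above, here is a minimal discrete-generation sketch of the standard model. It is not the code behind the graph; the function names, the population of 1,000,000 and the 10 initial infections are assumptions chosen for illustration. Each generation, RT is taken to be R0 times the fraction of the population still susceptible.

```python
def herd_immunity_threshold(r0):
    """Fraction infected at which RT = R0 * (1 - fraction) falls to 1."""
    return 1.0 - 1.0 / r0


def simulate_standard(r0, population=1_000_000, initial_infected=10, generations=40):
    """Return new infections per generation in a simple discrete-generation model.

    All default values are hypothetical, chosen only for illustration.
    """
    susceptible = float(population - initial_infected)
    new_infections = float(initial_infected)
    history = [new_infections]
    for _ in range(generations):
        # RT shrinks as the pool of susceptible people is used up.
        rt = r0 * susceptible / population
        new_infections = min(new_infections * rt, susceptible)
        susceptible -= new_infections
        history.append(new_infections)
    return history


if __name__ == "__main__":
    history = simulate_standard(r0=2.0)
    print("herd immunity threshold:", herd_immunity_threshold(2.0))
    print("generation with most new infections:", history.index(max(history)))
```

With R0 = 2.0 the threshold comes out at 50%, matching the example in the text. New infections roughly double each generation while almost everyone is susceptible, then fall once RT drops below 1.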

High Early Spread means High Peak Infection

The graph above helps demonstrate the core insight of the standard model. In the standard model, determining the precise rate of growth early on in the outbreak can reliably determine the total number of eventual infections. Any deviation from this path must be due to some intervention which affects the R0 parameter.

Image

Every Little Bit Helps:

The R0 is based on the average number of infection-causing interactions between two people. In order to reduce the spread of the disease we need to eliminate some percentage of these interactions, and it doesn't matter how we do this. If our initial R0 is 2.0, then in order to halt the spread of the disease we need to eliminate 50% of these interactions. If we compare two interventions:

  • Uniform: Where the transmission rate for everybody is cut in half.
  • Deviated: For half of the sick people at random we reduce their transmission rate to 0.

We can see that there is no meaningful difference between the two approaches.

Image
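The equivalence of the two interventions in the standard model comes down to simple averaging, which can be sketched as follows. The function names are hypothetical, and the sketch assumes that only the average number of onward infections matters, which is the standard model's premise.

```python
def effective_r_uniform(r0, reduction):
    """Every infected person's transmission is scaled by (1 - reduction)."""
    return r0 * (1.0 - reduction)


def effective_r_deviated(r0, fraction_blocked):
    """A random fraction of infected people transmit to nobody; the rest are unchanged."""
    return (1.0 - fraction_blocked) * r0 + fraction_blocked * 0.0


if __name__ == "__main__":
    print("uniform: ", effective_r_uniform(2.0, 0.5))
    print("deviated:", effective_r_deviated(2.0, 0.5))
```

Both calls give the same average, so a model driven only by that average cannot tell the two interventions apart.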

Eternal Vigilance:

Until enough of the population has been infected, lockdowns and distancing need to be maintained almost indefinitely, as any relaxation will quickly lead to infection rates growing exponentially again.

This need to be constantly on the lookout for disease spikes is the basis for the Hammer and the Dance strategy.

Here we see that a lockdown which is lifted whenever infections drop below a certain level still leads to a significant fraction of the population being infected.

Image
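A threshold-triggered lockdown of this kind can be sketched with the same discrete-generation logic. This is an illustrative sketch rather than the code used for the graph: the thresholds, the lockdown R0 of 0.7 and every other default below are assumed values.

```python
def simulate_threshold_lockdown(r0_open=2.0, r0_locked=0.7, population=1_000_000,
                                initial_infected=10, lock_above=10_000,
                                release_below=1_000, generations=200):
    """Return the fraction of the population ever infected.

    The lockdown switches on when new infections exceed `lock_above` and off
    when they fall below `release_below` (all values are hypothetical).
    """
    susceptible = float(population - initial_infected)
    new_infections = float(initial_infected)
    total_infected = float(initial_infected)
    locked = False
    for _ in range(generations):
        if new_infections > lock_above:
            locked = True
        elif new_infections < release_below:
            locked = False
        r0 = r0_locked if locked else r0_open
        new_infections = min(new_infections * r0 * susceptible / population, susceptible)
        susceptible -= new_infections
        total_infected += new_infections
    return total_infected / population


if __name__ == "__main__":
    print("with on/off lockdown:", simulate_threshold_lockdown())
    print("never locked down:   ", simulate_threshold_lockdown(r0_locked=2.0))
```

In this sketch each release restarts exponential growth, so the on/off cycle keeps adding infections until the susceptible pool itself is depleted enough to hold RT near 1.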

A Superspreader Model:

I have been analyzing a somewhat more complex model, which I think explains the observed dynamics of the outbreak in a more satisfying manner.

In this model we treat the population as consisting of two types of people:

  • General Population: Ordinary people
  • Superspreader Clusters: Groups of people who for various reasons spread the disease rapidly amongst themselves. Examples could include nursing homes, meat packing plants or prisons.

These dynamics have been discussed, but the focus has primarily been on protecting the members of the superspreader clusters. However the effect that these superspreader clusters have on the spread within the general population has not been as well explored.

Superspreader Transmission:

Even if we are only interested in the infection growth in the general population we still need to consider the indirect transmission paths where a member of the general population infects a superspreader cluster, and then that infection is retransmitted back into the general population.

If the direct transmission rate is greater than 1 then the disease will spread throughout the population just like in the standard model. However there are scenarios where the direct transmission rate is lower than 1 but the disease can still run rampant.

A helpful analogy may be to think of a fire which can either spread from house to house, or by the blaze setting off a fuel tank which then spreads the fire to other houses.

If the houses are packed closely enough together, the fire will spread out of control. However, even if they are well spread out, enough fuel tanks can still spread the fire in a similarly extreme manner.

Indirect Transmission Dynamics:

When the growth is driven by the indirect transmissions rather than the direct transmission, it has the following characteristics:

  • Exponential growth will stop when enough of the superspreader groups are infected, even if very little of the whole population is infected.
  • Since indirect transmission requires at least two jumps (from population -> superspreader and back), the rate of transmission will fall off more quickly.
  • Small differences in the transmission rate from the population to the superspreaders can cause large differences in the eventual size of the outbreak.
  • The initial stages of transmission are very uncertain: if a chain of transmissions happens to miss every superspreader cluster it will die out, but if it happens to hit one it can trigger a major outbreak.

Mathematically we can represent the total spread within the general population as

RT = RT_{Direct} + RT_{Indirect}

RT_{Direct} = R0_{p,p} * Vuln_{p}

RT_{Indirect} = R0_{s,p} * R0_{p,s} * Vuln_{p} * Vuln_{s} * SpreaderFactor

SpreaderFactor = \frac{1}{1 - R0_{s,s} * Vuln_{s}}
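The formulas above translate directly into a few lines of Python. This is a sketch, not the code behind the web UI. The subscripts p and s are read here as the general population and the superspreader clusters, which the text implies but does not state. Vuln is read as the fraction of each group still susceptible. The SpreaderFactor 1 / (1 - x) equals the geometric series 1 + x + x^2 + ..., which only converges for x < 1, so the sketch rejects larger values. All numbers in the usage example are made up.

```python
def spreader_factor(r0_ss, vuln_s):
    """1 / (1 - R0_{s,s} * Vuln_s), valid only while R0_{s,s} * Vuln_s < 1."""
    within = r0_ss * vuln_s
    if within >= 1.0:
        raise ValueError("R0_{s,s} * Vuln_s must be below 1 for the factor to be finite")
    return 1.0 / (1.0 - within)


def rt_total(r0_pp, r0_ps, r0_sp, r0_ss, vuln_p=1.0, vuln_s=1.0):
    """RT = RT_Direct + RT_Indirect, following the formulas in the text."""
    direct = r0_pp * vuln_p
    indirect = r0_sp * r0_ps * vuln_p * vuln_s * spreader_factor(r0_ss, vuln_s)
    return direct + indirect


if __name__ == "__main__":
    # Hypothetical parameters: direct spread alone is below 1.
    params = dict(r0_pp=0.8, r0_ps=0.05, r0_sp=10.0, r0_ss=0.5)
    print("all clusters susceptible:", rt_total(**params, vuln_s=1.0))
    print("20% of clusters susceptible:", rt_total(**params, vuln_s=0.2))
```

With these made-up numbers the direct term is 0.8 and the indirect term is 1.0, so the outbreak grows even though direct spread alone would die out. Once only 20% of the clusters remain susceptible, the total falls to about 0.91 even with the whole general population still susceptible.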

How these factors impact the speed and eventual scope of the outbreak can be highly nonintuitive, so I built a web UI that lets you play with the different parameters and see how the percentage of the population infected changes.

Link

Explanatory Power of the Superspreader Model:

The model that I am proposing is more complicated than the standard model, so Occam's Razor demands that it should explain the data we observe in a more satisfying manner. If we go back to the questions posed at the start of this article we can see how this superspreader model can better explain our observations.

Highly Different Initial Outbreaks:

When analyzed with the superspreader model, large differences in initial outbreak intensity, along with their eventual convergence, become easier to understand.

  • A large variance in initial spread can be explained by a relatively small difference in the transmission rates from the population to the spreader groups and within the spreader groups.
  • The rate of transmission can drop much more rapidly than the standard model would predict as more and more spreader groups are infected. This means that we can see convergence at relatively low levels of infection.

For instance the difference between these two outbreaks is only one extra infection of a spreader group per 1000 infections. We can see the divergence between the two scenarios based on a small change. However the growth still converges at a relatively low percentage.

Image
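The sensitivity to the population-to-cluster rate can be checked against the RT formula from earlier in the article. The sketch below is an illustration under assumed numbers, not the simulation behind the graph. It reads R0_{p,s} as the number of clusters seeded per infection in the general population, so that 0.001 versus 0.002 corresponds to one extra cluster infection per 1000 infections. It also holds Vuln_p at 1 to isolate the effect of cluster depletion. The function names and all parameter values are hypothetical.

```python
def rt_total(r0_pp, r0_ps, r0_sp, r0_ss, vuln_p=1.0, vuln_s=1.0):
    """RT = RT_Direct + RT_Indirect, as defined earlier in the article."""
    factor = 1.0 / (1.0 - r0_ss * vuln_s)  # assumes r0_ss * vuln_s < 1
    return r0_pp * vuln_p + r0_sp * r0_ps * vuln_p * vuln_s * factor


def critical_vuln_s(r0_pp, r0_ps, r0_sp, r0_ss, vuln_p=1.0, tol=1e-9):
    """Largest susceptible share of clusters at which RT is still <= 1 (bisection)."""
    if rt_total(r0_pp, r0_ps, r0_sp, r0_ss, vuln_p, 1.0) <= 1.0:
        return 1.0  # never supercritical, no cluster needs to be infected
    if r0_pp * vuln_p >= 1.0:
        return 0.0  # direct spread alone sustains the outbreak
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if rt_total(r0_pp, r0_ps, r0_sp, r0_ss, vuln_p, mid) > 1.0:
            hi = mid
        else:
            lo = mid
    return lo


if __name__ == "__main__":
    base = dict(r0_pp=0.7, r0_sp=100.0, r0_ss=0.5)  # hypothetical values
    for r0_ps in (0.001, 0.002):
        print(r0_ps, rt_total(r0_ps=r0_ps, **base), critical_vuln_s(r0_ps=r0_ps, **base))
```

With these assumed values, the lower rate gives an initial RT of about 0.9 and the outbreak dies out, while the higher rate gives about 1.1 and the outbreak grows. In the growing case, RT falls back to 1 once roughly 14% of the clusters have been infected, which is a small share of the total population if the clusters are small.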

Divergent Spread:

We have seen examples where different locations seem to be following similar outbreak paths before diverging. In the standard model, if the growth rate in two places is the same the paths can only diverge if the behavior changes between them. However in the superspreader model if the number of spreader groups is different then the exponential growth will last for different amounts of time and we will see divergence.

Image

Uniform Lockdown Results:

Looking at rt.live shows that their estimate of RT for almost all of the states has converged to somewhere between 0.8 and 1.1. Given that lockdown restrictions and compliance have differed significantly between states, this is surprising. One would expect that some states would lock down too little and continue to see strong exponential spread, or that other states would lock down extremely effectively and bring the infection rate much lower. Similarly, the formal loosening of restrictions in several states has not, so far, shown up in these statistics.

If the observed transmission mostly occurs through the superspreader groups both of these questions can be resolved.

  • Since the RT is reduced based on how many superspreaders have been impacted, we can expect it to converge to below 1 much more quickly than in the standard model.
  • If the tightening and loosening of lockdown primarily impacts the direct transmission rate, then we could see even severe lockdowns lead to relatively minor reductions in the transmission rate.

Here a lockdown which cuts the spread within the general population in half still leads to a large-scale outbreak, because the spread between the superspreader groups and the rest of the population is not controlled.

Image

Alternative Theories:

Other research has also investigated the implications of the disease spread and proposed that individual superspreaders could be responsible. It points to findings that a relatively small fraction of the population is responsible for a large number of infections. To test this hypothesis against the group model, I generated and compared four scenarios:

  • Uniform: Standard R0 of 2.0
  • Deviated: 90% of the population will spread with an R0 of 1.0, the other 10% with an R0 of 10.0.
  • Susceptible: Same as the deviated scenario, but those with an R0 of 10.0 are also 10 times as likely to be infected.
  • Group Spread: A scenario where the superspreaders are represented by 200 groups of size 500 (10% of the population).

Image
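One way to see why the scenarios differ is to compute the average R0 of a newly infected person at the very start of each outbreak. The sketch below is an illustrative calculation, not the author's simulation. It assumes that the chance of being infected is proportional to population share times relative susceptibility. The function name and the data layout are hypothetical, and the group spread scenario is omitted because it depends on cluster structure rather than on individual R0 values.

```python
def mean_r0_of_infected(groups):
    """Average R0 among newly infected people at the start of an outbreak.

    groups: list of (population_share, r0, relative_susceptibility) tuples.
    """
    weights = [share * susceptibility for share, _, susceptibility in groups]
    weighted_r0 = sum(w * r0 for w, (_, r0, _) in zip(weights, groups))
    return weighted_r0 / sum(weights)


# Scenario definitions taken from the list above.
SCENARIOS = {
    "uniform": [(1.0, 2.0, 1.0)],
    "deviated": [(0.9, 1.0, 1.0), (0.1, 10.0, 1.0)],
    "susceptible": [(0.9, 1.0, 1.0), (0.1, 10.0, 10.0)],
}

if __name__ == "__main__":
    for name, groups in SCENARIOS.items():
        print(name, round(mean_r0_of_infected(groups), 2))
```

Under this assumption the deviated scenario averages 1.9, close to the uniform 2.0, which is consistent with the two behaving similarly. In the susceptible scenario the high-R0 individuals are over-represented among early infections, so the starting average is about 5.7, and it falls as those individuals are used up.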

Only the group spread and susceptible scenarios show a significant deviation from uniform spread. The group spread model, however, is able to explain larger deviations in herd immunity with smaller assumptions about the difference in transmissibility.

Policy Implications:

If we suppose that the superspreader model accurately describes the course of the epidemic and that indirect transmission is the dominant factor, what are the implications for our behavior?

Protect and Isolate Superspreader Clusters:

In the superspreader model, preventing infections from reaching superspreader clusters has an outsized impact on the course of the epidemic. In discussions of nursing homes, the assumption is that stopping the disease from entering protects the people in the nursing homes. However, it is even more important for protecting the rest of the population. Some ways that we could do this are:

  • Pooled and/or randomized testing of known clusters: In many cases we have clusters that we know about and can pre-emptively devote resources to testing and catching outbreaks quickly. Sewage based testing is a promising avenue for detecting these problems earlier.
  • Retain and reinforce the protections separating these clusters from the general population.
  • Community based isolation on positive test results: When we detect an outbreak spreading within one of these clusters, immediately quarantining and isolating the entire community is extremely important. By contrast reducing the spread within the community has a much smaller impact.

Avoid Large Gatherings:

Large gatherings, especially ones which last for several days, like conventions, can cause the worst explosions of the outbreak. Not only can they cause a large number of infections within the group, but since the group is transient the infections are spread to many other people. Which types of large gatherings are particularly dangerous remains to be seen, but we will probably need to be especially cautious.

Reopen - With A Plan:

To the extent that indirect transmission plays a larger role in the disease's spread, we may have more room to start to leave lockdown than the raw transmission rates would suggest.

However, the outsized impact these superspreader clusters can have means that there is also a need for greater caution in certain places. As places reopen, some new clusters will emerge and allow outbreaks to spread rapidly again. A strong focus needs to be placed on identifying the most dangerous of these clusters and developing methods to allow them to be isolated quickly if necessary.

Code Samples:

The code used to generate the graphs on this page can be found here.
