Skip to content

Instantly share code, notes, and snippets.

@XMPPwocky
Created August 9, 2021 21:37
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save XMPPwocky/b23b53ba5574fcc1fdf7aaa7c6e485a7 to your computer and use it in GitHub Desktop.
Save XMPPwocky/b23b53ba5574fcc1fdf7aaa7c6e485a7 to your computer and use it in GitHub Desktop.
```python
# ye olde imports
import math, random
```
Imagine a virus which is, somehow, a single protein with only 3 "amino acids" (actually just uppercase letters here, oh well)
Currently, only one type is circulating, which is "XYZ".
```python
EXISTING_TYPE = "XYZ"
```
But there's a possible variant, the Super Nasty Variant, "BAD", which escapes antibodies and is hard to even create good antibodies against (meaning even a vaccine targeted specifically at it has trouble):
```python
SUPER_NASTY_TYPE = "BAD"
```
Now we'll run a comically oversimplified simulation of evolution: starting from a population of 100 copies of the existing virus (XYZ)...
```python
POPULATION_SIZE = 100
def make_starting_population(): return [EXISTING_TYPE] * POPULATION_SIZE
```
Each generation, every virus makes a copy of itself (to form a population twice as big), possibly mutating in the process.
Then, each virus may die (in this case, be neutralized by antibodies or whatever); the odds of survival each generation is the virus's fitness.
Note that the steady-state here happens for viruses with fitness of 0.5 (they make a copy of themselves each generation, but on average the original or the copy dies).
```python
def evolve(starting_population, fitness):
population = starting_population
new_population = [virus for virus in population]
# each virus makes a copy of itself, maybe mutating...
for virus in population:
new_population.append(maybe_mutate(virus))
# but then some maybe die, based on their fitness:
population = []
for virus in new_population:
if fitness(virus) > random.random():
# virus survived
population.append(virus)
if not population:
population = [EXISTING_TYPE] # goofy hack so that the virus never completely dies out
return population
def geometric_mean(x):
if not x: return float("nan")
return math.prod(x) / len(x)
MAX_GENERATIONS = 2500 # timeout to keep things sensible
def run_experiment(fitness_fn, quiet=True):
population = make_starting_population()
if not quiet: print("Generation\tPop. size\tAvg. fitness\tMax. fitness")
for generation in range(MAX_GENERATIONS):
if not population:
if not quiet: print("Virus died out...")
break
if generation % 100 == 0:
fitnesses = [fitness_fn(virus) for virus in population]
avg_fitness = geometric_mean(fitnesses)
max_fitness = max(fitnesses)
if not quiet: print("{:5}\t\t{:8}\t{:5.4f}\t\t{:5.4f}".format(generation, len(population), avg_fitness, max_fitness))
population = evolve(population, fitness_fn)
if SUPER_NASTY_TYPE in population:
if not quiet: print("Super Nasty Variant evolved in {} generations!".format(generation))
return generation
```
In this model, "mutations" are just random amino acid substitutions:
```python
ERROR_RATE = 1e-3 # 1/1000 chance per amino acid to replace it with a random one
def random_letter():
return chr(ord("A")+random.randint(0, 26))
def maybe_mutate(virus):
# random substitutions
newvirus = virus
for i in range(len(newvirus)):
if random.random() < ERROR_RATE:
# mutate this position
newvirus = newvirus[:i] + random_letter() + newvirus[i+1:]
return newvirus
```
If each of the individual 3 substitutions (X->B, Y->A, Z->D) required to get from the existing type to the Super Nasty Type confers a small advantage on its own...
```python
SINGLE_SUBSTITUTION_FITNESS_GAIN = 1.05 # 5% advantage per substitution
BASE_FITNESS = 0.5 # steady state...
def fitness_normal(virus):
num_mutations = 0
if virus[0] == "B": num_mutations += 1
if virus[1] == "A": num_mutations += 1
if virus[2] == "D": num_mutations += 1
return BASE_FITNESS * pow(SINGLE_SUBSTITUTION_FITNESS_GAIN, num_mutations)
```
then within a few thousand generations, the Super Nasty Variant usually shows up:
```python
for experiment in range(20):
generations_to_nasty = run_experiment(fitness_normal, quiet=True)
if generations_to_nasty is not None:
print("Experiment #{}: Super Nasty Variant evolved in {} generations".format(experiment, generations_to_nasty))
else:
print("Experiment #{}: Super Nasty Variant never evolved".format(experiment))
```
Experiment #0: Super Nasty Variant never evolved
Experiment #1: Super Nasty Variant evolved in 2445 generations
Experiment #2: Super Nasty Variant evolved in 701 generations
Experiment #3: Super Nasty Variant never evolved
Experiment #4: Super Nasty Variant evolved in 331 generations
Experiment #5: Super Nasty Variant evolved in 722 generations
Experiment #6: Super Nasty Variant evolved in 472 generations
Experiment #7: Super Nasty Variant evolved in 2227 generations
Experiment #8: Super Nasty Variant never evolved
Experiment #9: Super Nasty Variant evolved in 627 generations
Experiment #10: Super Nasty Variant evolved in 516 generations
Experiment #11: Super Nasty Variant evolved in 195 generations
Experiment #12: Super Nasty Variant evolved in 693 generations
Experiment #13: Super Nasty Variant evolved in 300 generations
Experiment #14: Super Nasty Variant never evolved
Experiment #15: Super Nasty Variant evolved in 1506 generations
Experiment #16: Super Nasty Variant evolved in 837 generations
Experiment #17: Super Nasty Variant evolved in 643 generations
Experiment #18: Super Nasty Variant evolved in 302 generations
Experiment #19: Super Nasty Variant never evolved
But if all 3 mutations must occur together to get a fitness gain...
```python
def fitness_new(virus):
num_mutations = 0
if virus[0] == "B": num_mutations += 1
if virus[1] == "A": num_mutations += 1
if virus[2] == "D": num_mutations += 1
# now, without all 3 mutations, it's no better than no mutations:
if num_mutations < 2: num_mutations = 0
return BASE_FITNESS * pow(SINGLE_SUBSTITUTION_FITNESS_GAIN, num_mutations)
```
... it's much rarer (and slower) for the variant to evolve in the first place, even though it's just as bad when it does show up.
```python
for experiment in range(20):
generations_to_nasty = run_experiment(fitness_new, quiet=True)
if generations_to_nasty is not None:
print("Experiment #{}: Super Nasty Variant evolved in {} generations".format(experiment, generations_to_nasty))
else:
print("Experiment #{}: Super Nasty Variant never evolved".format(experiment))
```
Experiment #0: Super Nasty Variant never evolved
Experiment #1: Super Nasty Variant never evolved
Experiment #2: Super Nasty Variant never evolved
Experiment #3: Super Nasty Variant never evolved
Experiment #4: Super Nasty Variant never evolved
Experiment #5: Super Nasty Variant never evolved
Experiment #6: Super Nasty Variant never evolved
Experiment #7: Super Nasty Variant never evolved
Experiment #8: Super Nasty Variant never evolved
Experiment #9: Super Nasty Variant never evolved
Experiment #10: Super Nasty Variant never evolved
Experiment #11: Super Nasty Variant never evolved
Experiment #12: Super Nasty Variant never evolved
Experiment #13: Super Nasty Variant never evolved
Experiment #14: Super Nasty Variant never evolved
Experiment #15: Super Nasty Variant never evolved
Experiment #16: Super Nasty Variant never evolved
Experiment #17: Super Nasty Variant never evolved
Experiment #18: Super Nasty Variant evolved in 943 generations
Experiment #19: Super Nasty Variant never evolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment