Instantly share code, notes, and snippets.

# cjbayesian/HW_sim.R

Created January 24, 2016 20:50
Star You must be signed in to star a gist
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 ################################################################################################# #### This is a simulation to demonstrate how real populations reach Hardy Weinburg equilibrium #### under random mating. #### Author: Corey Chivers, 2011 ################################################################################################# cross<-function(parents) { offspring<-c('d','d') #initiate a child object offspring<-sample(parents[1,],1) offspring<-sample(parents[2,],1) return(offspring) } random_mating<-function() { tmp_pop<-pop for(n in 1:N) { parents<-sample(1:N,2) tmp_pop[n,]<-cross(pop[parents,]) } pop<-tmp_pop } genotypes<-c('A','a') p<-0.5 q<-1-p N=200 a_freq<-c(p,q) pop<-array(sample(genotypes,2*N,p=a_freq,replace=T),dim=c(N,2)) I<-200 num_generations=1 g_freq<-array(dim=c(I,3)) p_vec<-array(dim=I) for(i in 1:I) { p<-runif(1,0,1) q<-1-p a_freq<-c(p,q) pop[,1]<-sample(genotypes,N,p=a_freq,replace=T) pop[,2]<-sample(genotypes,N,p=a_freq,replace=T) for(g in 1:num_generations) random_mating() f_aa<-0 f_Aa<-0 f_AA<-0 for(n in 1:N) { if(identical(pop[n,],c('A','A'))) f_AA=f_AA+1 if(identical(pop[n,],c('A','a')) || identical(pop[n,],c('a','A') )) f_Aa=f_Aa+1 if(identical(pop[n,],c('a','a'))) f_aa=f_aa+1 } f_aa<-f_aa/N f_Aa<-f_Aa/N f_AA<-f_AA/N g_freq[i,]<-c(f_AA,f_Aa,f_aa) p_vec[i]<-p } pdf('HW.pdf') ## Plot the sims plot(p_vec,g_freq[,1],col='green',xlab='p, or 1-q',ylab='') points(p_vec,g_freq[,2],col='red') points(p_vec,g_freq[,3],col='blue') ## Theoretical Curves curve(x^2,col='green',add=T) curve((1-x)^2,col='blue',add=T) curve(2*x*(1-x),col='red',add=T) dev.off()

### mtaylor-semo commented Jan 24, 2019

Thank you for your code and effort. I am adapting your code for a Shiny app. I want to clarify why you used `I` to dimension your arrays instead of `N`. I assume you wanted to ensure 200 points are plotted every time. I also assume that it is coincidence that `I` and `N` are the same values inititally. (I read your blog post stating that your students changed the value of `N`.)

If I set `I = N`, and set `N` to some value greater than 200, then the number of plotted points increases, but if I leave `I = 200`, then I get what appears to be 200 points, regardless of `N`. Would you mind clarifying your use of `I` in this context? Thanks again!

### cjbayesian commented Feb 19, 2020

`I` represents the number of populations to simulate, each with a random fraction `p`. `N` is the size of each population. So, if you increase `I`, you'll simulate more populations. If you increase `N`, you'll increase the number of individuals in each population. The bigger the population, the closer the genotype proportions will be to those predicted by HW.