@nievergeltlab
Created April 19, 2017 20:55
Descriptive statistics for genotype merge discordances
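#Compare genotype calls in PLINK using merge mode 6 (report all mismatching calls) or 7 (report mismatching non-missing calls)
#These two modes write a .diff report of the discordances instead of a merged fileset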
./plink --bfile YEHUDA --bmerge YEHUDA.bed YEHUDA.bim YEHUDA.fam --merge-mode 7 --out yehude-merge
R
setwd('F:/rutgers_2')
#Read the discordance report written by PLINK (at most 800000 rows)
dat <- read.table('yehude-merge.diff', header=TRUE, nrows=800000, stringsAsFactors=FALSE)
library(plyr)
#Count discordant calls per SNP and look at the distribution. SNPs with especially many discordances may be badly genotyped
quantile(table(dat$SNP))
#Plot the distribution of per-SNP discordance counts
hist(table(dat$SNP))
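#Sketch, not part of the original gist: list the SNPs with the most discordant calls
#top_discordant_snps is a hypothetical helper; it only assumes a SNP column, as in dat above
top_discordant_snps <- function(d, n=10) {
 #Tabulate calls per SNP, sort largest first, keep the first n
 head(sort(table(d$SNP), decreasing=TRUE), n)
}
#Usage: top_discordant_snps(dat)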
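#Count discordances for each subject: dim() of each subject's rows gives (number of rows, number of columns)
#Returns a data frame with one row per subject that has at least one discordance
#Columns are IID, V1 and V2; column 2 (V1) is the number of discordances
dimcheck <- ddply(dat, ~IID, dim)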
#See average discordances by subject
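#Sketch, not part of the original gist: summarize_subjects is a hypothetical helper
#It assumes the IID and V1 columns produced by the ddply call above (V1 = discordance count)
summarize_subjects <- function(d, n=10) {
 list(mean=mean(d$V1),
      quantiles=quantile(d$V1),
      #Subjects with the most discordances first
      worst=head(d[order(-d$V1), c("IID","V1")], n))
}
#Usage: summarize_subjects(dimcheck)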