Last active
March 28, 2022 00:58
Convert CEL to VCF with PLINK
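#!/bin/bash
# Convert each CEL text export to PLINK LGEN/MAP/FAM, then to BED and VCF.
# Requires pee (from moreutils) and plink.
mkdir -p lgen plink vcf
for i in folder_of_CELs/*; do
  j=$(basename "$i")
  # Skip the 13-line header, then fan the rows out to three awk commands via pee.
  # FAM needs one row per sample, so only the first occurrence of each ID is kept.
  tail -n +14 "$i" | pee \
    "awk -F '\t' '{print \"FAMID\",\$1,\$2,\$5,\$6}' > 'lgen/$j.lgen'" \
    "awk -F '\t' '{print \$3,\$2,0,\$4}' > 'lgen/$j.map'" \
    "awk -F '\t' '!seen[\$1]++ {print \"FAMID\",\$1,0,0,0,0}' > 'lgen/$j.fam'"
  plink --lgen "lgen/$j.lgen" --fam "lgen/$j.fam" --map "lgen/$j.map" --make-bed --out "plink/$j"
  plink --lgen "lgen/$j.lgen" --fam "lgen/$j.fam" --map "lgen/$j.map" --recode vcf --out "vcf/$j"
done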
This is not a particularly elegant or efficient solution, but it works. Thanks to pee (from moreutils), each CEL file is read only once, even though three output files are written from it.
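If pee is not available, a single awk process can probably do the same job in one pass by redirecting each print to its own file. The sketch below is an alternative technique, not the script's own method. The function name split_cel, the demo file, and its contents are all made up for illustration, and the column order (sample ID, SNP ID, chromosome, position, allele 1, allele 2) is only inferred from the awk field numbers in the script.

```shell
#!/bin/bash
# Sketch: one awk pass writes .lgen, .map and .fam, so pee is not needed.
# Assumed column order after the header (inferred from the script, not confirmed):
#   1=sample ID  2=SNP ID  3=chromosome  4=position  5=allele 1  6=allele 2
set -euo pipefail

# split_cel INPUT PREFIX  (hypothetical helper name)
split_cel() {
  local in="$1" prefix="$2"
  tail -n +14 "$in" | awk -F '\t' -v p="$prefix" '
    { print "FAMID", $1, $2, $5, $6 > (p ".lgen") }             # FID IID SNP A1 A2
    { print $3, $2, 0, $4 > (p ".map") }                        # chr SNP cM bp
    !seen[$1]++ { print "FAMID", $1, 0, 0, 0, 0 > (p ".fam") }  # one row per sample
  '
}

# Made-up demo input: 13 placeholder header lines, then two genotype rows.
demo=$(mktemp -d)
{
  for n in {1..13}; do echo "header $n"; done
  printf 'S1\trs1\t1\t100\tA\tG\n'
  printf 'S1\trs2\t1\t200\tC\tC\n'
} > "$demo/sample.txt"

split_cel "$demo/sample.txt" "$demo/sample"
cat "$demo/sample.lgen" "$demo/sample.map" "$demo/sample.fam"
```

In the loop above, this would be called as something like split_cel "$i" "lgen/$j" in place of the tail and pee pipeline. The plink commands would stay unchanged.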
The CEL header should look something like:
The LGEN, FAM and MAP files should look like: