@rhshah
Forked from dakl/tcga-maf-to-vcf.sh
Created January 29, 2020 18:32
Convert TCGA MAFs to VCFs that can be used to annotate other VCF files with variant frequencies from the various TCGA projects (including pancan12).
#!/usr/bin/env bash
# The script uses bash arrays, so it needs bash rather than plain sh.
REF=/proj/b2010040/private/nobackup/autoseqer-genome/genome/human_g1k_v37_decoy.fasta
DICT=/proj/b2010040/private/nobackup/autoseqer-genome/genome/human_g1k_v37_decoy.dict
GATKJAR=/home/daniel.klevebring/projects/tcga-maf-to-vcf/GenomeAnalysisTK.jar
WORKDIR=$HOME/Crisp/dakl/tcga-maf-to-vcf
mkdir -p "$WORKDIR"
cd "$WORKDIR" || exit 1
# unpack the MAFs into the working directory
tar xvfz ~/projects/tcga-maf-to-vcf/pancan_cleaned_mafs.tar.gz
if [ ! -f maf2vcf.pl ]; then
    wget https://raw.githubusercontent.com/ckandoth/vcf2maf/master/maf2vcf.pl
fi
if [ ! -f vcfsorter.pl ]; then
    wget https://gist.github.com/dakl/b14aa4648a1ef17cee8c/raw/f1de9791460a3a1f60217cd2e6c0cd72ae78ae91/vcfsorter.pl
fi
#MAF=/home/daniel.klevebring/Crisp/dakl/tcga-maf-to-vcf/somatic_mafs_cleaned/thca_cleaned.maf
# collect every cleaned MAF (one per TCGA project)
MAFS=($(find "$WORKDIR/somatic_mafs_cleaned" -name '*.maf'))
for MAF in "${MAFS[@]}"; do
    TMP=$(basename "$MAF")
    # e.g. thca_cleaned.maf -> tcga-thca
    GRP=tcga-${TMP/_cleaned.maf/}
    echo -n "$GRP"
    mkdir -p "$GRP"
    perl maf2vcf.pl --input-maf "$MAF" --output-dir "$GRP" --ref-fasta "$REF"
    VCFS=($(find "$GRP" -name '*.vcf'))
    #VCF=laml/TCGA-AB-2802-03B-01W-0728-08_vs_TCGA-AB-2802-11B-01W-0728-08.vcf
    # sort, normalize, compress and index the generated VCFs
    for VCF in "${VCFS[@]}"; do
        # note: the output is bgzip-compressed even though its name ends in .vcf
        SORTEDVCF=${VCF/.vcf/_sorted.vcf}
        perl vcfsorter.pl "$DICT" "$VCF" | vt normalize -r "$REF" - | bgzip > "$SORTEDVCF"
        tabix -p vcf "$SORTEDVCF"
        #rm $VCF
    done
    SORTEDVCFS=($(find "$GRP" -name '*_sorted.vcf'))
    FINALVCF=${GRP}-somatic.vcf.gz
    # merge the sorted VCFs into one bgzipped, indexed VCF per project
    bcftools merge --merge none --force-samples "${SORTEDVCFS[@]}" | bgzip > "$FINALVCF"
    tabix -p vcf "$FINALVCF"
    echo " done."
    rm -r "$GRP"
done
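
The file names in the script above all come from bash parameter expansion, which can be hard to read at a glance. The following is a minimal sketch that isolates just that naming logic so it can be checked without running any of the tools. The function names `group_for_maf`, `sorted_name_for_vcf` and `final_vcf_for_group` are hypothetical helpers introduced here for illustration; they are not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the original script) that mirror its naming
# rules. They only manipulate strings; no files are read or written.

# MAF path -> group name, e.g. somatic_mafs_cleaned/thca_cleaned.maf -> tcga-thca
group_for_maf() {
    local base
    base=$(basename "$1")
    echo "tcga-${base/_cleaned.maf/}"
}

# VCF path -> sorted VCF path (replaces the first ".vcf" with "_sorted.vcf")
sorted_name_for_vcf() {
    echo "${1/.vcf/_sorted.vcf}"
}

# group name -> name of the merged per-project VCF
final_vcf_for_group() {
    echo "${1}-somatic.vcf.gz"
}

grp=$(group_for_maf "somatic_mafs_cleaned/thca_cleaned.maf")
echo "$grp"
final_vcf_for_group "$grp"
```

Under these assumptions, the example prints `tcga-thca` followed by `tcga-thca-somatic.vcf.gz`, which matches the `GRP` and `FINALVCF` values the loop would produce for the thca MAF.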