@duxan
Created April 20, 2017 15:40
Bash script that, for each site in a VCFtools .frq file, extracts the minor allele (taken here as the second most frequent allele) together with its frequency (MAF, minor allele frequency).
#!/bin/bash
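# Usage: pass the .frq file as the first argument. The header line is skipped.
sed 1d "$1" | while read -r line
do
    # Fixed leading columns of a .frq row: CHROM, POS, N_ALLELES, N_CHR
    CHROM=$(echo "$line" | awk '{print $1}')
    POS=$(echo "$line" | awk '{print $2}')
    num_alleles=$(echo "$line" | awk '{print $3}')
    num_chroms=$(echo "$line" | awk '{print $4}')
    # Collect the frequency part of each ALLELE:FREQ column (columns 5 onward)
    mafs=()
    for i in $(seq 1 "$num_alleles"); do
        col=$((i + 4))
        mafs+=("$(echo "$line" | awk -v col="$col" '{print $col}' | awk -F ":" '{print $2}')")
    done
    # Sort frequencies in descending order; sorted[1] is the second highest
    IFS=$'\n' sorted=($(sort -gr <<< "${mafs[*]}"))
    unset IFS
    # Reset MAF so a site without a match does not reuse the previous value
    MAF=""
    # Match the dot literally and require the frequency to end the column,
    # so that 0.1 cannot match the start of 0.125
    re=".*[[:space:]]([ACGT]*:)(${sorted[1]//./[.]})([[:space:]]|$)"
    if [[ -n ${sorted[1]} && $line =~ $re ]]; then
        MAF="${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
    fi
    echo "$CHROM $POS $num_alleles $num_chroms $MAF"
done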
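
The loop above starts several awk processes for every site, which can be slow on large .frq files. As a rough alternative, here is a minimal sketch that does the same selection in a single awk pass. The function name `frq_second_allele` is hypothetical and not part of the original script. It assumes the usual VCFtools layout (one header line, then CHROM, POS, N_ALLELES, N_CHR, followed by one ALLELE:FREQ column per allele). Unlike the regex above, it does not restrict alleles to the letters A, C, G and T.

```shell
#!/bin/bash
# Hypothetical helper (not from the original gist): print CHROM, POS,
# N_ALLELES, N_CHR and the second most frequent ALLELE:FREQ for each site.
frq_second_allele() {
    awk 'NR > 1 {
        best = -1; second = -1; best_col = 0; second_col = 0
        # Columns 5 onward hold ALLELE:FREQ pairs, one per allele.
        for (i = 5; i <= NF; i++) {
            split($i, pair, ":")
            freq = pair[2] + 0
            if (freq > best) {
                # New maximum: the previous maximum becomes the runner-up.
                second = best; second_col = best_col
                best = freq; best_col = i
            } else if (freq > second) {
                second = freq; second_col = i
            }
        }
        # Sites with a single allele have no runner-up; the last field stays empty.
        print $1, $2, $3, $4, (second_col ? $second_col : "")
    }' "$1"
}
```

After sourcing the function, it would be called with the path to a .frq file as its only argument, for example `frq_second_allele out.frq`. When two alleles tie for the highest frequency, this sketch reports the one that appears later in the row.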