@explodecomputer
Last active August 29, 2015 14:01
extractSnps.sh
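#!/bin/bash
# Extract a list of SNPs from per-chromosome plink filesets and merge
# the results into a single bed/bim/fam fileset.
#
# Usage: ./extractSnps.sh <snplistfile> <plinkrt> <outfile>
#   snplistfile  text file with one SNP per line
#   plinkrt      plink fileset root, with CHR wherever the chromosome number goes
#   outfile      prefix for the merged output fileset
# set -e
snplistfile=${1}
plinkrt=${2}
outfile=${3}

# Start from an empty merge list
: > "${outfile}_mergelist.txt"
firstchr="1"
flag="0"
for i in {01..23}
do
    # Substitute the chromosome number for every CHR in the root path
    filename=$(sed -e "s/CHR/$i/g" <<< "${plinkrt}")
    echo ""
    echo "$filename"
    echo "${outfile}_${i}"
    echo ""
    plink1.90 --noweb --bfile "${filename}" --extract "${snplistfile}" --make-bed --out "${outfile}_${i}"
    echo "$?"
    # Only chromosomes for which plink wrote a .bed file go into the merge list
    if [ -f "${outfile}_${i}.bed" ]; then
        echo "${outfile}_${i}.bed ${outfile}_${i}.bim ${outfile}_${i}.fam" >> "${outfile}_mergelist.txt"
        # Remember the first chromosome that produced output
        if [ "${flag}" == "0" ]; then
            firstchr=${i}
        fi
        flag="1"
    fi
done
# The first fileset is passed via --bfile, so drop it from the merge list
sed -i 1d "${outfile}_mergelist.txt"
plink1.90 --noweb --bfile "${outfile}_${firstchr}" --merge-list "${outfile}_mergelist.txt" --make-bed --out "${outfile}"
# Remove the per-chromosome intermediates and the merge list
rm "${outfile}"_*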
@explodecomputer

The 1000 genomes imputed ALSPAC data is split into separate chromosomes because a single file would be very large. This script can be used to extract a list of SNPs from any of the chromosomes and combine them into a single bed/bim/fam plink file set.
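
The per-chromosome paths come from a single template in which CHR stands for the chromosome number. The script substitutes 01 to 23 for it with sed. A minimal sketch of that substitution, assuming a made-up template path (the function name expand_root is also just for illustration):

```shell
# Sketch only: the template below is a made-up path, not a real dataset location.
template="data/chrCHR/study_CHR"

# Replace every CHR in the template with a chromosome label.
expand_root() {
    printf '%s\n' "$1" | sed -e "s/CHR/$2/g"
}

# extractSnps.sh loops over 01 to 23; three labels are enough to show the pattern.
for chr in 01 02 23
do
    expand_root "$template" "$chr"
done
```

Note that every occurrence of CHR is replaced, so the template can use it in both the directory name and the file prefix.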

It uses plink1.90 because it is much faster than the original plink. Download it from:
https://www.cog-genomics.org/plink2/
Move the executable to a folder called ~/bin in your home directory, make sure it is named plink1.90 (the command the script calls), and add this directory to your path by including this line in .bash_profile:

PATH=$PATH:$HOME/bin
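
After opening a new shell, it may be worth confirming that the executable is actually found. A small sketch, assuming the executable was saved as plink1.90 (have_cmd is a helper name made up for this example):

```shell
# Report whether a command can be found on the current PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd plink1.90; then
    echo "plink1.90 found at: $(command -v plink1.90)"
else
    echo "plink1.90 not found; check that ~/bin is on PATH and the executable is named plink1.90"
fi
```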

Then to run the script you would save it somewhere, set permissions to executable e.g.

chmod 755 extractSnps.sh

and then run something like:

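./extractSnps.sh <snplistfile> /panfs/panasas01/shared/alspac/deprecated/alspac_combined_1kg_20140424/chrCHR/alspac_1kg_p1v3_CHR <outfile>

where <snplistfile> is just a file with a single SNP per line, <outfile> is the prefix you'd like to have before .bed/.bim/.fam, and each CHR in the path is replaced by the script with the chromosome number (01 to 23).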
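
As an illustration of the SNP list format, the file can be written straight from the shell. This is only a sketch: the rsIDs and the file name snplist.txt are placeholders, not real variants of interest.

```shell
# Write a SNP list with one ID per line (these rsIDs are placeholders only).
printf '%s\n' rs0000001 rs0000002 rs0000003 > snplist.txt

# Count the non-empty lines as a quick sanity check before running the script.
n=$(grep -c . snplist.txt)
echo "snplist.txt holds $n SNP IDs"
```

The file has no header line and no other columns, just one ID per line.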
