@tomsing1
Created June 7, 2021 21:58
Shell script to sub-sample a BAM file
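# Shell function to subsample a BAM file to approximately a fixed number of
# alignments. Requires the sambamba and samtools suites on the PATH, and an
# indexed BAM file (samtools idxstats reads its counts from the index).
# see https://www.biostars.org/p/76791/
function SubSample {
  local BAM=$1 COUNT=$2 FACTOR
  # Sampling fraction = requested count / mapped alignments (idxstats column 3).
  # The comparison is done numerically in awk, which exits non-zero when the
  # request cannot be met; a [[ ... > 1 ]] test would compare strings instead.
  if ! FACTOR=$(samtools idxstats "$BAM" | cut -f3 | \
    awk -v COUNT="$COUNT" '{total += $1}
      END {if (total == 0 || COUNT > total) exit 1; printf "%.8f\n", COUNT/total}')
  then
    # Report on stderr so the message does not end up in the redirected BAM output,
    # and return rather than exit so a sourcing shell is not terminated.
    echo "[ERROR]: Requested number of reads exceeds total read count in $BAM" >&2
    return 1
  fi
  sambamba view -s "$FACTOR" -t 2 -f bam -l 5 "$BAM"
}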
# example: keep roughly 10,000 alignments (sampling is by fraction, so the final count is approximate)
SubSample original.bam 10000 > subsampled.bam
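
The fraction calculation can be tried without a BAM file by feeding idxstats-formatted text to awk directly. The following is a minimal sketch, not part of the original gist: `sampling_fraction` is a hypothetical helper name, and the input rows are made-up values laid out in the four tab-separated idxstats columns (reference name, sequence length, mapped reads, unmapped reads).

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not from the gist): reads idxstats-style
# text on stdin and prints the fraction needed to keep about COUNT alignments.
sampling_fraction() {
  local count=$1
  awk -F'\t' -v count="$count" '
    { total += $3 }   # column 3 holds the mapped read count per reference
    END {
      # Fail when there is nothing to sample or the request is too large.
      if (total == 0 || count > total) exit 1
      printf "%.8f\n", count / total
    }'
}

# Made-up input: two references with 600 + 400 mapped reads; ask for 100.
printf 'chr1\t1000\t600\t0\nchr2\t2000\t400\t0\n*\t0\t0\t50\n' | sampling_fraction 100
```

With 1000 mapped reads in total and 100 requested, the helper prints a fraction of one tenth; asking for more reads than are present makes it return a non-zero status instead.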