@ngcrawford
Last active December 20, 2015 04:58
Python script for running multiple bowtie2 jobs with a variety of input fastq and read groups.
  1. Modify sample_info.txt to describe your samples. I'd recommend doing this in Excel and exporting the sheet as a CSV file. The first line is a header and cannot be changed.
  2. Run the bowtie2 commands with python run_bowtie2.py
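The script uses the header row as the keys for its command template, so a renamed or missing column only surfaces as a KeyError once a job is launched. A minimal sketch of a pre-flight check, assuming the column names shown in the sample sheet; `check_header` and `EXPECTED` are hypothetical names, not part of the gist:

```python
import csv
import io

# Column names the command template looks up (copied from the sample sheet header).
EXPECTED = ["processors", "plate", "lane", "bwt2_idx",
            "sample_id", "library_id", "fq1", "fq2"]


def check_header(handle):
    """Return the expected columns that are missing from the first row of a CSV file."""
    header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in EXPECTED if name not in present]


# Hypothetical usage: an in-memory sheet whose header lacks the fq2 column.
sheet = io.StringIO("processors,plate,lane,bwt2_idx,sample_id,library_id,fq1\n")
print(check_header(sheet))  # → ['fq2']
```

An empty list means every column the template needs is present.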
import shlex
from subprocess import Popen, PIPE


def run_bowtie2(args_dict):
    """Align one sample with bowtie2 and write <sample_id>.bam via samtools."""
    print('running {0[sample_id]}\n'.format(args_dict))

    # Whitespace inside the template is harmless: shlex.split() collapses it.
    cli1 = "\
        bowtie2 \
        -p {0[processors]} \
        --very-sensitive-local \
        --rg ID:{0[plate]}.{0[lane]} \
        --rg SM:{0[sample_id]} \
        --rg LB:{0[library_id]} \
        --rg PL:ILLUMINA \
        --rg-id {0[plate]}.{0[lane]} \
        -x {0[bwt2_idx]} \
        -1 {0[fq1]} \
        -2 {0[fq2]}".format(args_dict)

    cli2 = "samtools view -Sb -"

    # bowtie2's stderr is left attached to the terminal; a PIPE that is
    # never read can fill up and stall the aligner.
    p1 = Popen(shlex.split(cli1), stdout=PIPE)
    with open('{0[sample_id]}.bam'.format(args_dict), 'wb') as bam:
        p2 = Popen(shlex.split(cli2), stdin=p1.stdout, stdout=bam)
        p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
        p2.communicate()
    p1.wait()


if __name__ == '__main__':
    IDs = None
    # Text mode already applies universal newlines in Python 3 (formerly 'rU').
    with open('sample_info.txt') as fin:
        for count, line in enumerate(fin):
            line_parts = line.strip().split(",")
            if count == 0:
                IDs = line_parts  # The header row supplies the dict keys.
                continue
            if len(line_parts) != len(IDs):
                break  # Stop at the first blank or malformed row.
            else:
                b = dict(zip(IDs, line_parts))
                run_bowtie2(b)
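The script formats one long string and then splits it with shlex.split, so a FASTQ path containing a space would be split into two arguments. A sketch of an alternative that assembles the argument list directly, using only the flags already present in the script; `build_bowtie2_argv` is a hypothetical helper, not part of the gist:

```python
def build_bowtie2_argv(args):
    """Assemble the bowtie2 argument list for one row of the sample sheet."""
    rg_id = "{0[plate]}.{0[lane]}".format(args)
    return [
        "bowtie2",
        "-p", str(args["processors"]),
        "--very-sensitive-local",
        "--rg", "ID:" + rg_id,
        "--rg", "SM:" + args["sample_id"],
        "--rg", "LB:" + args["library_id"],
        "--rg", "PL:ILLUMINA",
        "--rg-id", rg_id,
        "-x", args["bwt2_idx"],
        "-1", args["fq1"],
        "-2", args["fq2"],
    ]


# Hypothetical row; the fq1 path contains a space on purpose.
row = {
    "processors": "4", "plate": "P1", "lane": "4",
    "bwt2_idx": "test_run.RADnome", "sample_id": "sample1",
    "library_id": "sample1", "fq1": "my reads/fq1.10k.fq", "fq2": "fq2.10k.fq",
}
argv = build_bowtie2_argv(row)
print(argv)
```

The resulting list can be passed to Popen as is, without shlex.split, and each path stays a single argument.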
processors,plate,lane,bwt2_idx,sample_id,library_id,fq1,fq2
4,P1,4,test_run.RADnome,sample1,sample1,fq1.10k.fq,fq2.10k.fq
4,P1,4,test_run.RADnome,sample1,sample1,fq1.10k.fq,fq2.10k.fq
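Each job writes its output to `<sample_id>.bam`, so two rows that share a `sample_id` (as the two example rows above do) leave a single file holding only the later run. A minimal sketch of a check for repeated IDs, using the standard library's csv.DictReader; `duplicate_sample_ids` is a hypothetical helper, not part of the gist:

```python
import csv
import io
from collections import Counter


def duplicate_sample_ids(handle):
    """Return the sample_id values that occur on more than one row, sorted."""
    counts = Counter(row["sample_id"] for row in csv.DictReader(handle))
    return sorted(sid for sid, n in counts.items() if n > 1)


# In-memory copy of the example sheet, so the sketch runs without any files.
sheet = io.StringIO(
    "processors,plate,lane,bwt2_idx,sample_id,library_id,fq1,fq2\n"
    "4,P1,4,test_run.RADnome,sample1,sample1,fq1.10k.fq,fq2.10k.fq\n"
    "4,P1,4,test_run.RADnome,sample1,sample1,fq1.10k.fq,fq2.10k.fq\n"
)
print(duplicate_sample_ids(sheet))  # → ['sample1']
```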