@jonesken
jonesken / 1x50bp_QC_commandline.py
Created August 20, 2013 13:24
This script trims the Q15 tails from Illumina 1x50bp FASTQ files generated by CASAVA v1.8 and removes sequences that are shorter than 40 bp after trimming.
import glob
import os, sys
from Bio.SeqIO.QualityIO import FastqGeneralIterator
where = os.getcwd()
who = sys.argv[1]
for filename in glob.glob(os.path.join(where, who)):
    count = 0
    infile = open(filename)
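The preview above stops before the trimming logic, so the following is only a minimal sketch of what the description outlines, not the author's actual code. It assumes that a "Q15 tail" means the run of 3' bases with a Phred score below 15, and that qualities use the Phred+33 encoding written by CASAVA v1.8. The names `trim_low_quality_tail` and `filter_reads` are hypothetical. Records are `(title, seq, qual)` tuples, the shape that `FastqGeneralIterator` yields.

```python
def trim_low_quality_tail(seq, qual, min_q=15, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    Assumption: trimming walks in from the 3' end and stops at the
    first base that meets the threshold.
    """
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_reads(records, min_q=15, min_len=40):
    """Yield trimmed (title, seq, qual) tuples that keep at least min_len bases."""
    for title, seq, qual in records:
        seq, qual = trim_low_quality_tail(seq, qual, min_q)
        if len(seq) >= min_len:
            yield title, seq, qual
```

With Biopython installed, `filter_reads(FastqGeneralIterator(infile))` would stream a file through the same filter without holding every read in memory.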
jonesken / QC_for_PAL_finder.py
Created August 13, 2013 22:42
This script trims the low-quality tails from Illumina FASTQ data, removes sequences that are trimmed below 75 bp, and writes out the first 5 million read pairs that pass QC.
import os, sys
from Bio import SeqIO
from Bio.SeqIO.QualityIO import *
import itertools
filename1 = sys.argv[1] ## Read in $File from .SH script
good_pairs = 0
count = 0
filename2 = filename1[:filename1.find("_R1_")] + "_R2_" + filename1[filename1.find("_R1_")+4:] ## Build the R2 filename by swapping the first "_R1_" in filename1 for "_R2_"
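The preview ends before the QC loop, so the sketch below only illustrates the steps named in the description and is not the author's implementation. Several points are assumptions: the quality threshold is not stated, so `min_q` is a required argument; a pair is dropped when either mate falls below the length cutoff; and qualities are Phred+33. All function names are hypothetical. `mate_filename` also raises an error when `"_R1_"` is missing, whereas the slicing above would silently produce a wrong name because `find` returns -1.

```python
import itertools


def mate_filename(r1_name):
    """Swap the first "_R1_" for "_R2_"; fail loudly if it is absent."""
    head, sep, tail = r1_name.partition("_R1_")
    if not sep:
        raise ValueError('no "_R1_" in filename: ' + r1_name)
    return head + "_R2_" + tail


def trim_tail(seq, qual, min_q, offset=33):
    """Drop trailing bases whose Phred score is below min_q."""
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def good_pairs(reads1, reads2, min_q, min_len=75):
    """Yield trimmed pairs in which both mates keep at least min_len bases."""
    for (t1, s1, q1), (t2, s2, q2) in zip(reads1, reads2):
        s1, q1 = trim_tail(s1, q1, min_q)
        s2, q2 = trim_tail(s2, q2, min_q)
        if len(s1) >= min_len and len(s2) >= min_len:
            yield (t1, s1, q1), (t2, s2, q2)


def first_good_pairs(reads1, reads2, min_q, min_len=75, max_pairs=5_000_000):
    """Stop reading as soon as max_pairs passing pairs have been produced."""
    return itertools.islice(good_pairs(reads1, reads2, min_q, min_len), max_pairs)
```

Because `itertools.islice` is lazy, the input files are read only as far as needed to collect the requested number of pairs. Both inputs can be `FastqGeneralIterator` objects opened on `filename1` and `filename2`.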