Matt MacManes macmanes

@macmanes
macmanes / sim.par
Created June 25, 2013 16:22
.par file used to simulate reads
################################################File locations
REF_FILE_NAME Mus.gtf
GEN_DIR genomes/mus/
###################################################Expression
NB_MOLECULES 5000000
TSS_MEAN 50
POLYA_SCALE 100
@macmanes
macmanes / fix.seq.sh
Created June 25, 2013 16:25
Code to remove reads whose quality string (QUAL) and sequence (SEQ) differ in length
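#!/bin/bash
# Keep only FASTQ records whose quality string is as long as the sequence.
# Records are four lines each: a[1]=header, a[2]=sequence, a[3]='+' line, a[0]=quality.
awk 'BEGIN{OFS="\n"} {
a[NR % 4] = $0;
if(NR % 4 == 0 && length(a[2]) == length(a[0])){
print a[1],a[2],a[3],a[0]
}
}' test.fastq > sim.fastq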
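The filter above can be tried on a tiny input before running it on real data. This is a minimal sketch, assuming strict four-line FASTQ records; `filter_fastq` and `demo_fastq` are hypothetical helper names, and the two reads are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: the same awk filter, reading FASTQ from stdin.
filter_fastq() {
  awk 'BEGIN{OFS="\n"} {
    a[NR % 4] = $0
    if (NR % 4 == 0 && length(a[2]) == length(a[0])) {
      print a[1], a[2], a[3], a[0]
    }
  }'
}

# Two made-up records: read2 has a quality string shorter than its sequence.
demo_fastq() {
  printf '%s\n' '@read1' 'ACGT' '+' 'IIII' '@read2' 'ACGTA' '+' 'III'
}

# Only read1 should survive the filter.
demo_fastq | filter_fastq
```

Reading from stdin keeps the helper usable in a pipeline, for example `filter_fastq < test.fastq > sim.fastq`.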
@macmanes
macmanes / allp.ec.sh
Created June 25, 2013 16:26
Run AllPathsLG error correction
#!/bin/bash
~/ErrorCorrectReads.pl \
MAX_MEMORY_GB=30 THREADS=8 PHRED_ENCODING=33 PAIRED_READS_A_IN= \
PAIRED_READS_B_IN= UNPAIRED_READS_IN=sim.fastq \
FE_MAX_KMER_FREQ_TO_MARK=0 EC_K=24 HAPLOIDIFY=True FILL_FRAGMENTS=False \
FF_K=28 FE_USE_KMER_SPECTRUM=TRUE READS_OUT=corr
@macmanes
macmanes / rept.config.txt
Created June 25, 2013 16:28
config file for running Reptile
InFaFile data/both.fa
IQFile data/both.q
OErrFile data/both.reptile.err
QFlag 1
IFlag 1
BatchSize 5000000
KmerLen 13
hd_max 1
@macmanes
macmanes / trin.sh
Last active December 18, 2015 23:09
Code for running Trinity
#!/bin/bash
~/trinityrnaseq_r2013-02-25/Trinity.pl --seqType fq --JM 30G \
--left left.fastq --right right.fastq --full_cleanup --CPU 8
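Before a paired-end run like this one, it can help to confirm that the left and right files hold the same number of reads. The following is only a sketch under the assumption that both files use four-line FASTQ records; `count_reads` and `pairs_match` are hypothetical helpers, not part of Trinity.

```shell
#!/bin/bash
# Hypothetical helper: number of records in a four-line-per-record FASTQ file.
count_reads() {
  awk 'END { print int(NR / 4) }' "$1"
}

# Hypothetical helper: succeed only if both files hold the same number of reads.
pairs_match() {
  [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}

# File names taken from the Trinity command; skipped if the files are absent.
if [ -f left.fastq ] && [ -f right.fastq ]; then
  if pairs_match left.fastq right.fastq; then
    echo "read counts match"
  else
    echo "read counts differ" >&2
  fi
fi
```

This compares counts only; it does not check that read names are in the same order in both files.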
@macmanes
macmanes / rept.pipeline.sh
Created June 27, 2013 17:57
Commands for producing error-corrected reads with Reptile.
#!/bin/bash
#Commands for producing error corrected reads
#Trimmomatic available at: http://www.usadellab.org/cms/index.php?page=trimmomatic
#Reptile available at: http://aluru-sun.ece.iastate.edu/doku.php?id=reptile
#Trinity available at: http://trinityrnaseq.sourceforge.net/
######Trim reads with Trimmomatic
java -jar -Xmx10g trimmomatic-0.30.jar PE -phred33 -threads 32 \
@macmanes
macmanes / loop
Created October 1, 2013 12:38
makefile loop
10M.2.Trinity.fasta 10M.5.Trinity.fasta 10M.10.Trinity.fasta 10M.20.Trinity.fasta raw.10M.Trinity.fasta \
10M.2.Trinity.fasta.pslx 10M.5.Trinity.fasta.pslx 10M.10.Trinity.fasta.pslx 10M.20.Trinity.fasta.pslx raw.10M.Trinity.fasta.pslx \
10M.2.Trinity.fasta.pep 10M.5.Trinity.fasta.pep 10M.10.Trinity.fasta.pep 10M.20.Trinity.fasta.pep raw.10M.Trinity.fasta.pep \
10M.2.xprs 10M.5.xprs 10M.10.xprs 10M.20.xprs raw.10M.xprs: 10M.left.2.fq 10M.left.5.fq 10M.left.10.fq 10M.left.20.fq 10M.right.2.fq 10M.right.5.fq 10M.right.10.fq 10M.right.20.fq
for TRIM in 20 2 5 10 0; do \
$(TRINITY)/Trinity.pl --full_cleanup --min_kmer_cov 1 --seqType fq --JM $(MEM)G --bflyHeapSpaceMax $(MEM)G \
--left 10M.left.$$TRIM.fq --right 10M.right.$$TRIM.fq --group_pairs_distance 999 --CPU $(CPU) --output 10M.$$TRIM; \
##FL Reconstruction
$(TRINITY)/Analysis/FL_reconstruction_analysis/FL_trans_analysis_pipeline.pl --target $(MUS) --query 10M.$$TRIM.Trinity.fasta; rm *maps *selected *summary; \
##ORF ID
##Input
##n n:500 n:N50 min N80 N50 N20 E-size max sum name
##34 34 9 18562 111345 159114 321119 198500 406529 4313902 test.fa
##
##gm_es.pl test.fa --BP OFF > GenemarkES.log 2>&1
running hmm2nt.a2
74 files IN
Clusters were defined as:
#Using Fastool downloaded with the most recent version of Trinity.
fastool --rev --illumina-trinity --to-fasta pero_left.2.fastq >> left.fa
Sequences parsed: 164311463
#Some reads are missing the /1 suffix that --illumina-trinity is supposed to append, for example:
grep HSQ-7001360:67:H88RHADXX:1:1101:10000:20192 left.fa
>HSQ-7001360:67:H88RHADXX:1:1101:10000:20192
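To gauge how many headers are affected, one option is to count FASTA header lines that do not end in `/1`. A minimal sketch, assuming headers start with `>` and the expected suffix is the literal `/1`; `count_missing_suffix` is a hypothetical helper name, not a fastool feature.

```shell
#!/bin/bash
# Hypothetical helper: count FASTA headers that do not end in "/1".
count_missing_suffix() {
  awk '/^>/ && !/\/1$/ { n++ } END { print n + 0 }' "$1"
}

# Run on left.fa only if it exists in the current directory.
if [ -f left.fa ]; then
  count_missing_suffix left.fa
fi
```

A result of 0 would mean every header carries the suffix; any other number is the count of headers like the one shown by the grep.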
wget -O trinityrnaseq_r20140413p1.tar.gz "http://downloads.sourceforge.net/project/trinityrnaseq/trinityrnaseq_r20140413p1.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Ftrinityrnaseq%2Ffiles%2F&ts=1400703099&use_mirror=superb-dca2"
tar -zxf trinityrnaseq_r20140413p1.tar.gz
cd trinityrnaseq_r20140413p1
make