Timothy R. Fallon, PhD photocyte
#! /bin/bash
## See also https://github.com/nextflow-io/nextflow/discussions/4308
## cd to the parent directory of a Nextflow pipeline execution, i.e. the one that contains the .nextflow and work directories
## Find work directories essential to the last pipeline run, as absolute paths
nextflow log last > /tmp/preserve_dirs.txt
## Find all work directories, as absolute paths
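## A minimal sketch of how the comparison could go from here, not the original script's code.
## It assumes task directories sit two levels below work/ (work/ab/cdef...), and the function
## name list_stale_dirs is hypothetical. It only lists candidates and deletes nothing.

```shell
# list_stale_dirs WORK_DIR PRESERVE_FILE
# Prints task directories under WORK_DIR that are not listed in PRESERVE_FILE.
# Assumption: task directories are exactly two levels below WORK_DIR.
list_stale_dirs() {
    local work_dir=$1 preserve_file=$2 keep_sorted
    keep_sorted=$(mktemp)
    # comm needs both inputs sorted with the same collation, hence LC_ALL=C throughout
    LC_ALL=C sort -u "$preserve_file" > "$keep_sorted"
    find "$work_dir" -mindepth 2 -maxdepth 2 -type d \
        | LC_ALL=C sort \
        | LC_ALL=C comm -23 - "$keep_sorted"   # keep lines unique to the find output
    rm -f "$keep_sorted"
}
```

## Possible usage: list_stale_dirs "$PWD/work" /tmp/preserve_dirs.txt
## Passing an absolute WORK_DIR keeps the output comparable with the absolute paths in the preserve list.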
>R1_readthrough
AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC[NACTG]{7,9}ATCTCGTATGCCGTCTTCTGCTTG[A]{5,7}
>R2_readthrough
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT[A]{5,7}
>Illumina Single End Adapter 1
ACACTCTTTCCCTACACGACGCTGTTCCATCT
>Illumina Single End Adapter 2
CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
>Illumina Single End PCR Primer 1
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
>Illumina Single End PCR Primer 2
CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
>Illumina Single End Sequencing Primer
ACACTCTTTCCCTACACGACGCTCTTCCGATCT
##No warranty expressed or implied. Hackish code to use Seqkit to "rotate" circular genomes.
##Intended for use with Quiver, or other polishing tools like Pilon, to ensure that
##there isn't a coverage drop at the end of the circular sequence which prevents calling of the true consensus bases
##Graphmap is the only aligner that I'm aware of that properly deals with circular references!
FASTA=$1
NAME=$(basename "$FASTA")
##Column 5 of the seqkit stat table is sum_len; strip the thousands separators
LENGTH=$(seqkit stat "$FASTA" | grep "FASTA" | tr -s " " | cut -f 5 -d " " | tr -d ",")
HALF=$(( LENGTH / 2 ))
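##The rest of this script is not shown here. Below is a minimal sketch of the rotation step itself,
##assuming a single-record FASTA. It swaps in plain awk for Seqkit so it runs without extra tools;
##rotate_fasta is a hypothetical name, not something from the original.

```shell
# rotate_fasta FASTA_FILE OFFSET
# Prints the record with its first OFFSET bases moved to the end, so the
# original start/end junction lands in the interior of the sequence.
# Assumption: FASTA_FILE holds exactly one record.
rotate_fasta() {
    awk -v k="$2" '
        /^>/ { header = $0; next }      # remember the header line
        { seq = seq $0 }                # join wrapped sequence lines
        END {
            print header
            print substr(seq, k + 1) substr(seq, 1, k)
        }' "$1"
}
```

##Possible usage: rotate_fasta "$FASTA" "$HALF" > "rotated_$NAME" (the rotated_ prefix is only an illustration).
##Rotating by half the length puts the old ends where polishing coverage should be deepest.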
##Abusing file descriptors to have graphmap stream/pipe the .sam output into samtools for filtering and compression to .bam
/lab/solexa_weng/testtube/graphmap/bin/Linux-x64/graphmap align -r mito_reference.fasta -d raw_reads.fasta -P -o /dev/fd/3 3>&1 1>&2 | samtools view -b -h -F 4 > aligned_reads.bam
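##A small self-contained sketch of the same file-descriptor trick, with a hypothetical stand-in
##(noisy_tool) in place of graphmap and tr in place of samtools. Redirections are applied left to
##right after the pipe is set up: 3>&1 points fd 3 at the pipe, then 1>&2 sends the tool's normal
##stdout chatter to stderr, so only what is written to /dev/fd/3 reaches the next command.
##This relies on the OS providing /dev/fd, as Linux does.

```shell
# Hypothetical stand-in for a tool that logs to stdout and writes results to the path in $1
noisy_tool() {
    echo "progress: working"            # log chatter on stdout
    printf 'read1\nread2\n' > "$1"      # real output goes to the -o style path
}

# Stream the "file" output into the pipe while the log chatter goes to stderr
stream_output() {
    noisy_tool /dev/fd/3 3>&1 1>&2 | tr 'a-z' 'A-Z'
}
```

##Running stream_output 2>/dev/null shows that only the data written to /dev/fd/3 travels down the pipe.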
##Total hits
cat full_table_*.tsv | grep -v "Missing" | grep -v "#" | cut -f 3 | sort | uniq -c
##Hits broken down by complete/fragmented/duplicated status
cat full_table_*.tsv | grep -v "Missing" | grep -v "#" | sort -k 2,3 | cut -f 2,3 | uniq -c
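##A single-pass awk sketch of the status breakdown, assuming the BUSCO full-table layout of
##column 2 = status and column 3 = contig. It is stricter than the greps above: it skips only
##lines that start with "#" or whose status column is exactly "Missing". The function name
##count_hits_by_status is hypothetical.

```shell
# count_hits_by_status TABLE...
# Prints "count<TAB>status<TAB>contig" for every non-missing hit, sorted by status then contig.
count_hits_by_status() {
    awk -F '\t' '
        /^#/ || $2 == "Missing" { next }    # drop comment lines and missing BUSCOs
        { n[$2 "\t" $3]++ }                 # tally each status/contig pair
        END { for (k in n) print n[k] "\t" k }' "$@" \
        | LC_ALL=C sort -k2,3
}
```

##Possible usage: count_hits_by_status full_table_*.tsv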
CREATE DATABASE orthomcl;
CREATE USER 'orthomcl'@'localhost' IDENTIFIED BY 'orthomcl!';
GRANT ALL PRIVILEGES ON orthomcl.* TO 'orthomcl'@'localhost';
grep -v -f <(cat full_table_eukaryota_odb9.tsv | grep -v "#" | sort -u -k 3,3) full_table_eukaryota_odb9.tsv
zcat Athal_to_Goffic_blastp_out6.tsv.gz | grep -P -v "^#" | sort -u -nr -k12,12 -k1,1 | sort -u -k1,1
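## The double sort above keeps one line per query, relying on sort -u retaining the first line of
## each run of equal keys. A more explicit awk sketch of the same idea follows, assuming BLAST
## tabular output with the query in column 1 and the bitscore in column 12. On ties it keeps the
## first hit seen, which may differ from the sort-based version. best_hit_per_query is a hypothetical name.

```shell
# best_hit_per_query [FILE...]
# Reads BLAST outfmt 6 lines (stdin if no FILE) and prints the highest-bitscore hit per query.
best_hit_per_query() {
    awk -F '\t' '
        /^#/ { next }                               # skip comment lines
        !($1 in best) || $12 + 0 > score[$1] {      # first hit, or a strictly better bitscore
            best[$1] = $0
            score[$1] = $12 + 0
        }
        END { for (q in best) print best[q] }' "$@" \
        | LC_ALL=C sort -t "$(printf '\t')" -k1,1
}
```

## Possible usage: zcat Athal_to_Goffic_blastp_out6.tsv.gz | best_hit_per_query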