@rhallPB
rhallPB / CCS_kinetics.sh
Created June 30, 2020 16:51
Running single molecule kinetic generation
# A few notes on running CCS with kinetics output, including native format and links
# for converting to a format compatible with previous tools.
# requires pbccs, available from bioconda https://anaconda.org/bioconda/pbccs
conda install -c bioconda pbccs
ccs <IN.subreads.bam> <OUT.ccs.bam> --mean-kinetics
# output ccs file will have the following extra tags:
rhallPB / DupReads.py
Created April 20, 2020 21:16
Single molecule base mods
#!/usr/bin/env python
# coding: utf-8
# In[173]:
import pysam
import argparse
import sys
import os.path
rhallPB / gist:34ce4a1cab72d861684928f618406863
Created June 20, 2019 18:55
add coverage stats to canu output
cat $fin | awk '{if (/^>/) {print $1"_"$4} else {print}}' | sed 's/Stat=/_/g'
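The one-liner above rewrites each FASTA header to its first and fourth whitespace-separated fields joined by an underscore, then replaces every `Stat=` with `_`. A minimal Python sketch of the same transformation follows; the function name and the example header are hypothetical and only illustrate the field layout the awk command assumes.

```python
def rename_canu_header(line: str) -> str:
    """Mimic: awk '{if (/^>/) {print $1"_"$4} else {print}}' | sed 's/Stat=/_/g'"""
    line = line.rstrip("\n")
    if line.startswith(">"):
        fields = line.split()  # awk splits on runs of whitespace by default
        fourth = fields[3] if len(fields) > 3 else ""  # awk yields "" for a missing field
        line = f"{fields[0]}_{fourth}"
    # sed runs on every line, not only on headers
    return line.replace("Stat=", "_")


if __name__ == "__main__":
    # Hypothetical header; real canu headers may order their fields differently.
    print(rename_canu_header(">tig00000001 len=4600000 reads=1200 covStat=350.10"))
```

Sequence lines pass through unchanged unless they happen to contain the literal text `Stat=`, which matches the behavior of the original pipeline.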
rhallPB / missedAdapter.sh
Created October 8, 2015 21:23
Find missing adapter reads
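#!/bin/bash
# $1 is the input FASTA; ${1%%.*} strips everything from the first dot onward.
# Reverse-complement every read, align the reads against their reverse complements,
# then parse the alignments for palindromic reads.
fastarevcomp "${1%%.*}.fasta" > "${1%%.*}_revc.fasta"
sdpMatcher "${1%%.*}.fasta" "${1%%.*}_revc.fasta" 10 -local > pal.out
~rhall/projects/MDA/findPal/parseSDP.pl pal.out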
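The script above relies on the fact that a read which ran through a missed adapter contains an insert followed by its own reverse complement, so it aligns to its reverse-complemented copy. The Python sketch below illustrates only that intuition with an exact-match check on toy sequences; it is not the local alignment that sdpMatcher performs, and both function names are hypothetical.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_palindrome(seq: str) -> bool:
    """True when a sequence equals its own reverse complement.

    Real reads carry sequencing errors and adapter remnants, so an exact
    comparison is only a toy stand-in for an alignment-based test.
    """
    return seq == reverse_complement(seq)


if __name__ == "__main__":
    insert = "AAC"
    missed_adapter_read = insert + reverse_complement(insert)  # forward pass, then reverse pass
    print(missed_adapter_read, is_perfect_palindrome(missed_adapter_read))
```

An exact comparison keeps the example short; detecting real missed adapters needs an error-tolerant alignment, which is what the sdpMatcher step provides.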
rhallPB / vec.pl
Last active August 31, 2015 20:39
Vector validation
#!/usr/bin/env perl
# First calculate the lengths of all the vectors with the shell command
# "fastalength D276_linearizedVectors.fa > vectorLengths.list"
# Then run this small perl script to subset the alignment and calculate consensus / variants.
while(<>){
split;
rhallPB / mothur bash
Created June 12, 2015 21:05
Mothur species id
### /home/UNIXHOME/asethuraman/projects/jgi/cami/cami_rDna
# Now we'd like to go ahead and classify all of our sequences from the different
# libraries and the mock community reference using the RDP, greengenes, and
# SILVA training sets.
mothur "#classify.seqs(fasta=rDna/cami_ROI.good.pick.filter.unique.precluster.fasta, reference=/home/UNIXHOME/asethuraman/projects/schloss/16S_Schloss/references/trainset10_082014.pds.fasta, taxonomy=/home/UNIXHOME/asethuraman/projects/schloss/16S_Schloss/references/trainset10_082014.pds.tax, processors=8);
classify.seqs(fasta=rDna/cami_ROI.good.pick.filter.unique.precluster.fasta, reference=/home/UNIXHOME/asethuraman/projects/schloss/16S_Schloss/references/gg_13_8_99.fasta, taxonomy=/home/UNIXHOME/asethuraman/projects/schloss/16S_Schloss/references/gg_13_8_99.gg.tax, processors=8);
rhallPB / getReads.pl
Created June 12, 2015 00:17
Get reads aligning to contig from sam file
#!/usr/bin/env perl
open FIN, $ARGV[0];
while(<FIN>){
if (/$ARGV[1]\t/){}
else{
split;
print $_[0]."\n";
rhallPB / RS_HGAP_Assembly_BAC.3.xml
Created June 4, 2015 21:27
common/protocols/RS_HGAP_Assembly_BAC.3.xml
<?xml version="1.0" encoding="utf-8"?><smrtpipeSettings>
<protocol id="RS_HGAP_Assembly.3" version="2.2.0" editable="false">
<application>De novo assembly</application>
<param name="name" label="Protocol Name">
<value>RS_HGAP_Assembly_3</value>
<input type="text"/>
<rule required="true"/>
</param>
<param name="description">
<value>(BETA) HGAP version 3. PacBio de novo assembler optimized for speed.</value>
rhallPB / PreAssemblerBacHGAP.3.xml
Created June 4, 2015 21:26
common/protocols/assembly/PreAssemblerBacHGAP.3.xml
<?xml version="1.0" ?>
<smrtpipeSettings>
<module id="P_PreAssemblerDagcon" label="PreAssembler v2" editableInJob="true">
<title>Using DAG-based consensus algorithm, pre-assemble long reads as the first step of the Hierarchical Genome Assembly process (HGAP). Version 2 is a stepping stone for scaling to much larger genomes.</title>
<param name="computeLengthCutoff" label="Compute Minimum Seed Read Length" editable="true">
<title>Specify whether or not to compute the minimum seed read length that results in at least 30X target genome coverage, by the longest subreads. This is based on the genome size you specified.</title>
<value>True</value>
<input type="checkbox" />
</param>
<param name="minLongReadLength" label="Minimum Seed Read Length">
rhallPB / P_PreAssemblerDagconBAC.py
Created June 4, 2015 19:27
analysis/lib/python2.7/SMRTpipe/modules/P_PreAssemblerDagconBAC.py
import os
import re
import logging
from SMRTpipe.engine.SmrtPipeFiles import (SMRTFile, SMRTDataFile,
SMRTReportFile, cmdLineInput,
SMRTJsonReportFile)
from SMRTpipe.engine.SmrtPipeTasks import task
from SMRTpipe.engine.DistributableTasks import (DistributableTask,
LocallyDistributableTask)