Jean Elbers jelber2

## Running-PBJelly-example-with-indel-correction-with-BBMap-no-Pilon.txt
# goes along with http://seqanswers.com/forums/showthread.php?p=220925#post220925
#
# assumes you have PBJelly, blasr, tabix, bcftools, samtools installed
# below I am using a machine with 70 cores on a single node; adjust the core count to match your machine
# The scripts below are obviously not designed for use with a cluster, but can be modified
#
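Rather than hard-coding 70 cores, the core count can be detected at run time. This is a minimal sketch, not part of the original scripts; the function name `detect_threads` and the variable `THREADS` are hypothetical.

```shell
#!/usr/bin/env bash

# detect_threads: print the number of online CPU cores.
# Tries nproc (GNU coreutils) first and falls back to getconf.
detect_threads() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# THREADS is an illustrative variable name; pass it to each tool's thread option.
THREADS=$(detect_threads)
echo "Using ${THREADS} threads"
```

On a shared machine you may prefer to set `THREADS` by hand to something lower than the detected value.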
#########################
# STEP 1 Combine the FASTQ files and remove the originals to save space
#########################
## first combine files and delete the originals to save space
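The script body is cut off here, so the following is only a sketch of what "combine and delete the originals" could look like, under the assumption that the inputs are plain or gzip-compressed FASTQ files that can be concatenated directly. The function name `combine_fastq` and all file names are hypothetical. The originals are removed only after the combined file passes a line-count check.

```shell
#!/usr/bin/env bash

# combine_fastq OUT IN1 [IN2 ...]
# Concatenate the input files into OUT, then delete the inputs only if
# OUT has exactly as many lines as all inputs together.
combine_fastq() {
    local out=$1
    shift
    cat -- "$@" > "$out" || return 1

    local expected actual
    expected=$(cat -- "$@" | wc -l)
    actual=$(wc -l < "$out")

    if [ "$expected" -eq "$actual" ]; then
        rm -- "$@"
    else
        echo "line count mismatch; keeping originals" >&2
        return 1
    fi
}

# Hypothetical usage:
#   combine_fastq all_reads.fastq run1.fastq run2.fastq
```

Checking before deleting costs one extra read of the inputs, which seems a reasonable price when the originals are about to be removed for good.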

## pilon-runs-1-2.sh
#! /bin/bash

set -e

# installing fasta-splitter.pl
## wget http://kirill-kryukov.com/study/tools/fasta-splitter/files/fasta-splitter-0.2.6.zip
## unzip fasta-splitter-0.2.6.zip

# assumes initial genome to be error-corrected by pilon is called
## genome.pilon-0.fasta

## worm-adapter-trimming.md

      
Created August 10, 2022 10:34

Adapter trimming with BBDuk 38.82

module load bbtools/38.82
module load bcftools/1.14
bbduk.sh threads=8 \
in1=raw/170283.mate1.fastq.gz \
in2=raw/170283.mate2.fastq.gz \
out1=fastq/170283-trimmed.mate1.fastq.gz \
out2=fastq/170283-trimmed.mate2.fastq.gz \

  
## gist:451eec8c6b74617b8bf0532905f256c1
This was with https://zymo-files.s3.amazonaws.com/BioPool/ZymoBIOMICS.STD.refseq.v2.zip

RAW_SUP_Duplex                                         pg_asm_1x_corrected_SUP_duplex                         pg_asm_2x_corrected_SUP_duplex                         pg_asm_3x_corrected_SUP_duplex
Bacillus_subtilis                                      Bacillus_subtilis                                      Bacillus_subtilis                                      Bacillus_subtilis
# target bases: 4041255                                # target bases: 4041255                                # target bases: 4041255                                # target bases: 4041255
# target bases overlapping regions: 4041255 (100.00%)  # target bases overlapping regions: 4041255 (100.00%)  # target bases overlapping regions: 4041255 (100.00%)  # target bases overlapping regions: 4041255 (100.00%)
1159311 reference bases covered by exactly one contig  3791080 reference bases covered by exactly one contig  3642732 reference bases covered by exa

## README.md

      
Last active October 18, 2023 13:41

MethPhaser installation

Install micromamba, mamba, or conda
# Install micromamba
"${SHELL}" <(curl -L micro.mamba.pm/install.sh)
You will then see something like this in a Bash shell (the parts marked "(type....)" are added as instructions)

  
## README.txt
# output from best commit #fcdfa97 (https://github.com/google/best), .summary_identity_stats.csv files using reads
# aligned to concatenated chr20_MATERNAL and chr20_PATERNAL from hg002v1.0.1.fasta.gz (https://github.com/marbl/HG002) (https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/HG002/assemblies/hg002v1.0.1.fasta.gz)
# using mm2-fast commit # 10bde16 using settings: --eqx --secondary=no -Y -c -ax map-ont -k 19 -w 13 -t 48
# or using these settings for Illumina NextSeq2000 reads: -t 48 --eqx --secondary=no -acx sr
#
# brutal_rewrite (br) commit # ad87f92 (https://github.com/natir/br) using settings: -k 19 -m graph
# kmer read filter (kmrf) commit # 36cad24 (https://github.com/natir/kmrf) using setting: -k 17
# peregrine-2021 (pg_asm) commit # 6698eb1 (https://github.com/cschin/peregrine-2021): using default settings
#
# herro (herro) commit # c41dc30 (https://github.com/lbcb-sci/herro) using defaults and model at time of commit
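To keep the two alignment settings quoted above in one place, they can be wrapped in a small helper. This is a sketch only: the function name `aligner_opts` and the preset labels `ont` and `illumina` are my own, and the binary and file names in the usage comment are placeholders. The option strings themselves are copied from the notes.

```shell
#!/usr/bin/env bash

# aligner_opts PRESET
# Print the alignment options quoted in the notes above.
#   ont      -> the map-ont settings
#   illumina -> the Illumina NextSeq2000 settings
aligner_opts() {
    case "$1" in
        ont)      echo "--eqx --secondary=no -Y -c -ax map-ont -k 19 -w 13 -t 48" ;;
        illumina) echo "-t 48 --eqx --secondary=no -acx sr" ;;
        *)        echo "unknown preset: $1" >&2; return 1 ;;
    esac
}

# Hypothetical usage (binary name and file names are placeholders):
#   minimap2 $(aligner_opts ont) chr20.fasta reads.fastq > reads.sam
```

The command substitution in the usage line is left unquoted on purpose so that the shell splits the string into separate options.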

## shredBAM.jl
# earlier versions could generate output directly from BAM, but there seem to have been problems
# between versions of XAM; this version works so far for its intended purpose with https://github.com/brendanofallon/jovian
# it is also rather fast
#
#
# note: ignores quality scores at the moment - fixed in revision#4
# note2: outputs to STDOUT a SAM file without a header
# tested with julialang v1.10.2 and XAM v0.4.0
# $ julia shredBAM.jl input.bam 300 > test.sam.noheader
#
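The Julia source is not shown here, so the following only illustrates the general idea of shredding: cutting one sequence into consecutive fixed-length pieces. It assumes the `300` in the usage line is the piece length, which the notes do not confirm. It works on a bare sequence string rather than BAM records, ignores alignment fields and quality scores, and prints FASTA rather than SAM. The function name `shred_seq` and the `_N` suffix on piece names are hypothetical.

```shell
#!/usr/bin/env bash

# shred_seq NAME SEQUENCE SIZE
# Print SEQUENCE as consecutive pieces of at most SIZE bases in FASTA format.
# Pieces are named NAME_1, NAME_2, ...; the last piece may be shorter.
shred_seq() {
    awk -v name="$1" -v seq="$2" -v size="$3" 'BEGIN {
        n = 0
        for (i = 1; i <= length(seq); i += size) {
            n++
            printf ">%s_%d\n%s\n", name, n, substr(seq, i, size)
        }
    }'
}

# Example: a 10-base sequence cut into pieces of 4 gives pieces of 4, 4 and 2 bases.
shred_seq read1 ACGTACGTAC 4
```

Numbering the pieces in order is what would let a later stitching step put them back together, though the naming scheme the actual scripts use may differ.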

## stitch-fastq.jl
# tested on Julialang v1.10.2, DataStructures v0.18.16, and FASTX v2.1.4
# Usage
# $ julia stitch-fastq.jl chr20.herro.fasta.Q30.recal.shred.fastq > chr20.herro.fasta.Q30.recal.fastq
#
import Pkg; Pkg.add("FASTX")
import Pkg; Pkg.add("DataStructures")
using DataStructures
using FASTX

function process_fastq_file(filename::String)

## stitch-fasta.jl
# tested on Julialang v1.10.2, DataStructures v0.18.16, and FASTX v2.1.4
# Usage
# $ julia stitch-fasta.jl chr20.herro.fasta.Q30.recal.shred.fasta > chr20.herro.fasta.Q30.recal.fasta
#
import Pkg; Pkg.add("FASTX")
import Pkg; Pkg.add("DataStructures")
using DataStructures
using FASTX

function process_fasta_file(filename::String)

## PlotSequenceTime.jl
using XAM
using ArgParse
using DataFrames
using Dates
using Plots
using StatsBase
using Plots.PlotMeasures


function parse_commandline()