Peter Kruczkiewicz peterk87

  • Canadian Food Inspection Agency
  • Canada
@peterk87
peterk87 / multi_container_mulled_hash.py
Last active May 30, 2022 18:30
Get the "mulled-v2-{hash}" name for multi-package containers (https://github.com/BioContainers/multi-package-containers) from comma-delimited multi-package definitions (like those in the hash.tsv file)
#!/usr/bin/env python
import argparse
import hashlib
import sys
from collections import namedtuple
from typing import List, Dict, Tuple, Union
# from https://github.com/galaxyproject/galaxy/blob/f12ee5ce6d602cd4c8b4cfc2112988b84a4f255e/lib/galaxy/tool_util/deps/mulled/util.py#L185
Target = namedtuple("Target", ["package_name", "version", "build", "package"])
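The gist preview is cut off above, so the following is only a minimal sketch of how a "mulled-v2-{hash}" name could be derived from a comma-delimited definition. It assumes the scheme used by the Galaxy `util.py` linked in the comment above (SHA-1 of the sorted package names, plus SHA-1 of the matching versions when any are given) and assumes definitions look like `bwa=0.7.17,samtools=1.9`. The helper names `parse_targets` and `mulled_v2_name` are hypothetical and are not taken from the gist; check the result against the linked Galaxy source before relying on it.

```python
import hashlib
from collections import namedtuple

Target = namedtuple("Target", ["package_name", "version", "build", "package"])


def parse_targets(definition):
    """Parse 'pkg=version,pkg2=version2' into Target tuples (assumed format)."""
    targets = []
    for item in definition.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, version = item.partition("=")
        targets.append(Target(name, version or None, None, None))
    return targets


def mulled_v2_name(targets, image_build=None):
    """Sketch of the mulled v2 naming scheme for two or more packages."""
    # Sort by package name so the input order does not change the hash.
    ordered = sorted(targets, key=lambda t: t.package_name)
    names = "\n".join(t.package_name for t in ordered)
    package_hash = hashlib.sha1(names.encode()).hexdigest()

    # Versions are hashed only if at least one package specifies a version.
    version_hash = ""
    if any(t.version for t in ordered):
        versions = "\n".join(t.version or "null" for t in ordered)
        version_hash = hashlib.sha1(versions.encode()).hexdigest()

    if not image_build:
        build_suffix = ""
    elif version_hash:
        build_suffix = f"-{image_build}"
    else:
        build_suffix = str(image_build)

    suffix = f":{version_hash}{build_suffix}" if (version_hash or build_suffix) else ""
    return f"mulled-v2-{package_hash}{suffix}"


if __name__ == "__main__":
    print(mulled_v2_name(parse_targets("bwa=0.7.17,samtools=1.9")))
```

Because the targets are sorted before hashing, `samtools=1.9,bwa=0.7.17` and `bwa=0.7.17,samtools=1.9` yield the same name. Single-package containers follow a different naming path in Galaxy and are not handled here.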
@peterk87
peterk87 / viralrecon-2.3-mosdepth-bed-gz.config
Created February 9, 2022 18:26
nf-core/viralrecon v2.3 Mosdepth config to publish/output BED.GZ files
// nf-core/viralrecon v2.3
// specify output of bed.gz files produced by Mosdepth
process {
    withName: 'MOSDEPTH_GENOME' {
        ext.args = '--fast-mode'
        publishDir = [
            path: { "${params.outdir}/variants/bowtie2/mosdepth/genome" },
            mode: 'copy',
            pattern: "*.{summary.txt,bed.gz}"
        ]
@peterk87
peterk87 / MN908947.3-orf1a-orf1b-gene-split.gff
Created February 1, 2022 16:53
SARS-CoV-2 MN908947.3 modified GFF with ORF1ab split into ORF1a and ORF1b for proper SnpEff variant effect analysis of ORF1b
##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
#!genome-build ASM985889v3
#!genome-build-accession NCBI_Assembly:GCA_009858895.3
##sequence-region MN908947.3 1 29903
##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=2697049
MN908947.3	Genbank	region	1	29903	.	+	.	ID=MN908947.3:1..29903;Dbxref=taxon:2697049;collection-date=Dec-2019;country=China;gbkey=Src;genome=genomic;isolate=Wuhan-Hu-1;mol_type=genomic RNA;nat-host=Homo sapiens;old-name=Wuhan seafood market pneumonia virus
MN908947.3	Genbank	five_prime_UTR	1	265	.	+	.	ID=id-MN908947.3:1..265;gbkey=5'UTR
MN908947.3	Genbank	gene	266	13468	.	+	.	ID=gene-ORF1a;Name=ORF1a;gbkey=Gene;gene=ORF1a;gene_biotype=protein_coding
@peterk87
peterk87 / pangolin.config
Last active February 8, 2022 18:59
Nextflow DSL-2 config for the latest version of Pangolin
process {
    withName: PANGOLIN {
        container = 'quay.io/biocontainers/pangolin:3.1.20--pyhdfd78af_0'
    }
}
@peterk87
peterk87 / py-illumina-bam-subsampled-genome-coverage.ipynb
Created August 30, 2021 15:44
Jupyter Notebook for generating line plots of genome coverage from reads subsampled from input BAM files for different samples, i.e. what genome coverage would you get at a particular depth if you randomly subsampled 1 to X reads from a given sample?
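The notebook itself is not shown here, so the following is only a minimal sketch of the subsampling idea in pure Python. It assumes reads have already been reduced to 0-based, end-exclusive `(start, end)` intervals on a single reference; the function names and the fixed random seed are illustrative assumptions, not the notebook's code, and reading the BAM files is left out.

```python
import random


def coverage_fraction(reads, genome_length, min_depth=1):
    """Fraction of genome positions covered by at least min_depth reads."""
    depth = [0] * genome_length
    for start, end in reads:
        # Clip each interval to the genome bounds before counting.
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    return sum(d >= min_depth for d in depth) / genome_length


def subsampled_coverage(reads, genome_length, sample_sizes, min_depth=1, seed=42):
    """Genome coverage after randomly subsampling n reads, for each n."""
    rng = random.Random(seed)  # seeded so repeated runs give the same subsets
    reads = list(reads)
    results = {}
    for n in sample_sizes:
        subset = rng.sample(reads, min(n, len(reads)))
        results[n] = coverage_fraction(subset, genome_length, min_depth)
    return results


if __name__ == "__main__":
    toy_reads = [(0, 50), (50, 100), (25, 75)]
    print(subsampled_coverage(toy_reads, 100, sample_sizes=[1, 2, 3]))
```

Plotting the returned values against the sample sizes, one line per sample, would give a plot of the kind described above.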
@peterk87
peterk87 / rorate_contig.py
Created June 1, 2021 22:14
Rotate a circular contig to start at the same position as a reference sequence
#!/usr/bin/env python3
# coding: utf-8
import logging
import subprocess
from io import StringIO
from pathlib import Path
from typing import Optional
import sys
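The preview stops at the imports, so the rotation logic is not visible. As a rough illustration of the idea only, the sketch below rotates a circular sequence so that it begins where the reference begins. It swaps in naive exact string matching on the first bases of the reference, whereas the gist imports `subprocess` and may call an external tool. The function names and the `seed_length` parameter are hypothetical, and reverse-complement matches are not handled.

```python
def rotate_sequence(seq, start):
    """Rotate a circular sequence so that index `start` becomes position 0."""
    if not seq:
        return seq
    start %= len(seq)
    return seq[start:] + seq[:start]


def find_reference_start(contig, reference, seed_length=20):
    """Index in `contig` where the first bases of `reference` occur, or -1.

    The contig is extended by seed_length - 1 bases from its own start so
    that a match spanning the circular junction is still found.
    """
    seed = reference[:seed_length]
    if not seed:
        return -1
    doubled = contig + contig[: len(seed) - 1]
    return doubled.find(seed)


if __name__ == "__main__":
    reference = "ACGTTGCA"
    contig = "GTTGCAAC"  # the same circle, starting two bases later
    start = find_reference_start(contig, reference, seed_length=4)
    if start >= 0:
        print(rotate_sequence(contig, start))
```

Exact matching fails as soon as the contig differs from the reference within the seed, which is one reason a real implementation would likely rely on an aligner instead.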
@peterk87
peterk87 / viralrecon2-modules.config
Last active February 9, 2022 18:24
Custom Mosdepth modules config for nf-core/viralrecon v2.2 to output `.bed.gz` files. Does not work with v2.3.
params {
    modules {
        'illumina_mosdepth_genome' {
            args = '--fast-mode'
            publish_files = ['summary.txt':'', 'per-base.bed.gz':'', 'regions.bed.gz':'']
            publish_dir = 'variants/bowtie2/mosdepth/genome'
        }
        'illumina_mosdepth_amplicon' {
            args = '--fast-mode --use-median --thresholds 0,1,10,50,100,500'
            publish_files = ['summary.txt':'', 'per-base.bed.gz':'', 'regions.bed.gz':'']
@peterk87
peterk87 / simple_covplot.py
Created May 17, 2021 22:37
Create coverage plots for multiple genomes from "samtools depth" tabular outputs
#!/usr/bin/env python
from pathlib import Path
import logging
import typer
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.gridspec as gridspec
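Only the imports of the script are visible, so here is a small stdlib-only sketch of the input side. For a single BAM, `samtools depth` writes headerless, tab-separated rows of reference name, 1-based position, and depth. The helpers `read_depths` and `summarize` and the `low_threshold` parameter are hypothetical, and the plotting step (which the gist appears to do with seaborn and matplotlib) is omitted.

```python
import csv
import io
from collections import defaultdict


def read_depths(handle):
    """Group (position, depth) pairs by reference from `samtools depth` output."""
    depths = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        depths[row[0]].append((int(row[1]), int(row[2])))
    return dict(depths)


def summarize(depths, low_threshold=10):
    """Mean depth and number of positions below low_threshold per reference."""
    summary = {}
    for ref, rows in depths.items():
        values = [depth for _, depth in rows]
        summary[ref] = {
            "mean_depth": sum(values) / len(values),
            "low_positions": sum(v < low_threshold for v in values),
        }
    return summary


if __name__ == "__main__":
    example = io.StringIO("ref1\t1\t5\nref1\t2\t15\nref2\t1\t40\n")
    print(summarize(read_depths(example)))
```

Note that `samtools depth` omits zero-coverage positions unless run with `-a`, so the mean here is taken over reported positions only.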
@peterk87
peterk87 / py-ncbi-ftp-genbank-genomes-accessions-assemblies.ipynb
Created January 3, 2020 22:20
Access NCBI Genbank genome assemblies with Python
@peterk87
peterk87 / conda-bash-completions.sh
Created September 5, 2019 14:52
Basic Conda Bash completions
_conda(){
    # $2 is the word being completed; $3 is the word before it
    if (( ${#COMP_WORDS[@]} < 3 )); then
        COMPREPLY=( $(compgen -W "activate deactivate clean config create help info init install list package remove uninstall run search update upgrade env -h --help -V --version" -- "$2" ) );
    else
        if [[ $3 == "activate" ]]; then
            local cur="${COMP_WORDS[COMP_CWORD]}";
            local envs=( $(ls "$HOME/miniconda3/envs/") );
            COMPREPLY=( $(compgen -W "${envs[*]}" -- "$cur") );
        elif [[ $3 == "env" ]]; then
            local cur="${COMP_WORDS[COMP_CWORD]}";