Sam Nicholls SamStudio8
import sys
THRESHOLD = 0.25 # reads must have 25% of their k-mers assigned
for line in sys.stdin:
    fields = line.strip().split()
    kmers_fields = fields[4:]
    total_kmers = sum([int(x.split(":")[1]) for x in kmers_fields])
    unassigned_kmers = sum([int(x.split(":")[1]) for x in kmers_fields if x[0] == "0"])
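The preview above stops before the threshold is applied. Below is a minimal sketch of how the comparison might look. It assumes whitespace-separated fields with `taxid:count` pairs from the fifth column onward and that taxid `0` means "unassigned". The function names `assigned_fraction` and `passes` are hypothetical and are not taken from the gist.

```python
THRESHOLD = 0.25  # same cut-off as the snippet above


def assigned_fraction(line):
    """Return the fraction of k-mers assigned to any taxon for one read line.

    Assumes `taxid:count` pairs from the fifth field onward, with taxid 0
    meaning "unassigned" (an assumption about the input layout).
    """
    pairs = [field.split(":") for field in line.strip().split()[4:]]
    total = sum(int(count) for _, count in pairs)
    unassigned = sum(int(count) for taxid, count in pairs if taxid == "0")
    # Guard against reads that carry no k-mer fields at all
    return (total - unassigned) / total if total else 0.0


def passes(line, threshold=THRESHOLD):
    """True if at least `threshold` of the read's k-mers were assigned."""
    return assigned_fraction(line) >= threshold
```

Whether the original script keeps reads at exactly 25% (`>=`) or requires strictly more is not visible in the preview; the inclusive comparison here is a guess.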
@SamStudio8
SamStudio8 / get_pion_signal.sh
Created October 29, 2018 12:06
get_pion_signal.sh
# A little bash script to download our juicy ONT PION data
# If this doesn't work for you, express your disappointment to @samstudio8.
# Use "EVEN" or "LOG" for $1, e.g. `bash get_pion_signal.sh EVEN`
MODE=$1
echo "Fetching signal blocks. Don't be afraid to CTRL+C and resume if needed..."
for i in {00..25}
do
    echo $i;
    wget -c https://nanopore.s3.climb.ac.uk/Zymo-PromethION-$MODE-BB-SN_signal.tar.$i;
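To make explicit which files the loop above requests, here is a small sketch that builds the same 26 URLs (blocks `00` to `25`) for a given mode. The helper name `signal_block_urls` and the `n_blocks` parameter are hypothetical. The URL pattern is copied from the `wget` line, and nothing here checks that the files still exist.

```python
def signal_block_urls(mode, n_blocks=26):
    """Build the block URLs that the shell loop iterates over (00..25)."""
    if mode not in ("EVEN", "LOG"):
        raise ValueError("mode must be 'EVEN' or 'LOG'")
    base = (
        "https://nanopore.s3.climb.ac.uk/"
        "Zymo-PromethION-{mode}-BB-SN_signal.tar.{i:02d}"
    )
    # {i:02d} zero-pads the index, matching bash's {00..25} expansion
    return [base.format(mode=mode, i=i) for i in range(n_blocks)]
```

Printing the list one URL per line gives an input file that a downloader with resume support could consume.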
@SamStudio8
SamStudio8 / base.Dockerfile
Last active August 2, 2019 09:40
Compile gpu-racon for the ONT PromethION with CUDA 9.0.176
# grab an off the shelf container with cuda9
FROM nvidia/cuda:9.0-devel-ubuntu16.04
# update gcc to gcc-6 as the default gcc-5 is too old
RUN apt-get update && apt-get install -y software-properties-common wget git
RUN add-apt-repository ppa:ubuntu-toolchain-r/test
RUN apt-get update && apt-get install -y gcc-6 g++-6
# update cmake as the default is too old
RUN wget -qO- "https://cmake.org/files/v3.15/cmake-3.15.1-Linux-x86_64.tar.gz" | tar --strip-components=1 -xz -C /usr/
library(tidyverse)
assemblies=read_tsv('kraken_summary.bond.tsv')
short_name <- c(
  "Bacillus subtilis" = "bs",
  "Cryptococcus neoformans" = "cn",
  "Enterococcus faecalis" = "ef",
  "Escherichia coli" = "ec",
  "Lactobacillus fermentum" = "lf",
@SamStudio8
SamStudio8 / shredder.py
Last active September 2, 2021 12:41
A very bad read generator
import argparse
import sys
import random
import numpy as np
import pysam
parser = argparse.ArgumentParser(description="A very very very bad read generator.")
parser.add_argument("shred", type=int)
parser.add_argument("fasta")
@SamStudio8
SamStudio8 / list_counties.sh
Last active March 23, 2020 13:18
Generate a list of UK authorities
rm counties.*
rm *csv
# England ######################################################################
curl -O -J -L https://www.registers.service.gov.uk/registers/local-authority-eng/download-csv
# https://www.datadictionary.nhs.uk/data_dictionary/nhs_business_definitions/l/local_authority_de.asp?
# A Local Authority, in relation to England is:
## a County Council
awk -F',' '$6=="CTY" {print $7}' local-authority-eng.csv >> counties.eng.ls
@SamStudio8
SamStudio8 / override_docker_workflow.sh
Created March 30, 2020 12:32
RAMPART + docker + gridion
#!/bin/bash
USER_ID=${LOCAL_USER_ID:-9001}
echo "starting with UID : $USER_ID"
echo "creating RAMPART user"
useradd --shell /bin/bash -u $USER_ID -o -c "" -m rampart
echo "raising RAMPART on $CLIENT $SERVER"
@SamStudio8
SamStudio8 / go.sh
Created June 16, 2020 15:37
building the kraken2-microbial database
# kraken2-microbial database
## Monday 3rd September 2018
## s.nicholls.1
KDB=$1
kraken2-build --download-taxonomy --threads 24 --db $KDB
kraken2-build --download-library archaea --db $KDB
kraken2-build --download-library bacteria --db $KDB
@SamStudio8
SamStudio8 / collect.nf
Created July 28, 2021 17:00
collect_nf
#!/usr/bin/env nextflow
params.testdir = "/cephfs/covid/software/sam/nf-test/"
params.fofn_single = [params.testdir, 'samples.csv'].join('/')
single_manifest_ch = Channel
    .fromPath(params.fofn_single) // open say, a CSV
    .splitCsv(header:true) // split text stream into CSV records
    .map { row -> file(row.path) } // coerce the record and return a file for each line
    .collect() // emit all items as one
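For readers less familiar with Nextflow channels, the chain above can be loosely mirrored in plain Python: parse a CSV with a header, pull the `path` column out of each record, and gather everything into a single list. This is only an analogy under the assumption that the manifest has a `path` column, as `row.path` suggests. The function name `collect_paths` is hypothetical.

```python
import csv
import io
from pathlib import Path


def collect_paths(csv_text):
    """Parse CSV text with a `path` column and return all paths as one list."""
    # DictReader uses the first row as the header, like splitCsv(header:true)
    reader = csv.DictReader(io.StringIO(csv_text))
    # One Path per record (the map step), gathered into one list (the collect step)
    return [Path(row["path"]) for row in reader]
```

The key point carried over from `.collect()` is that downstream code receives one item holding every file, rather than one item per file.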
@SamStudio8
SamStudio8 / django-unchecked-migration-pre-commit
Created January 21, 2022 12:14
Django unchecked migration pre-commit
#!/usr/bin/env python3
import sys
import subprocess
p = subprocess.Popen("git ls-files --others --exclude-standard", shell=True, stdout=subprocess.PIPE)
out, err = p.communicate()
migrations = []
for unchecked_rp in out.decode('UTF-8').split('\n'):
    if "migrations" in unchecked_rp:
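The preview ends at the `if`, so what the hook does with a match is not shown. As a sketch of the filtering step only, the check can be written as a pure function over the text that `git ls-files --others --exclude-standard` prints. The function name `untracked_migrations` is hypothetical. That a real hook would then exit non-zero to block the commit is an assumption about the rest of the script.

```python
def untracked_migrations(ls_files_output):
    """Return untracked paths that look like Django migrations.

    `ls_files_output` is the decoded stdout of
    `git ls-files --others --exclude-standard`, one path per line.
    """
    return [
        path
        for path in ls_files_output.split("\n")
        # Skip the empty string left by the trailing newline
        if path and "migrations" in path
    ]
```

Keeping the filter separate from the `subprocess` call makes it possible to test the matching logic without a git repository.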