@tkrahn
tkrahn / bam2fastq_new.sh
Created April 13, 2022 16:11
A simple way to create FastQ files from a paired-end BAM file
#!/bin/bash
clear
echo "Recovering FASTQ reads from a BAM file"
YSEQ_ID=${PWD##*/}
NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
echo "We can use ${NUM_THREADS} threads."
tkrahn / YSTR_hg38.bed
Created April 27, 2021 13:24
How to extract YSTR markers from WGS400 using Last and tandem-genotypes
chrY 24005952 24006099 CCTT CDY 36
chrY 3284190 3284260 TATTT DXYS156 12
chrY 18278507 18278547 AAC DYF395 15
chrY 22022402 22022434 AAT DYF397 11
chrY 22950279 22950378 AAAG DYF399 24
chrY 23739648 23739707 GAAA DYF401 15
chrY 21681709 21681748 TATC DYF406S1 10
chrY 17912087 17912121 ATAG DYF408 8
chrY 18486683 18486740 AAAGG DYF411 11
chrY 20731419 20731492 CTTT DYR1 17
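The rows above look like BED-style records: chromosome, start, end, repeat motif, marker name, and a final number whose meaning the preview does not state (presumably a repeat count). As a minimal sketch under that assumption, the following prints each marker with its motif, the length of the interval (end minus start), and the last column. The function name `ystr_summary` is hypothetical and not part of the gist.

```shell
#!/bin/bash
# Hypothetical helper (not from the gist): summarize rows of a YSTR BED-like file.
# Assumed columns: chrom, start, end, motif, marker name, number (meaning not stated in the source).
ystr_summary() {
  # Print: marker name, motif, interval length (end - start), last column
  awk 'NF >= 6 { printf "%s\t%s\t%d\t%s\n", $5, $4, $3 - $2, $6 }' "$1"
}

# Usage with two rows copied from the listing above
tmp_bed=$(mktemp)
printf '%s\n' \
  "chrY 24005952 24006099 CCTT CDY 36" \
  "chrY 3284190 3284260 TATTT DXYS156 12" > "$tmp_bed"
ystr_summary "$tmp_bed"
rm -f "$tmp_bed"
```

awk's default field splitting accepts both tabs and spaces, so the sketch works whether or not the original file is tab-delimited.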
tkrahn / identityResolutionTemplateCreator.py
Created July 27, 2020 18:35
Convert a novel SNP VCF to a usable CSV table for manually checking novel SNPs
# -*- coding: utf-8 -*-
"""
Created on Wed Feb 12 14:13:00 2020
@author: hunte
"""
import sys
def createBED(pos, bedFile):
tkrahn / bwa_hg38_pipeline_release.sh
Created July 27, 2020 18:25
The YSEQ WGS processing pipeline we currently use to analyze FastQ files.
#!/bin/bash
START=$(date +%s.%N)
clear
# setup parameters
YSEQID=${PWD##*/}
# YSEQID="1234" # (the above command simply gets the name of the last segment of the current working directory)
NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
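The `${PWD##*/}` expansion used for the sample ID can be tried in isolation. A minimal sketch, assuming a made-up path; `last_segment` is a hypothetical helper that applies the same expansion to any string:

```shell
#!/bin/bash
# last_segment is a hypothetical helper that mirrors ${PWD##*/} for an arbitrary path.
last_segment() {
  local path="$1"
  # "##*/" strips the longest prefix matching "*/", leaving the text after the final slash
  printf '%s\n' "${path##*/}"
}

last_segment "/data/wgs/1234"   # the path is made up for illustration
```

A path that ends in a slash yields an empty string with this expansion. `$PWD` normally has no trailing slash, so the scripts above are not affected.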
tkrahn / 23andMe_hg38_pipeline.sh
Created May 2, 2020 00:40
Generating a (hg19) 23andMe file from a hg38 BAM file using CrossMap
#!/bin/bash
START=$(date +%s.%N)
clear
# setup parameters
YSEQID=${PWD##*/}
# YSEQID="1234" # (the above command simply gets the name of the last segment of the current working directory)
NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
echo "We can use ${NUM_THREADS} threads."
tkrahn / cladefinder_json.html
Created April 22, 2020 14:08
Here is a form to submit a request to YSEQ Cladefinder to get a JSON Object as a response
<!DOCTYPE html>
<html>
<head>
</head>
<body>
Here is a form to submit a request to YSEQ Cladefinder to get a JSON Object as a response:<br>
<form action="https://cladefinder.yseq.net/json.php" method="post">
<label for="input">SNP String:</label><br>
<input type="text" name="input" value="L21+, DF13-"><br>
<label for="json">Options:</label><br>
tkrahn / cladefinder_iframe.html
Created April 22, 2020 09:42
Example of embedding YSEQ Cladefinder in an iframe
<html>
<head>
<style>
#cladefinder {
height: 100px;
width: 300px;
overflow-x: auto;
overflow-y: hidden;
resize: both;
position: relative;
tkrahn / link2cladefinder.html
Created April 22, 2020 08:57
Example of how to link to YSEQ Cladefinder
<html>
<body>
Here is a Link to <a href="https://cladefinder.yseq.net/interactive_tree.php?snps=L21%2B">L21</a><br>
The "Percent 2B" represents the encoding for the plus sign.
</body>
</html>
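The plus sign has to be percent-encoded because a literal `+` in a query string is commonly decoded as a space. A minimal sketch of building such a link from a SNP string in bash follows; `urlencode` is a hypothetical helper, not something provided by Cladefinder, and it is only meant for ASCII input:

```shell
#!/bin/bash
# urlencode is a hypothetical helper: percent-encode every character except
# the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
# Assumption: the input is plain ASCII.
urlencode() {
  local LC_ALL=C          # keep ranges such as a-z limited to ASCII
  local s="$1" out="" c i
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      # printf with a leading quote yields the character's numeric code
      *) out+=$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s\n' "$out"
}

snps="L21+"
echo "https://cladefinder.yseq.net/interactive_tree.php?snps=$(urlencode "$snps")"
```

For the input `L21+` this reproduces the `L21%2B` link shown in the HTML above.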
tkrahn / parallelMpileup.sh
Created April 7, 2020 01:19
Parallel SNP calling by chromosome
#!/bin/bash
NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
REF="MyReference.fa"
BAMFILE_SORTED="MySortedAndIndexed.bam"
VCF_FILE="My.vcf"
# Parallel SNP calling by chromosome
bcftools mpileup -r chr1 -Ou -C 50 -f $REF $BAMFILE_SORTED | bcftools call -O z --threads $NUM_THREADS -v -V indels -m -P 0 > chr1_${VCF_FILE}.gz &
bcftools mpileup -r chr2 -Ou -C 50 -f $REF $BAMFILE_SORTED | bcftools call -O z --threads $NUM_THREADS -v -V indels -m -P 0 > chr2_${VCF_FILE}.gz &
bcftools mpileup -r chr3 -Ou -C 50 -f $REF $BAMFILE_SORTED | bcftools call -O z --threads $NUM_THREADS -v -V indels -m -P 0 > chr3_${VCF_FILE}.gz &
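The per-chromosome lines differ only in the region name, so the same pattern can be generated with a loop. The following is a sketch, not the author's script: it reuses the exact bcftools options shown above, assumes hg38-style names `chr1` to `chr22`, `chrX` and `chrY`, and only starts the jobs when the hypothetical variable `RUN_CALLING` is set to 1. The closing `wait` is an addition that blocks until all background jobs have finished.

```shell
#!/bin/bash
# Sketch only: one "mpileup | call" pipeline per chromosome, started from a loop.
NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
REF="MyReference.fa"
BAMFILE_SORTED="MySortedAndIndexed.bam"
VCF_FILE="My.vcf"

# Assumption: hg38-style chromosome names
CHROMS=()
for i in $(seq 1 22); do CHROMS+=("chr${i}"); done
CHROMS+=("chrX" "chrY")

# Same options as the explicit per-chromosome lines above
call_chromosome() {
  local chrom="$1"
  bcftools mpileup -r "$chrom" -Ou -C 50 -f "$REF" "$BAMFILE_SORTED" |
    bcftools call -O z --threads "$NUM_THREADS" -v -V indels -m -P 0 > "${chrom}_${VCF_FILE}.gz"
}

# RUN_CALLING is a hypothetical switch so the sketch can be read and sourced safely
if [ "${RUN_CALLING:-0}" = "1" ]; then
  for chrom in "${CHROMS[@]}"; do
    call_chromosome "$chrom" &   # run each chromosome in the background
  done
  wait                           # block until every background job has finished
else
  printf 'would write %s\n' "${CHROMS[@]/%/_${VCF_FILE}.gz}"
fi
```

Starting all chromosomes at once, each with `--threads $NUM_THREADS`, can oversubscribe the CPU on smaller machines. Capping the number of concurrent jobs is a possible refinement that this sketch leaves out.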
tkrahn / minimap2_hg38_pipeline.sh
Last active September 22, 2019 22:04
This is my (simplified) pipeline to map Nanopore FastQ reads to a reference genome (here hg38). Note that this is certainly not the best way to use Nanopore results. It's just a quick check of how the results look.
#!/bin/bash
START=$(date +%s.%N)
clear
# setup parameters
YSEQID=${PWD##*/}
# YSEQID="1234" # (the above command simply gets the name of the last segment of the current working directory)
NUM_THREADS=$(getconf _NPROCESSORS_ONLN)
echo "We can use ${NUM_THREADS} threads."
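Several of the pipelines above record `START=$(date +%s.%N)` as their first step. The previews end before the matching end-of-run code, so the following is only a sketch of how such a timestamp could be turned into an elapsed time. `elapsed_seconds` is a hypothetical helper, and `%N` (nanoseconds) assumes GNU date.

```shell
#!/bin/bash
# elapsed_seconds is a hypothetical helper: difference of two "seconds.fraction" timestamps.
elapsed_seconds() {
  # Bash arithmetic is integer-only, so awk does the fractional subtraction
  awk -v s="$1" -v e="$2" 'BEGIN { printf "%.2f\n", e - s }'
}

START=$(date +%s.%N)   # seconds.nanoseconds since the epoch (GNU date)
sleep 0.2              # stand-in for the real pipeline steps
END=$(date +%s.%N)

echo "Finished in $(elapsed_seconds "$START" "$END") seconds."
```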