Andrew Perry pansapiens

  • Melbourne, Australia
@pansapiens
pansapiens / bam2fq_softclip.py
Created June 13, 2024 03:22
Extract mapped reads from BAM, lowercase masking soft-clipping
#!/usr/bin/env python
###
# bam2fq_softclip.py
###
# A script for exploring soft-clipping of aligned reads.
# Extracts mapped reads from a BAM file as a FASTQ, but encodes 'soft-clipped' regions
# as lowercase. Soft-clipped regions can be quickly visualized in the terminal like:
#
# ./bam2fq_softclip.py aligned.bam | grep --color=always '[atcg]' | less -R
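
The preview above stops before the script body, so the following is only a minimal sketch of the lowercase-masking idea, not the gist's actual code. It assumes the CIGAR has already been parsed into `(op, length)` pairs using SAM operation letters; the function name `mask_softclips` is hypothetical.

```python
def mask_softclips(seq, cigar):
    """Lowercase the soft-clipped parts of a read sequence.

    `cigar` is a list of (op, length) pairs using SAM operation letters
    (an assumed input format for this sketch).
    """
    # Operations that consume bases of the read: M, I, S, = and X
    consumes_query = set("MIS=X")
    out = []
    pos = 0
    for op, length in cigar:
        if op not in consumes_query:
            continue  # D, N, H and P do not advance along the read
        chunk = seq[pos:pos + length]
        # Soft-clipped bases become lowercase, everything else uppercase
        out.append(chunk.lower() if op == "S" else chunk.upper())
        pos += length
    return "".join(out)


masked = mask_softclips("ACGTACGTAC", [("S", 3), ("M", 5), ("S", 2)])
print(masked)  # acgTACGTac
```

With the read masked this way, the `grep` for lowercase bases shown above highlights exactly the clipped ends.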
@pansapiens
pansapiens / .a_README.md
Last active June 13, 2024 03:25
schollz/hostyoself docker-compose
@pansapiens
pansapiens / find_adapters.sh
Created April 20, 2024 10:33
Find likely adapter sequences in a set of paired end FASTQs
#!/bin/bash
# Given a directory of fastq files, use bbmerge to find adapter sequences for every sample,
# align and find the consensus.
# Final consensus is in r1_adapter.consensus.fa and r2_adapter.consensus.fa - trim by hand
# if the automatic trimming up to the first 'n' isn't sensible
# Requires: bbmerge.sh (bbmap), muscle, emboss (cons)
@pansapiens
pansapiens / R-kill-vscode-remote.sh
Last active April 29, 2024 06:02
tmux persistent R session for vscode-R
#!/bin/bash
R_TMUX_SESSION_NAME="${R_TMUX_SESSION_NAME:-vscode-r}"
TMUX_CMD="$(command -v tmux)"
# Reuse the resolved path instead of looking tmux up a second time
if ! [ -x "$TMUX_CMD" ]; then
    echo "Error: tmux is not installed." >&2
    exit 1
fi
@pansapiens
pansapiens / README.md
Created November 9, 2023 03:53
Gzip compression heatmap

Gzip compression heatmap

Generates a heatmap plot showing the compression ratio of different blocks through a file.

Shows interesting patterns in FASTQ files, and may be useful for diagnosing pathological or unusual data. I find the zscore transformation makes the plot more informative.

Run like:

./compression_heatmap.py -b 1048576 -c rainbow -t zscore SRR11794587_2.fastq.gz
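
The script itself is not shown here, so the following is only a rough sketch of the per-block idea under some assumptions: that `-b` is a block size in bytes, that each block is compressed independently, and that `-t zscore` standardizes the resulting ratios. The function names `block_ratios` and `zscores` are hypothetical, and plotting is left out.

```python
import zlib
from statistics import mean, pstdev


def block_ratios(data, block_size):
    """Compressed/uncompressed size ratio for each fixed-size block."""
    ratios = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        ratios.append(len(zlib.compress(block)) / len(block))
    return ratios


def zscores(values):
    """Standardize values; returns zeros if there is no variation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy input: a highly repetitive region followed by a more varied one
data = b"ACGT" * 4096 + bytes(range(256)) * 64
ratios = block_ratios(data, 4096)
z = zscores(ratios)
```

Blocks that compress unusually well or badly relative to the rest of the file stand out as large negative or positive z-scores, which is presumably why the transformed plot is easier to read.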
@pansapiens
pansapiens / deinterleave_mgi_lanes.sh
Created July 25, 2023 07:10
Deinterleave MGI FASTQ lanes
#!/bin/bash
# This script deinterleaves a FASTQ file generated by an MGI sequencer with two flowcell lanes
# where headers are in the format: @v300009551L1C002R003000000/1 with L1 or L2 indicating the lane
# This can be useful for some analyses since each lane can behave like a technical replicate (eg DADA2 error correction?)
set -e
set -o pipefail
input_fastq_gz="$1" # input FASTQ file
samplename="$(basename "$input_fastq_gz" .fastq.gz)"
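
The rest of the script is cut off in this preview. As an illustration only, the lane can be pulled out of a header like the one in the comment above with a regular expression. The pattern below is an assumption based on that single example header (flowcell ID, then `L<lane>`, `C<column>`, `R<row>`, a read index, and `/<mate>`), and `lane_of` and `split_by_lane` are hypothetical names, not part of the gist.

```python
import re

# Assumed header layout, inferred from @v300009551L1C002R003000000/1
HEADER_RE = re.compile(r"^@\S*?L(?P<lane>[12])C\d{3}R\d{3}\d+/(?P<mate>[12])$")


def lane_of(header):
    """Return the lane number (1 or 2) encoded in an MGI read header."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"Unrecognised header: {header!r}")
    return int(match.group("lane"))


def split_by_lane(lines):
    """Group 4-line FASTQ records by lane; returns {lane: [lines...]}."""
    lanes = {}
    for i in range(0, len(lines), 4):
        record = lines[i:i + 4]
        lanes.setdefault(lane_of(record[0]), []).extend(record)
    return lanes


records = [
    "@v300009551L1C002R003000000/1", "ACGT", "+", "IIII",
    "@v300009551L2C002R003000001/1", "TTGA", "+", "IIII",
]
by_lane = split_by_lane(records)
```

Each value in `by_lane` could then be written to its own per-lane FASTQ file.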
@pansapiens
pansapiens / ssh_config_split.py
Created March 27, 2023 09:45
Split ~/.ssh/config file into separate per-host files
#!/usr/bin/env python
import os
import shutil
import re
import sys
import difflib
import argparse
def parse_ssh_config(config_path):
@pansapiens
pansapiens / get_latest_container.sh
Created August 17, 2022 02:29
Grab the latest Singularity container for a bioconda package
#!/bin/bash
#
# ./get_latest_container.sh
# Given a bioconda package name as the first argument, downloads
# the most recent corresponding Singularity image from quay.io/biocontainers
#
# Usage:
#
# ./get_latest_container.sh <package_name>
#
@pansapiens
pansapiens / Dockerfile
Created March 23, 2021 04:26
rnasik in docker (via conda/mamba)
FROM continuumio/miniconda3
#
# docker build -t pansapiens/rnasik:latest -t pansapiens/rnasik:1.5.4 .
# docker run --rm -it -v $(pwd):/data pansapiens/rnasik --help
#
RUN conda install --yes -c conda-forge mamba
RUN mamba install --yes -c bioconda -c conda-forge -c serine -c serine/label/dev -c anaconda -c defaults -c r -c conda-forge/label/broken \
python=3.6 rnasik=1.5.4 qualimap=2.2.2b=0 "pandas<1" && \
@pansapiens
pansapiens / test_signals.ipynb
Created November 26, 2020 03:49
test_signals.ipynb