Andrew Perry pansapiens

  • Melbourne, Australia
@pansapiens
pansapiens / ollama-export.sh
Last active September 25, 2024 15:31 — forked from supersonictw/ollama-export.sh
Ollama Model Export Script
#!/bin/bash
# Ollama Model Export Script
# Usage: bash ollama-export.sh vicuna:7b
# License: MIT (https://ncurl.xyz/s/o_o6DVqIR)
# https://gist.github.com/supersonictw/f6cf5e599377132fe5e180b3d495c553
set -e
echo "Ollama Model Export Script"
echo "License: MIT (https://ncurl.xyz/s/RD0Yl5fSg)"
@pansapiens
pansapiens / appengine_leveldb2json.py
Last active July 13, 2024 05:34 — forked from xlfe/export.py
Export from Google App Engine Datastore Backup LevelDB format to JSON flat file
#!/usr/bin/env python2.7
# Export from Google App Engine Datastore Backup LevelDB format to JSON flat file
# Based on: https://gist.github.com/xlfe/af25f160256e4d52f499dee7e8fa212f
##
# 2024 instructions:
##
# Using the Google Cloud console (https://console.cloud.google.com), find "Firestore"
# and export your database to a Cloud Storage "Bucket". Download the content of the Bucket.
@pansapiens
pansapiens / bam2fq_softclip.py
Created June 13, 2024 03:22
Extract mapped reads from BAM, masking soft-clipped regions in lowercase
#!/usr/bin/env python
###
# bam2fq_softclip.py
###
# A script for exploring soft-clipping of aligned reads.
# Extracts mapped reads from a BAM file as a FASTQ, but encodes 'soft-clipped' regions
# as lowercase. Soft-clipped regions can be quickly visualized in the terminal like:
#
# ./bam2fq_softclip.py aligned.bam | grep --color=always '[atcg]' | less -R
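The lowercase masking described in the comments above can be illustrated without a BAM library. The following is a minimal sketch, not the gist's actual implementation: it assumes the read sequence and its CIGAR string are already available as plain strings, and the name `mask_softclips` is hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that consume bases of the read sequence.
CONSUMES_QUERY = set("MIS=X")


def mask_softclips(seq: str, cigar: str) -> str:
    """Return seq with soft-clipped (S) bases in lowercase and all others in uppercase."""
    out = []
    pos = 0  # current offset into the read sequence
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in CONSUMES_QUERY:
            segment = seq[pos:pos + n]
            out.append(segment.lower() if op == "S" else segment.upper())
            pos += n
        # D, N, H and P do not consume read bases, so they are skipped.
    return "".join(out)


print(mask_softclips("ACGTACGTAC", "3S5M2S"))
```

Here the first three and last two bases are soft-clipped, so they come out in lowercase while the aligned middle stays uppercase.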
@pansapiens
pansapiens / .a_README.md
Last active June 13, 2024 03:25
schollz/hostyoself docker-compose
@pansapiens
pansapiens / mirdeep_csv_to_counts.py
Created May 21, 2024 08:10
Convert miRDeep2 counts CSV to nicer TSV tables
#!/usr/bin/env python
import argparse
import pandas as pd
import re
import sys
from typing import Optional, List
import logging
import glob
import io
@pansapiens
pansapiens / find_adapters.sh
Created April 20, 2024 10:33
Find likely adapter sequences in a set of paired end FASTQs
#!/bin/bash
# Given a directory of FASTQ files, use bbmerge to find adapter sequences for every sample,
# then align them and find the consensus.
# The final consensus is in r1_adapter.consensus.fa and r2_adapter.consensus.fa - trim by hand
# if the automatic trimming up to the first 'n' isn't sensible
# Requires: bbmerge.sh (bbmap), muscle, emboss (cons)
@pansapiens
pansapiens / R-kill-vscode-remote.sh
Last active April 29, 2024 06:02
tmux persistent R session for vscode-R
#!/bin/bash
R_TMUX_SESSION_NAME="${R_TMUX_SESSION_NAME:-vscode-r}"
TMUX_CMD=$(command -v tmux)
if ! [ -x "$(command -v tmux)" ]; then
  echo "Error: tmux is not installed."
  exit 1
fi
@pansapiens
pansapiens / README.md
Created November 9, 2023 03:53
Gzip compression heatmap

Gzip compression heatmap

Generates a heatmap plot showing the compression ratio of different blocks through a file.

Shows interesting patterns in FASTQ files, and may be useful for diagnosing pathological or unusual data. I find using the zscore transformation for the plot is more informative.

Run like:

./compression_heatmap.py -b 1048576 -c rainbow -t zscore SRR11794587_2.fastq.gz
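The per-block compression ratio and the z-score transformation can be sketched in a few lines of Python. This is not the code of `compression_heatmap.py`. It is a minimal illustration that assumes the data is already uncompressed in memory and uses `zlib` rather than gzip; the function names `block_compression_ratios` and `zscores` are hypothetical.

```python
import zlib
from statistics import mean, pstdev
from typing import List


def block_compression_ratios(data: bytes, block_size: int) -> List[float]:
    """Compress each fixed-size block independently and return compressed/original size."""
    ratios = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        ratios.append(len(zlib.compress(block)) / len(block))
    return ratios


def zscores(values: List[float]) -> List[float]:
    """Centre values on their mean and scale by the population standard deviation."""
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All blocks compress equally well, so there is no variation to show.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# A highly repetitive block followed by a more varied one.
data = b"A" * 4096 + bytes((i * i * 7 + i * 13) % 251 for i in range(4096))
ratios = block_compression_ratios(data, 4096)
print(ratios, zscores(ratios))
```

Lower ratios mean a block compresses better. The z-score puts every block on a common scale relative to the rest of the file, which is one plausible reason it makes unusual regions stand out in a plot.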
@pansapiens
pansapiens / deinterleave_mgi_lanes.sh
Created July 25, 2023 07:10
Deinterleave MGI FASTQ lanes
#!/bin/bash
# This script deinterleaves a FASTQ file generated by an MGI sequencer with two flowcell lanes
# where headers are in the format: @v300009551L1C002R003000000/1 with L1 or L2 indicating the lane
# This can be useful for some analyses since each lane can behave like a technical replicate (eg DADA2 error correction?)
set -e
set -o pipefail
input_fastq_gz="$1" # input FASTQ file
samplename="$(basename "$input_fastq_gz" .fastq.gz)"
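The lane lookup that the header comments describe can be sketched in Python as below. This is an illustration only, not the script's own method: the field widths in the regular expression are assumed from the single example header `@v300009551L1C002R003000000/1`, and the name `lane_of` is hypothetical.

```python
import re
from typing import Optional

# Assumed layout, inferred from one example header: L<lane>C<3 digits>R<3 digits>.
LANE_PATTERN = re.compile(r"L([0-9])C[0-9]{3}R[0-9]{3}")


def lane_of(header: str) -> Optional[int]:
    """Return the flowcell lane encoded in an MGI FASTQ header, or None if absent."""
    match = LANE_PATTERN.search(header)
    return int(match.group(1)) if match else None


print(lane_of("@v300009551L1C002R003000000/1"))
```

Reads could then be routed to one output file per lane according to the returned value.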
@pansapiens
pansapiens / ssh_config_split.py
Created March 27, 2023 09:45
Split ~/.ssh/config file into separate per-host files
#!/usr/bin/env python
import os
import shutil
import re
import sys
import difflib
import argparse
def parse_ssh_config(config_path):