"There's no place like ${HOME:-/dev/null}"

Skyler Kuhn skchronicles

skchronicles / image2html.sh
Created April 21, 2020 14:24
Bash one-liners: convert images to base64 and embed in HTML
# USAGE: image2html /path/to/images/*.png > out.html
function image2html(){
  # Quote "$@" so paths with spaces survive word splitting; -w 0 (GNU base64) disables line wrapping
  for f in "$@"; do b=$(base64 -w 0 < "$f"); echo "<img src=\"data:image/${f##*.};base64,${b}\" alt=\"${f%.*}\">"; done
}
export -f image2html
skchronicles / xlsx_reader.py
Created April 21, 2020 21:44
xlsx to tsv writer
from __future__ import print_function
import pandas as pd
import sys
usage = '''\
USAGE:
python xlsx_reader.py input.xlsx output_file_prefix [-h]
Positional Arguments:
skchronicles / s3etag.sh
Created April 23, 2020 22:54
Calculate S3 ETag
#!/bin/bash
set -euo pipefail
help() { cat << EOF
Calculates S3 etag
USAGE:
s3etag [OPTIONS] input_file [chunk_size_in_MB]
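The preview above is cut off before the calculation itself. As a rough illustration of the commonly described multipart scheme (not the gist's own code), the sketch below hashes each chunk with MD5, hashes the concatenated binary digests, and appends the part count. The function name `s3_etag` and the 8 MB default are assumptions, and the result only matches S3 for unencrypted objects uploaded with the same part size. A file that fits in one chunk is treated here as a single-request upload, whose ETag is the plain MD5.

```python
import hashlib


def s3_etag(path, chunk_size_mb=8):
    """Hypothetical helper: estimate the ETag S3 reports for a local file.

    Assumes an unencrypted upload split into parts of chunk_size_mb MiB.
    """
    chunk_size = chunk_size_mb * 1024 * 1024
    part_digests = []
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            # Keep the raw 16-byte digest of every part
            part_digests.append(hashlib.md5(chunk).digest())

    if not part_digests:
        # Empty file: plain MD5 of zero bytes
        return hashlib.md5(b"").hexdigest()
    if len(part_digests) == 1:
        # Assumed single-request upload: the ETag is the plain MD5
        return part_digests[0].hex()

    # Multipart: MD5 over the concatenated part digests, then "-<part count>"
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

Comparing `s3_etag("input_file", 8)` with the ETag reported for the uploaded object is one way to spot a corrupted transfer, provided the chunk size matches the one the uploader used.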
skchronicles / covid.sh
Last active September 24, 2022 03:03
Download the latest SARS-CoV-2 sequences from RefSeq and GenBank
#!/bin/bash
# Functions
download() { echo -e "Saving output to file: $1"; curl --http1.1 --retry 5 --verbose -L 'https://www.ncbi.nlm.nih.gov/genomes/VirusVariation/vvsearch2/?q=*:*&fq=%7B!tag=SeqType_s%7DSeqType_s:(%22Nucleotide%22)&fq=VirusLineageId_ss:(2697049)&cmd=download&sort=SourceDB_s%20desc,CreateDate_dt%20desc&dlfmt=fasta&fl=id,Definition_s,Nucleotide_seq' > "$1" || echo 'Download failed... please try again!'; }
echoerr() { cat <<< "$@" 1>&2; }
help() { cat << EOF
Download the latest SARS-CoV-2 sequences from GenBank and RefSeq. Please note that providing
the output filename of the downloaded sequences is optional.
from __future__ import print_function, division
import sys, os, re
import pandas as pd
# Configuration for defining valid files, cleaning sample names, parse fields, rename fields
# Add new files to parse and define their specifications below
config = {
".warning": ["\033[93m", "\033[00m"], ".error": ["\033[91m", "\033[00m"],
"multiqc_cutadapt.txt": {
skchronicles / HERVx.sh
Created September 3, 2020 21:37
Pipeline to characterize Human Endogenous Retrovirus (HERV) expression
#!/usr/bin/env bash
set -euo pipefail
function usage() { cat << EOF
HERVx: Pipeline to characterize Human Endogenous Retrovirus (HERV) expression
USAGE:
HERVx.sh [OPTIONS] -r1 SRR4235541_1.fastq -r2 SRR4235541_2.fastq -o outdir_path
skchronicles / retry.sh
Last active July 8, 2022 22:10
Retry a BASH command
#!/usr/bin/env bash
function err() { cat <<< "$@" 1>&2; }
function fatal() { cat <<< "$@" 1>&2; exit 1; }
function retry() {
# Tries to run a cmd up to 5 times before failing
# If a command is successful, it will break out of the attempt loop
# Failed attempts are padded with the following exponential
# back-off strategy {4, 16, 64, 256, 1024} in seconds
# @INPUTS "$@" = cmd to run
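The body of `retry` is not shown in the preview. The comments describe a back-off of 4^n seconds between attempts, which could be sketched in Python roughly as follows. The parameter names (`attempts`, `base`, `sleep`) are assumptions made for illustration. Unlike the five delays listed in the comments, this version skips the wait after the final failed attempt.

```python
import subprocess
import sys
import time


def retry(cmd, attempts=5, base=4, sleep=time.sleep):
    """Hypothetical sketch: run cmd (a list of args) until it exits with 0.

    Waits base**attempt seconds after each failure (4, 16, 64, 256 by
    default) and returns the exit code of the last attempt.
    """
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            # Success: stop retrying
            return 0
        if attempt < attempts:
            delay = base ** attempt
            print(
                f"attempt {attempt}/{attempts} failed; retrying in {delay}s",
                file=sys.stderr,
            )
            sleep(delay)
    return returncode
```

Passing the `sleep` callable as a parameter keeps the delays easy to shorten or record, for example `retry(cmd, sleep=lambda seconds: None)` in a test.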
skchronicles / md5.py
Created October 14, 2021 20:00
Real world parallel processing example using Ray
#!/usr/bin/env python3
"""md5.py: calculates md5s of multiple files in parallel.
The md5 calculation is memory efficient: it reads each file
in blocks of 64 KiB instead of loading it whole.
USAGE:
python3 md5.py file1.txt file2.csv 8
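The docstring describes two ideas: hashing in fixed 64 KiB blocks so memory use stays flat, and processing several files at once. The gist uses Ray for the parallel part, but its code is not shown here, so the sketch below swaps in the standard library's `ThreadPoolExecutor` instead. The names `md5sum` and `md5_many` and the `workers=8` default are assumptions, not taken from the gist.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # 64 KiB, as stated in the docstring


def md5sum(path, block_size=BLOCK_SIZE):
    """Hash one file block by block so only block_size bytes are held at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel calls fh.read until it returns b"" (end of file)
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def md5_many(paths, workers=8):
    """Hypothetical helper: hash several files concurrently with threads."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its digest
        return dict(zip(paths, pool.map(md5sum, paths)))
```

A call such as `md5_many(["file1.txt", "file2.csv"])` returns a dictionary mapping each path to its hex digest.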