@sbamin
sbamin / convert_to_pdfa
Created July 7, 2021 03:32
convert PDF to PDF/A format
#!/usr/bin/env bash
## Convert PDF to PDF/A
## requires imagemagick and ghostscript
## Main script by Rob Patro
## https://gist.github.com/rob-p/9f52c9722fe65cae6946705babc7cddb
set -euo pipefail
pn="$1"
I wrote this gist because I couldn't find step-by-step instructions
on how to install and start postgresql locally rather than globally
in the operating system (which would require sudo).
I hope this helps, especially people new to postgresql!
####################################
# create conda environment
####################################
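## a minimal sketch, assuming postgresql from conda-forge; the environment
## name and data directory below are illustrative
conda create --yes --name local_pg --channel conda-forge postgresql
conda activate local_pg

####################################
# initialize and start a local server, no sudo needed
####################################
initdb -D "$HOME/local_pg_data"
pg_ctl -D "$HOME/local_pg_data" -l "$HOME/local_pg_data/server.log" start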

PBJelly gap closing

PBJelly is part of the PBSuite package of programs. It can be used to close or shrink gaps that remain between contigs after scaffolding.

The official documentation is available at: http://sourceforge.net/p/pb-jelly/wiki/Home/

Most of PBJelly is configured through an XML protocol file that you create with a text editor. It is best to put this file in the directory where your output will go. Below is an example of what this XML protocol can look like:

<jellyProtocol>
    <reference>/path/to/reference.fasta</reference>
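    <!-- the preview ends above; the elements below are a sketch following the
         lambda example protocol in the PBJelly documentation linked earlier;
         paths and blasr parameters are illustrative, check the docs -->
    <outputDir>/path/to/output/</outputDir>
    <blasr>-minMatch 8 -minPctIdentity 70 -bestn 1 -nCandidates 20 -maxScore -500 -nproc 4 -noSplitSubreads</blasr>
    <input baseDir="/path/to/reads/">
        <job>filtered_subreads.fastq</job>
    </input>
</jellyProtocol>

Each stage is then run against this protocol with Jelly.py, e.g. Jelly.py setup Protocol.xml, followed by the mapping, support, extraction, assembly, and output stages (stage names per the documentation above).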

@sbamin
sbamin / gzip_compress_inplace.md
Last active January 29, 2021 18:50
gzip compression to save disk space

  • find files ending in .txt, .fq, or .fastq that are larger than 500 MB; compress each with gzip, replacing the file in place with its .gz version.

  • it is better to run the script as a slurm job.

  • list the matched files and save the output (see the sketch after this note).

The saved list is useful for uncompressing at a later date. Avoid giving the log file an extension that matches the find query, else its .<ext> will get replaced by .<ext>.gz.
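A minimal sketch of the find-and-compress step, assuming GNU find and gzip; the search path, extensions, and size threshold are illustrative:

#!/usr/bin/env bash
## list the matched files first and save the output for later reference;
## the .list extension does not match the find query, so it stays uncompressed
find /path/to/data -type f \( -name "*.txt" -o -name "*.fq" -o -name "*.fastq" \) \
    -size +500M | tee files_to_compress.list

## gzip compresses each file in place, replacing foo.txt with foo.txt.gz
while IFS= read -r fpath; do
    gzip -v "${fpath}"
done < files_to_compress.list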

@sbamin
sbamin / Caddyfile
Created October 11, 2020 18:24
Caddyfile to enable OAuth2 using caddy2. Read details at https://github.com/greenpau/caddy-auth-portal/issues/42
{
    http_port 80
    https_port 443
    # auto_https off
    # debug
}

avocado.example.com {
    root * /srv
@sbamin
sbamin / chkspace.sh
Created September 25, 2020 22:16
Check disk space before starting each workflow script on HPC, i.e., throttle job(s) to prevent the disk from reaching 100% of quota
#!/bin/bash
## Script to run immediately after any sbatch job begins
## Throttle disk I/O if fastscratch is near 100 %
## @sbamin
## execute prior to running each snakemake job
tmpjobstart="$(date +%d%b%y_%H%M%S_%Z)"
echo "${tmpjobstart}: Running slurm prerun.sh on HPC SUMNER"
@sbamin
sbamin / 1_rsync_documentation.md
Last active May 8, 2020 00:46 — forked from KartikTalwar/Documentation.md
Rsync over SSH - (40MB/s over 1GB NICs)

The fastest remote directory rsync over ssh archival I can muster (40 MB/s over 1 Gb NICs)

This creates an archive that does the following:

rsync (Everyone seems to like -z, but it is much slower for me)

  • a: archive mode - recursive, preserves owner, preserves permissions, preserves modification times, preserves group, copies symlinks as symlinks, preserves device files.
  • H: preserves hard-links
  • A: preserves ACLs
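
The flag list is truncated in this preview; a sketch of the full command these flags build toward, with host and paths as placeholders (the ssh options reflect the no-compression note above and are assumptions, not the gist's verbatim command):

## -aHA as described above; --numeric-ids avoids uid/gid name lookups
## ssh: -T disables pseudo-tty allocation, Compression=no avoids double
## compression on top of any application-level compression, -x disables X11
rsync -aHA --numeric-ids --progress \
    -e "ssh -T -o Compression=no -x" \
    user@remote.example.com:/path/to/source/ /path/to/dest/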
@sbamin
sbamin / example_modulefile.lua
Created May 3, 2020 16:26
Advanced Modulefile using lua syntax
--[[
## Modulefile in lua syntax
## Author: Samir Amin
## Read about Lmod
## https://lmod.readthedocs.io/en/latest/015_writing_modules.html
## https://lmod.readthedocs.io/en/latest/050_lua_modulefiles.html
## https://lmod.readthedocs.io/en/latest/020_advanced.html
--]]
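-- the preview stops after this header; below is a minimal sketch of the kind
-- of body such a modulefile might have, using standard Lmod functions
-- (the tool name and install prefix are illustrative)
help([[Example tool, version 1.0]])
whatis("Name: exampletool")
whatis("Version: 1.0")

-- hypothetical install prefix
local base = "/opt/apps/exampletool/1.0"
prepend_path("PATH", pathJoin(base, "bin"))
prepend_path("LD_LIBRARY_PATH", pathJoin(base, "lib"))
setenv("EXAMPLETOOL_HOME", base)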
@sbamin
sbamin / grantfinder
Last active December 6, 2021 22:27
Use NIH RePORTER API to fetch grant records.
#!/bin/bash
## Use NIH RePORTER API to fetch grant records
## @sbamin
# usage
show_help() {
cat << EOF
Use NIH RePORTER API to fetch grant records.
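EOF
}

## a sketch of the kind of request such a script makes, using the RePORTER v2
## projects search endpoint; the search criteria and jq filter are
## illustrative, not taken from this gist
curl -s -X POST "https://api.reporter.nih.gov/v2/projects/search" \
    -H "Content-Type: application/json" \
    -d '{"criteria": {"pi_names": [{"any_name": "smith"}]}, "limit": 5}' |
    jq '.results[] | {project_num, project_title}'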
@sbamin
sbamin / crons_q2days_slurm.sbatch
Last active May 1, 2020 17:29
cron like jobs in HPC using SLURM scheduler: activating globus endpoint via CLI
#!/bin/bash
#SBATCH --job-name=crons_q2days # name of the job
#SBATCH --chdir=/home/user/logs/crons # the working dir for each job
#SBATCH --output=/home/user/logs/crons/log_crons_q2days.out # stdout is sent to this logfile
#SBATCH --error=/home/user/logs/crons/log_crons_q2days.err # stderr is sent to this logfile
#SBATCH --qos=batch # Job queue
#SBATCH --time=00:10:00 # Walltime (HH:MM:SS)
#SBATCH --mem=2G # Memory reserved
#SBATCH --nodes=1 # Nodes reserved
#SBATCH --ntasks=1 # Tasks reserved
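## the preview ends with the header; a sketch of the cron-like pattern from
## the SLURM FAQ follows: delay the next run at submit time and have the
## script resubmit itself (the two-day interval is assumed from the q2days
## name; the globus step is a placeholder for the gist's actual commands)
#SBATCH --begin=now+2days

## ... periodic work goes here, e.g. activating a globus endpoint via its CLI ...

## resubmit this same script for the next cycle
sbatch "$0"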