William Steele wjs20

@wjs20
wjs20 / revcomp
Last active August 13, 2025 08:16
Reverse complement sequences from the command line
#!/usr/bin/env bash
# Quote the argument so the shell does not glob or word-split the sequence
echo "$1" | rev | tr TAGC ATCG
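For comparison, the same reverse-complement operation can be sketched in Python. This is a minimal illustration and not part of the original gist: the function name `revcomp` is hypothetical, and like the shell one-liner it handles only uppercase A, C, G and T.

```python
# Translation table equivalent to `tr TAGC ATCG`
COMPLEMENT = str.maketrans("TAGC", "ATCG")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    # Reverse first (like `rev`), then complement each base (like `tr`)
    return seq[::-1].translate(COMPLEMENT)


print(revcomp("ATGC"))
```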
wjs20 / reorder-catvars-pandas
Created July 12, 2025 10:24
Pandas version of fct_reorder from forcats
# reorder manually
import pandas as pd
df = pd.DataFrame({'x': ['low', 'medium', 'high', 'medium']})
df['x'] = df['x'].astype('category')
df['x'] = df['x'].cat.reorder_categories(['low', 'medium', 'high'], ordered=True)
# reorder by another variable
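The preview cuts off before the second case. One possible way to reorder categories by another variable is sketched below. It assumes a numeric column `y` (hypothetical, not in the original snippet) and orders the categories by the per-category median, which is the default summary function of `fct_reorder`.

```python
import pandas as pd

# Hypothetical data: 'y' is an illustrative numeric column, not from the gist
df = pd.DataFrame({
    "x": ["low", "medium", "high", "medium"],
    "y": [3.0, 1.0, 2.0, 1.5],
})
df["x"] = df["x"].astype("category")

# Sort the categories by the median of y within each category (ascending)
order = (
    df.groupby("x", observed=True)["y"]
    .median()
    .sort_values()
    .index.tolist()
)
df["x"] = df["x"].cat.reorder_categories(order, ordered=True)

print(list(df["x"].cat.categories))
```

Note that `reorder_categories` requires the new order to contain exactly the existing categories, so this sketch assumes every category appears at least once in the data.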
#!/usr/bin/env bash
#
#
PS3="Select an option: "
select opt in "Option 1" "Option 2" "Quit"; do
    echo "You picked $opt"
    break
done
wjs20 / pandas-read-pdb.py
Last active September 3, 2025 12:52
A function and
import pandas as pd
from pathlib import Path
# Half-open (start, end) column ranges for the fixed-width PDB ATOM record
COLSPECS = [
    (0, 6), (6, 11), (12, 16), (16, 17), (17, 20),
    (21, 22), (22, 26), (26, 27), (30, 38), (38, 46),
    (46, 54), (54, 60), (60, 66), (72, 76), (76, 78),
    (78, 80)
]
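The preview ends before the function that uses `COLSPECS`. A minimal sketch of how such column specs could be passed to `pandas.read_fwf` follows. The column names in `NAMES` and the function `read_pdb_atoms` are assumptions for illustration, chosen to follow the PDB ATOM record layout. They are not taken from the gist.

```python
import io

import pandas as pd

COLSPECS = [
    (0, 6), (6, 11), (12, 16), (16, 17), (17, 20),
    (21, 22), (22, 26), (26, 27), (30, 38), (38, 46),
    (46, 54), (54, 60), (60, 66), (72, 76), (76, 78),
    (78, 80),
]

# Hypothetical column names, one per entry in COLSPECS
NAMES = [
    "record", "serial", "atom_name", "alt_loc", "res_name",
    "chain_id", "res_seq", "i_code", "x", "y",
    "z", "occupancy", "b_factor", "segment_id", "element",
    "charge",
]


def read_pdb_atoms(text: str) -> pd.DataFrame:
    """Parse ATOM/HETATM lines of PDB-formatted text into a DataFrame."""
    lines = [ln for ln in text.splitlines() if ln.startswith(("ATOM", "HETATM"))]
    return pd.read_fwf(
        io.StringIO("\n".join(lines)),
        colspecs=COLSPECS,
        names=NAMES,
        header=None,
    )


# Build one 80-column ATOM line field by field so the widths are explicit
sample = (
    f"{'ATOM':<6}{1:>5} {' N':<4}{' '}{'MET'} {'A'}{1:>4}{' '}   "
    f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}{1.00:6.2f}{9.67:6.2f}"
    f"{'':6}{'':4}{'N':>2}{'':2}"
)
df = read_pdb_atoms(sample)
print(df[["atom_name", "res_name", "chain_id", "x", "y", "z"]])
```

Blank fields such as the alternate location indicator come back as NaN, since `read_fwf` strips whitespace from each field by default.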
wjs20 / sqlite2parquet.py
Created May 6, 2025 18:56
Converts a SQLite database file to a parquet file
#!/usr/bin/env python
import pyarrow.parquet as pq
import pyarrow as pa
import glob
import sqlite3
import polars as pl
from pathlib import Path
from tqdm import tqdm
import sys
wjs20 / gist:117680c13ce0f179a99a06b32f519929
Created February 3, 2025 08:57
Deprotonate PDB files
#!/usr/bin/env bash
#
# A program to remove protons from pdb files
#
# $1 is the input directory; quote it so paths with spaces work
mkdir -p "$1-deprotonated"
fd -t f -e .pdb . "$1" |
    parallel --eta "pdb_delelem -H {} > $1-deprotonated/{/}"
wjs20 / gist:c16650200b37b562ff4f2833f0c997d6
Last active July 18, 2025 13:50
useful_seq_mappings.py
IMGT_FRAMEWORK = {
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54,
    55, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
    87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 118, 119,
    120, 121, 122, 123, 124, 125, 126, 127, 128, 129
}
# Coreset is a minimally varying set of positions (in physical space)
Mononoki
Fira Code
Cascadia Code
Monospace
Comic Mono
Agave
Inconsolata
JetBrains Mono
Cousine
Ubuntu Mono
wjs20 / abcsv2fasta
Created January 3, 2024 10:36
Shell script for converting a csv with antibody heavy and light chains to a fasta file
#!/usr/bin/env bash
# fasta_converter.sh
#
usage() {
    echo "Usage: $0 [-i input_file.csv] [-o output_file.fasta]"
    echo " -i : Specify the input file (CSV). If omitted, reads from stdin."
    echo " -o : Specify the output file (FASTA). If omitted, writes to stdout."
    echo " -h : Display this help and exit."
    exit 1
}
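The conversion step itself is not visible in the preview. As a rough illustration of the idea, the Python sketch below turns each CSV row into two FASTA records, one per chain. The column names `name`, `heavy` and `light` and the `_heavy`/`_light` header suffixes are assumptions. The original script's actual CSV layout and header format are not shown.

```python
import csv
import io


def csv_to_fasta(csv_text: str) -> str:
    """Emit one FASTA record per chain for every CSV row."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Hypothetical columns: 'name', 'heavy', 'light'
        for chain in ("heavy", "light"):
            records.append(f">{row['name']}_{chain}\n{row[chain]}")
    return "\n".join(records) + "\n"


example = "name,heavy,light\nab1,EVQLV,DIQMT\n"
print(csv_to_fasta(example), end="")
```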
import os
import requests
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse
def download_urls(url_list, download_dir, max_threads=5):
    # Create the target directory if it does not exist yet
    if not os.path.exists(download_dir):
        os.makedirs(download_dir)
def download_url(url):