Bill Flynn wflynny

@wflynny
wflynny / build_10x_reference.sh
Last active February 5, 2019 20:01
Building 10X reference genomes from Ensembl
# Visit the Ensembl ftp site.
# ftp://ftp.ensembl.org/pub/release-95/
#
# You want to find data under the following two URLs:
# 1. ftp://ftp.ensembl.org/pub/release-95/fasta/[YOUR_SPECIES_HERE]/dna/
# 2. ftp://ftp.ensembl.org/pub/release-95/gtf/[YOUR_SPECIES_HERE]/
#
# The first file of interest is under the fasta URL:
# [YOUR_SPECIES_HERE].[ASSEMBLY].dna.primary_assembly.fa.gz
# or, if that doesn't exist,
wflynny / scanpy_cluster_proportions.py
Last active October 13, 2023 17:42
Stacked barplot of scRNA-seq cluster proportions per sample
import scanpy as sc  # scanpy.api is deprecated; import the top-level package instead
import matplotlib.pyplot as plt
import seaborn as sns
def get_cluster_proportions(adata,
                            cluster_key="cluster_final",
                            sample_key="replicate",
                            drop_values=None):
    """
    Input
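
The preview above stops before the function body, so the gist's own implementation is not shown here. As a rough illustration of the idea in the title (the fraction of each sample's cells that falls in each cluster), here is a minimal sketch using only the standard library. It works on two parallel label lists rather than an AnnData object; `cluster_proportions` is a hypothetical name, and treating `drop_values` as a list of cluster labels to exclude is an assumption, not something the preview confirms.

```python
from collections import Counter, defaultdict


def cluster_proportions(samples, clusters, drop_values=None):
    """Return {sample: {cluster: fraction}} from two parallel label sequences.

    `samples` and `clusters` stand in for per-cell annotations such as
    adata.obs[sample_key] and adata.obs[cluster_key] (assumed layout).
    """
    drop = set(drop_values or [])
    counts = defaultdict(Counter)
    for sample, cluster in zip(samples, clusters):
        if cluster in drop:
            continue  # assumption: drop_values names clusters to exclude
        counts[sample][cluster] += 1

    props = {}
    for sample, counter in counts.items():
        total = sum(counter.values())
        # Fractions are taken over the cells that remain after dropping.
        props[sample] = {c: n / total for c, n in counter.items()}
    return props


# Made-up labels for six cells across two replicates.
samples = ["rep1", "rep1", "rep1", "rep1", "rep2", "rep2"]
clusters = ["A", "A", "B", "C", "A", "B"]
props = cluster_proportions(samples, clusters, drop_values=["C"])
```

Each inner dictionary sums to 1, which is the shape a stacked bar per sample needs.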
wflynny / gist:d6c95deadf0c4d1cce4f01a729314dbb
Created January 24, 2019 21:20
Illumina sequencer identifiers in fastq read headers
# I find myself referring to this thread a lot:
# https://www.biostars.org/p/198143/
# The codes below are updated with what I see at JAX.
@Mxxxx - MiSeq
@Dxxxx - HiSeq 2500
@Kxxxx - HiSeq 4000
@NSxxx - NextSeq 500/550
@Axxxxx - NovaSeq
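
A small sketch of how the list above could be used in practice: take the instrument ID (the text between `@` and the first `:` in a read header), pull off its leading letters, and look them up. The prefix table is copied from the list above and reflects one site's observations, so unknown prefixes return `None` rather than a guess. `guess_sequencer` is a hypothetical helper name and the example header is made up.

```python
import re

# Letter prefix -> instrument, copied from the list above.
PREFIX_TO_SEQUENCER = {
    "M": "MiSeq",
    "D": "HiSeq 2500",
    "K": "HiSeq 4000",
    "NS": "NextSeq 500/550",
    "A": "NovaSeq",
}


def guess_sequencer(header):
    """Guess the Illumina instrument from a FASTQ read header line."""
    if not header.startswith("@"):
        return None
    # The instrument ID is the first colon-separated field after '@'.
    instrument_id = header[1:].split(":", 1)[0]
    # Leading uppercase letters followed by a digit, e.g. 'NS' in 'NS500123'.
    match = re.match(r"([A-Z]+)\d", instrument_id)
    if match is None:
        return None
    return PREFIX_TO_SEQUENCER.get(match.group(1))


# Made-up header for illustration only.
example = "@NS500123:45:HABCDEFXX:1:11101:10000:1000 1:N:0:ACGT"
sequencer = guess_sequencer(example)
```

Exact lookup on the letter prefix keeps `NS` from being confused with any single-letter code.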
wflynny / gpfs_expiration_checker.sh
Created January 24, 2019 20:05
Check file lifetime stats on a GPFS
# I usually put this in my ~/.bash_aliases
# A portion of our GPFS storage removes files 21 days after creation.
# `stat` does not show creation time, so we have to resort to parsing the
# output of `mmlsattr`
ftime() {
    # Usage:
    #   ftime path/to/file
    #
    # Outputs:
wflynny / cellranger_count_scenarios.sh
Created January 23, 2019 19:55
Cellranger count snippets (version 2)
# Some universal variables
NCELLS=6000
OUTPUT_NAME="nice-name"
FASTQ_DIR="/path/to/fastqs"
REFERENCE_GENOME="/path/to/reference_dir"
# Use the PBS-allocated core count when available, otherwise default to 20.
NCORES="${PBS_NUM_PPN:-20}"
# When reads look like:
# sample-name_S?_L00?_R1_001.fastq.gz
# sample-name_S?_L00?_R2_001.fastq.gz
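
The preview ends before the `cellranger count` call itself, so the gist's actual command is not shown here. The following is a minimal sketch, under stated assumptions, of how the file naming pattern above relates to the `--sample` argument: it extracts the sample prefix from names of the form `<sample>_S<n>_L<lane>_R<read>_001.fastq.gz` and assembles an argument list from the variables defined earlier. The helper names `sample_names` and `count_command` are hypothetical, and the choice of options is an assumption; check them against the documentation for your Cell Ranger version.

```python
import re

# bcl2fastq-style name: <sample>_S<n>_L<lane>_<R1|R2|I1|I2>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L\d{3}_[RI][12]_001\.fastq\.gz$"
)


def sample_names(filenames):
    """Return the sorted sample prefixes found among the given file names."""
    names = set()
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match:
            names.add(match.group("sample"))
    return sorted(names)


def count_command(output_name, fastq_dir, reference, sample, ncells, ncores):
    """Build an argument list for `cellranger count` (assumed option set)."""
    return [
        "cellranger", "count",
        f"--id={output_name}",
        f"--fastqs={fastq_dir}",
        f"--transcriptome={reference}",
        f"--sample={sample}",
        f"--expect-cells={ncells}",
        f"--localcores={ncores}",
    ]


files = [
    "sample-name_S1_L001_R1_001.fastq.gz",
    "sample-name_S1_L001_R2_001.fastq.gz",
    "notes.txt",
]
found = sample_names(files)
cmd = count_command("nice-name", "/path/to/fastqs", "/path/to/reference_dir",
                    found[0], 6000, 20)
```

Building the command as a list makes it easy to inspect before handing it to `subprocess.run`.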