@cmdcolin
cmdcolin / myloc_plugin.js
Created November 28, 2023 14:35
JBrowse 2 plugin to add navigation to a specific location
/* eslint-disable no-restricted-globals */
;(function () {
class MyLocPlugin {
name = 'MyLocPlugin'
version = '1.0'
install(pluginManager) {
/* do nothing */
}
cmdcolin / monorepos.md
Last active September 29, 2023 03:42
monorepos are weird

If a monorepo was "just a set of folders that are basically independent NPM packages", everything would be "easy" **

Then it's just as if you had a github repo for each folder in your repo.

** Easy once you understand npm packages, that is, which is somewhat non-trivial, but at least there are basically rules that each package.json file follows to get things done.

But monorepos quickly go beyond this, in ways that can go from 0 to 100% insane pretty fast

Examples of "basic things that monorepos do" that are actually quite weird

Features of JBrowse 2 for investigating split/supplementary alignments

Breakpoint split view

Connects long split reads across discontiguous chromosome regions. Currently pretty "pairwise-y", connecting one side of a breakend to another

"Linear read vs ref" view

Right click a single read, select "Linear read vs ref", and it will stitch all the parts of the read, including the split parts listed in the SA tag, into a single "read vs ref" view.
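
As a rough illustration of what "split parts from the SA tag" means, here is a minimal sketch that parses an SA tag value into one record per supplementary alignment. It relies only on the field order given in the SAM specification (rname, pos, strand, CIGAR, mapQ, NM, each entry terminated by a semicolon). The function name `parseSATag`, the output field names, and the example value are assumptions made for illustration; this is not JBrowse's actual implementation.

```javascript
// Parse the value of an SA:Z: tag into one record per supplementary alignment.
// Field order follows the SAM spec: rname,pos,strand,CIGAR,mapQ,NM;
// parseSATag and the output field names are hypothetical, for illustration only.
function parseSATag(sa) {
  return sa
    .split(';')
    .filter(entry => entry.length > 0) // the tag value ends with a trailing ';'
    .map(entry => {
      const [refName, pos, strand, cigar, mapq, nm] = entry.split(',')
      return {
        refName,
        start: Number(pos), // 1-based leftmost position, as in SAM
        strand: strand === '-' ? -1 : 1,
        cigar,
        mapq: Number(mapq),
        nm: Number(nm),
      }
    })
}

// Made-up example value describing two other pieces of the same read
const parts = parseSATag('chr5,1000,+,50M50S,60,0;chr9,2000,-,50S50M,60,1;')
console.log(parts.map(p => `${p.refName}:${p.start}`))
```

Each record gives a reference name, position, strand and CIGAR string, which is enough to lay the pieces of one read side by side against the reference.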

cmdcolin / setup_jbrowse2.sh
Last active November 10, 2023 17:22
Set up JBrowse 2 on an ubuntu server
#!/bin/bash
export OUTDIR=/var/www/html/jbrowse2
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg
NODE_MAJOR=20
echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_$NODE_MAJOR.x nodistro main" | sudo tee /etc/apt/sources.list.d/nodesource.list
#!/bin/bash
export OUTDIR=/var/www/html/jbrowse2
sudo yum install https://rpm.nodesource.com/pub_20.x/nodistro/repo/nodesource-release-nodistro-1.noarch.rpm -y
sudo yum install nodejs -y
sudo yum groupinstall 'Development Tools' -y
git clone https://github.com/gpertea/gffread
cd gffread

Choices for sample data for JBrowse 2 demo

UCSC human reference

  • Pros: easily accessible files, well known, files are quite permanent on the FTP, so easy to link to
  • Cons: No gff files, so have to convert from gtf, which doesn't have gene level grouping of features (just transcripts)
  • Cons: to convert the gtf to gff and preserve gene grouping, might be able to use something like AGAT but it is a laborious installation
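
To make the "gene level grouping" problem concrete, here is a minimal sketch of rebuilding gene-level spans from transcript rows of a GTF file. It assumes the file has rows of type `transcript` and that the `gene_id` attribute really identifies the gene, which is not guaranteed for every GTF. The function names and the sample rows are hypothetical, and this is not a substitute for a real converter such as AGAT.

```javascript
// Pull key "value" pairs out of the GTF attributes column (column 9)
function parseGtfAttributes(column9) {
  const attrs = {}
  for (const match of column9.matchAll(/([^\s;]+)\s+"([^"]*)"/g)) {
    attrs[match[1]] = match[2]
  }
  return attrs
}

// Group transcript rows by gene_id and compute the span of each gene.
// Assumes gene_id identifies the gene (an assumption, see the text above).
function groupTranscriptsByGene(gtfText) {
  const genes = new Map()
  for (const line of gtfText.split('\n')) {
    if (!line || line.startsWith('#')) continue
    const [seqid, , type, start, end, , strand, , attributes] = line.split('\t')
    if (type !== 'transcript') continue
    const { gene_id, transcript_id } = parseGtfAttributes(attributes)
    const s = Number(start)
    const e = Number(end)
    const gene = genes.get(gene_id) ?? { seqid, strand, start: s, end: e, transcripts: [] }
    gene.start = Math.min(gene.start, s)
    gene.end = Math.max(gene.end, e)
    gene.transcripts.push(transcript_id)
    genes.set(gene_id, gene)
  }
  return genes
}

// Made-up sample rows: two transcripts of gene G1, one of gene G2
const sampleGtf = [
  'chr1\tsrc\ttranscript\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
  'chr1\tsrc\ttranscript\t150\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
  'chr1\tsrc\ttranscript\t500\t600\t.\t-\t.\tgene_id "G2"; transcript_id "T3";',
].join('\n')
const genes = groupTranscriptsByGene(sampleGtf)
console.log([...genes.keys()])
```

Each map entry could then be written out as a GFF3 `gene` line with its transcripts as children, which is the grouping the plain GTF rows lack.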

Ensembl human reference

cmdcolin / plot.R
Created September 5, 2023 18:16
Plot users jbrowse 2
library(tidyverse)
library(reshape2)
df <- read_csv("combined_jb2.csv.gz")
df=dplyr::filter(df, !grepl('localhost', host))
df$date=as.Date(as.POSIXct(df$timestamp / 1000, origin = "1970-01-01"))
df$platform=factor(df$electron, labels = c('web', 'desktop') )
df$month = round_date(df$date, unit='month')
dfk <- df %>%
General outline in graphviz format
digraph G {
jb2 [label="JBrowse 2"]
hg002 [label="HG002 T2T phased genome FASTA"]
hg002_reads [label="HG002 Illumina reads CRAM"]
hg002_fastq [label="HG002 Illumina reads FASTQ"]
hg002_mat [label="HG002 maternal genome FASTA"]
hg002_pat [label="HG002 paternal genome FASTA"]
hg002->hg002_mat [label="Split out to..."]
hg002->hg002_pat [label="Split out to..."]
#!/bin/bash
export JB2DIR=~/src/jbrowse-components/test_data/hg002
export NCBI_ROOT=https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son
## download chr22 from the Illumina 2x250bp reads and 300x reads
samtools view $NCBI_ROOT/NIST_Illumina_2x250bps/novoalign_bams/HG002.hs37d5.2x250.bam 22 -o HG002.hs37d5.2x250.chr22.bam
samtools view $NCBI_ROOT/NIST_HiSeq_HG002_Homogeneity-10953946/NHGRI_Illumina300X_AJtrio_novoalign_bams/HG002.hs37d5.300x.bam 22 -o HG002.hs37d5.300x.chr22.bam
## select chr22 out of the HG002 T2T assembly, contains a full chromosome for PATERNAL and MATERNAL
library(readr)
library(lubridate)
library(ggplot2)
library(scales)
x <- read_tsv("./github_actions.csv")
# cast to date; note: there's already a column called updated_at, so the new one is named updated_at2
x$updated_at2 <- as.Date(x$updated_at)
# only look at success/failure, remove action manually "cancelled" and other types of things