Daren Card darencard

darencard / auto_git_file.md
Last active January 6, 2024 10:33
Automatic file git commit/push upon change

Please see the most up-to-date version of this protocol on my blog at https://darencard.net/blog/.

Automatically push an updated file whenever it is changed

Linux

  1. Make sure inotify-tools is installed (https://github.com/rvoicilas/inotify-tools)
  2. Configure git as usual
  3. Clone the git repository of interest from GitHub and, if necessary, add the file you want to monitor
  4. Allow your username/password to be cached so you aren't asked every time
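
As a rough illustration of where these steps lead, here is a minimal sketch of a watcher built on `inotifywait` (from inotify-tools) and git's credential cache. The function names `commit_if_changed` and `watch_and_push`, the commit message, the one-hour cache timeout, and the remote name `origin` are all assumptions made for this example, not details from the original protocol.

```shell
#!/usr/bin/env bash
# Sketch only: function names, commit message, and timeout are hypothetical.

# Commit (and push, if a remote exists) one file, but only when it has changed.
commit_if_changed() {
  local file="$1"
  # "git status --porcelain" prints nothing when the file matches the last commit
  if [ -n "$(git status --porcelain -- "$file")" ]; then
    git add -- "$file"
    git commit -q -m "Auto-commit: $file updated"
    # Push only when a remote named "origin" is configured (assumed name)
    if git remote | grep -q '^origin$'; then
      git push -q origin HEAD
    fi
  fi
}

# Watch a file inside a cloned repository and push after every completed write.
watch_and_push() {
  local repo_dir="$1" file="$2"
  cd "$repo_dir" || return 1
  # Keep credentials in memory for one hour so each push does not prompt
  git config credential.helper 'cache --timeout=3600'
  # -m keeps inotifywait running; close_write fires when a writer closes the file
  inotifywait -q -m -e close_write "$file" | while read -r _event; do
    commit_if_changed "$file"
  done
}
```

It could be started with something like `watch_and_push ~/my_repo notes.md` (both arguments are placeholders). Note that the credential cache only helps with HTTPS remotes, and editors that save by replacing the file may need the parent directory watched instead.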
darencard / maker_genome_annotation.md
Last active March 7, 2024 08:50
In-depth description of running MAKER for genome annotation.

Please see the most up-to-date version of this protocol on my blog at https://darencard.net/blog/.

Genome Annotation using MAKER

MAKER is a great tool for annotating a reference genome using empirical and ab initio gene predictions. GMOD, the umbrella organization that includes MAKER, has some nice tutorials online for running MAKER. However, these are quite simplified examples, and it took some effort to wrap my head around everything. Here I describe a de novo genome annotation for Boa constrictor in detail, both as a record and as a guide that can be adapted to annotate any genome.

Software & Data

Software prerequisites:

  1. RepeatModeler and RepeatMasker with all dependencies (I used NCBI BLAST) and RepBase (ver
darencard / config_jbrowse.md
Last active December 14, 2017 17:55
Configuring JBrowse to display gene annotation tracks

Configuring JBrowse

JBrowse is a handy genome browser and is especially useful for viewing the results of iterative rounds of MAKER. The documentation is decent, but for those not used to creating a data server, it can be difficult to understand. I struggled a bit at first.

Software & Data

Software

  1. JBrowse version 1.12.3 (though other versions should work just fine)

Data

darencard / plotly_tutorial_offline_beers.ipynb
Created July 11, 2017 18:11
Nice intro to Plotly in Python
darencard / popstats_from_vcf.Md
Created July 17, 2017 16:16
Calculating population genetic statistics from VCF files using BCFtools

Useful Oneliners for Calculating Population Genetic Statistics from VCF files

The following commands require non-standard software like BCFtools and VCFtools.

Thin variants to reduce linkage bias, then output the number of sampled alleles and the frequency of the reference allele

vcftools --thin 10000 --recode --recode-INFO-all --stdout --gzvcf <my_variants.vcf.gz> | \
  bcftools query -f '%CHROM\t%POS[\t%GT]\n' - | \
 awk -v OFS="\t" '{ miss=0; hom_ref=0; hom_alt=0; het=0; \
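
The command above is cut off in this preview, so what follows is not the original awk body. It is a self-contained sketch of the same idea: tally genotypes per site from `bcftools query` output and report the number of sampled alleles and the reference allele frequency. The function name `gt_to_ref_af` is hypothetical, and the sketch assumes diploid genotypes at biallelic sites.

```shell
#!/usr/bin/env bash
# Sketch only: gt_to_ref_af is a hypothetical name; assumes diploid, biallelic sites.

# Read tab-separated "CHROM POS GT GT ..." lines on stdin and print
# CHROM, POS, number of sampled alleles, and reference allele frequency.
gt_to_ref_af() {
  awk -F "\t" -v OFS="\t" '{
    miss = 0; hom_ref = 0; hom_alt = 0; het = 0
    for (i = 3; i <= NF; i++) {
      if ($i ~ /\./)                        miss++     # missing genotype
      else if ($i == "0/0" || $i == "0|0")  hom_ref++
      else if ($i == "1/1" || $i == "1|1")  hom_alt++
      else                                  het++      # 0/1, 1/0, 0|1, 1|0
    }
    n_alleles = 2 * (hom_ref + het + hom_alt)          # missing genotypes excluded
    ref_af = (n_alleles > 0) ? (2 * hom_ref + het) / n_alleles : "NA"
    print $1, $2, n_alleles, ref_af
  }'
}
```

In a pipeline, the function would take the place of the awk step, for example `bcftools query -f '%CHROM\t%POS[\t%GT]\n' - | gt_to_ref_af`. Sites where every genotype is missing are reported with 0 alleles and a frequency of `NA`.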
darencard / active_heatmap.R
Created August 1, 2017 18:57
R functions for creating interactive heatmap using Plotly (now packages exist to do this)
# install.packages(c("plotly", "reshape2", "ggdendro"))
# devtools::install_github("sjmgarnier/viridis")
library(ggplot2)
library(ggdendro)
library(plotly)
library(viridis)
# helper function for creating dendrograms
ggdend <- function(df) {
darencard / gdrive_download
Created August 1, 2017 18:58
Script to download files from Google Drive using Bash
#!/usr/bin/env bash
# gdrive_download
#
# script to download Google Drive files from command line
# not guaranteed to work indefinitely
# taken from Stack Overflow answer:
# http://stackoverflow.com/a/38937732/7002068
gURL=$1
darencard / SLiM_intro_annotation_simulation.Md
Created August 25, 2017 15:52
Introduction to forward-time simulations using SLiM

Some introductory notes on forward-time population genetic simulations using SLiM

SLiM is a newer, powerful piece of population genetic simulation software that is well documented, user-friendly, and flexible, and it has a pretty sweet GUI (plus command-line capability). The following script is an initial dummy simulation I created as I got my feet wet with SLiM, and I added many notes to make it clear what each command is doing.

In SLiM, text following // is a comment.

// set up a simple neutral simulation
initialize() {
	initializeMutationRate(1e-7);
darencard / gnuplot_quickstart.md
Created August 31, 2017 14:20
A quick-start guide for using gnuplot for in-terminal plotting

A quick-start guide for using gnuplot for in-terminal plotting

Sometimes it is really nice to just take a quick look at some data. However, when working on remote computers, it is a bit of a burden to move data files to a local computer to create a plot in something like R. One solution is to use gnuplot and make a quick plot that is rendered in the terminal. It isn't very pretty by default, but it gets the job done quickly and easily. There are also advanced gnuplot capabilities that aren't covered here at all.

gnuplot has its own internal syntax that can be fed in as a script, which I won't get into. Here is the very simplified gnuplot code we'll be using:

set terminal dumb size 120, 30; set autoscale; plot '-' using 1:3 with lines notitle

Let's break this down:
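
In short: `set terminal dumb size 120, 30` draws the plot with ASCII characters on a canvas 120 columns wide and 30 rows tall, `set autoscale` lets gnuplot choose the axis ranges, `plot '-'` reads the data from standard input, `using 1:3` plots column 3 against column 1, `with lines` connects the points, and `notitle` suppresses the legend entry. To see it in action, the sketch below wraps the command in a shell function and feeds it a made-up table. The helper names `demo_table` and `term_plot` and the demo data are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only: demo_table and term_plot are hypothetical helper names.

# Emit a small three-column table: x, 2x, and x squared.
demo_table() {
  seq 1 10 | awk '{ print $1, 2 * $1, $1 * $1 }'
}

# Plot column 3 against column 1 of whatever arrives on stdin.
term_plot() {
  gnuplot -e "set terminal dumb size 120, 30; set autoscale; plot '-' using 1:3 with lines notitle"
}
```

Running `demo_table | term_plot` should draw a rising curve in the terminal. Any whitespace-delimited file with at least three columns can be piped in the same way, for example `cat my_data.txt | term_plot` (the file name is a placeholder).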

darencard / extract_fastq_bam.md
Last active July 14, 2023 06:19
Extract paired FASTQ reads from a BAM mapping file

Please see the most up-to-date version of this protocol on my blog at https://darencard.net/blog/.

Extracting paired FASTQ read data from a BAM mapping file

Sometimes FASTQ data is aligned to a reference and stored only as a BAM file, without the original FASTQ read files. This is okay, because the raw FASTQ files can be recreated from the BAM file. The following outlines this process. Both samtools and bedtools are required.

From each BAM file, we need to extract:

  1. reads that mapped properly as pairs
  2. reads that didn’t map properly as pairs (both didn’t map, or one didn’t map)
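
One possible way to sketch this split is shown below. It is not necessarily the exact set of commands the full protocol uses: the output file names, the `run` and `bam_to_paired_fastq` helpers, and the choice to drop secondary and supplementary alignments are all assumptions. It selects on SAM flag 0x2 ("read mapped in proper pair") with `samtools view`, name-sorts each subset, and converts it with `bedtools bamtofastq`.

```shell
#!/usr/bin/env bash
# Sketch only: file names and the run/bam_to_paired_fastq helpers are hypothetical.

# Print the command when DRY_RUN is set; otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Split one paired-end BAM into properly paired and not-properly-paired reads,
# then convert each subset to a pair of FASTQ files.
bam_to_paired_fastq() {
  local bam="$1"
  local prefix="${bam%.bam}"
  local subset

  # -f 2 keeps proper pairs; -F 2304 drops secondary (256) and supplementary (2048)
  run samtools view -b -f 2 -F 2304 -o "${prefix}.proper.bam" "$bam"
  # -F 2306 drops proper pairs (2) plus secondary and supplementary alignments
  run samtools view -b -F 2306 -o "${prefix}.improper.bam" "$bam"

  for subset in proper improper; do
    # bamtofastq expects mates on adjacent lines, so sort by read name first
    run samtools sort -n -o "${prefix}.${subset}.namesorted.bam" "${prefix}.${subset}.bam"
    run bedtools bamtofastq -i "${prefix}.${subset}.namesorted.bam" \
      -fq "${prefix}.${subset}.R1.fastq" -fq2 "${prefix}.${subset}.R2.fastq"
  done
}
```

Calling `DRY_RUN=1 bam_to_paired_fastq sample.bam` prints the six commands without running them, which is a cheap way to check the flags before committing to a long job. The R1 and R2 files from the two subsets can then be concatenated to rebuild the full read set.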