Justin C. Bagley justincbagley

How To Install RADseq SNP-calling Software On A Linux Supercomputer

Justin C. Bagley, February 12, 2018, Richmond, VA, USA

Below, you will find a series of steps for installing, on Linux, three of the main software programs used for de novo assembly and SNP calling from RADseq-type next-generation sequencing data (given in no particular order): 1) ipyrad (Eaton 2014; Eaton and Overcast 2016), 2) dDocent (Puritz et al. 2014), and 3) Stacks v2.0beta (Catchen et al. 2013), along with their dependencies. We will install all three for two reasons: Stacks is a dependency of dDocent, and, even though a given RADseq paper will inevitably present results based on SNPs gleaned mainly from one of these programs, it is a good idea to compare results from multiple assemblers to check that they converge on a similar number of SNPs.

As in my other Gists, "$" in the code below references the Linux prompt, which is not to be typed if following along with me. Within code snippets, lines just below a line starting with

How To Setup A Linux Supercomputer Account for RAD-seq Analysis

Justin C. Bagley, September 5, 2017, Richmond, VA, USA

What I describe here is a series of steps for setting up a Linux supercomputer account for RAD-seq (e.g. ddRAD-seq, 2bRAD) analysis, essentially assuming that you have been handed a new account and are starting from scratch. Part of the narrative is given in first person, reflecting my experiences when doing this recently on the VCU CHiPC's Godel supercomputer; other parts are written in third person as straightforward procedures/advice. In all code examples that follow, "$" is the UNIX/Linux prompt; this was not typed and shouldn't be typed if following along with this Gist. Within code snippets, lines just below a line starting with the prompt, but that do not start with the prompt, are output to screen and likewise also should not be typed as input. The pound sign comments out the remainder of a line, allowing for comments and notes to be added; some o

How I Installed BEAGLE Library On Linux

Justin C. Bagley, September 11, 2017, Richmond, VA, USA

Preliminaries

I've previously written blog posts on my website (www.justinbagley.org) about setting up Mac and Windows computers for running BEAST (Bayesian Evolutionary Analysis Sampling Trees) here and here, and those posts touched on installing BEAGLE. This Gist provides some notes on how I recently installed the BEAGLE API/library for scientific computing on a Linux supercomputer.

In all code examples that follow, "$" is the UNIX/Linux prompt; this was not typed and shouldn't be typed if following along with this Gist. Lines that do not start with the prompt are output to screen and likewise also should not be typed as input. The pound sign comments out the remainder of a line, allowing for note

justincbagley / Method_for_Analyzing_ddRADseq_Data_in_SNAPP.md
Last active May 11, 2020 16:30
This Gist describes how to process and analyze SNPs from ddRAD tag loci in SNAPP (BEAST2)

Method for Analyzing ddRADseq Data in SNAPP (BEAST v2.4++)

Justin C. Bagley, September 11, 2017, Richmond, VA, USA

This markdown note describes how I used several software programs to process and eventually analyze SNPs from ddRAD tag loci (contigs) in SNAPP (Bryant et al. 2012), which is implemented in BEAST (Drummond et al. 2012; Bouckaert et al. 2014) and is of broad interest in evolutionary biology for inferring species trees (e.g. Demos et al. 2015; Stange et al. 2017). I provide a perspective based on my experiences analyzing data generated using Next-Generation Sequencing on ddRADseq genomic libraries prepped for several species/lineages of Neotropical freshwater fishes from the Brazilian Cerrado (Central Brazil).

My account is given in first person and represents merely one way to analyze data in SNAPP; there are other approaches, and other documents (e.g. this BFD* tutorial; L

Convert Word (.docx) to LaTeX using Pandoc

This is easy to do using Pandoc. Here's an example using an imaginary scientific manuscript file named "manuscript.docx" on my MacBook Pro; execute the following in Terminal:

pandoc /path/to/manuscript.docx -o manuscript.tex

This outputs a .tex file that can be edited and typeset in a LaTeX editor like TeXMaker or TeXShop.
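
The conversion above can be wrapped in a small shell function. This is only a sketch: the helper names `tex_name` and `docx_to_tex` are made up for illustration and are not part of Pandoc. It adds Pandoc's `-s` (`--standalone`) flag, since without it Pandoc writes a LaTeX fragment with no preamble, which may need a wrapper document before it can be typeset.

```shell
# Hypothetical helper: derive the .tex output name from a .docx input path.
tex_name() {
    base="$(basename "$1")"
    printf '%s.tex\n' "${base%.docx}"
}

# Hypothetical helper: convert one Word file to a standalone LaTeX file
# written to the current directory. Requires pandoc on the PATH.
docx_to_tex() {
    pandoc "$1" -s -o "$(tex_name "$1")"
}

# Usage (the path is a placeholder, as in the example above):
#   docx_to_tex /path/to/manuscript.docx
```

Dropping `-s` from `docx_to_tex` reproduces the plain command shown earlier.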

justincbagley / Correction_About_Tine2014_∂a∂i_Code-Version.md
Last active August 28, 2017 04:25
Correction About Tine et al.'s ∂a∂i Code/Version

August 28, 2017, Justin Bagley, Richmond, VA

In a previous Gist, I described a complicated set of operations for installing and running Tine et al.'s (2014) modified version of ∂a∂i v1.6... in which I suggested/implied that it was necessary to install their mod'd version to estimate heterogeneous migration rates. It turns out that this was wrong. ∂a∂i v1.7+ can handle Tine et al.'s (2014) code directly within the regular Python input files submitted to the program. So, all you need to do is install the latest version of ∂a∂i from Sourceforge and use the correct formatting/code in your input files. There is no need to go through my instructions to install Tine et al.'s ∂a∂i version.

justincbagley / How_to_Set_Substitution_Models_in_Seq-Gen.md
Last active August 28, 2017 04:17
How to Set DNA Substitution Models in Seq-Gen

August 27, 2017, Justin C. Bagley, Richmond, VA

In this Gist, I briefly provide some examples of how to set DNA substitution models in the program Seq-Gen (Rambaut and Grassly 1997). This software is available for download through Andrew Rambaut's website, and its infrequent development can also be tracked on GitHub at the Seq-Gen GitHub repository.

HKY + G

Here is an example using an alpha shape parameter of 0.5 (-a) for gamma-distributed rate heterogeneity, 4 discrete gamma categories (-g), empirical (fixed) base frequencies (-f), and a Ts:Tv ratio of 1.5 (-t):

seqgencommand = -mHKY -l9077 -a0.5 -g4 -f0.314,0.198,0.218,0.270 -t1.5
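
As a rough sketch, the same HKY + G options could be passed to Seq-Gen directly from the shell. The function name `run_seqgen` and the file names `trees.tre` and `sim_HKY_G.phy` are assumptions for illustration, not from the original Gist. Seq-Gen reads the guide tree(s) from standard input and, by default, writes PHYLIP-format alignments to standard output; the four values given to -f are taken here to be in the order A, C, G, T.

```shell
# Options copied from the HKY + G example above.
SEQGEN_OPTS="-mHKY -l9077 -a0.5 -g4 -f0.314,0.198,0.218,0.270 -t1.5"

# Hypothetical wrapper: $1 = Newick tree file, $2 = output alignment file.
# SEQGEN_OPTS is deliberately left unquoted so it splits into separate flags.
run_seqgen() {
    seq-gen $SEQGEN_OPTS < "$1" > "$2"
}

# Usage (file names are placeholders):
#   run_seqgen trees.tre sim_HKY_G.phy
```

Keeping the options in one variable makes it easy to swap in a different model string without touching the call itself.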

justincbagley / How_to_Convert_Markdown_to_PDF.md
Last active April 15, 2024 19:50
How To Convert Markdown to PDF

How to convert markdown to PDF:

This post reviews several methods for converting a Markdown (.md) formatted file to PDF, from UNIX or Linux machines.

Using Pandoc:

$ pandoc How_I_got_svg-resizer_working_on_Mac_OSX.md -s -o test1.pdf
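
The Pandoc command above handles one file; a minimal sketch for converting every Markdown file in a directory might look like the following. The function names `pdf_name` and `md_to_pdf_all` are hypothetical. Note that Pandoc's default PDF route goes through LaTeX, so a LaTeX engine such as pdflatex is assumed to be installed.

```shell
# Hypothetical helper: derive the PDF name from a Markdown file name.
pdf_name() {
    printf '%s.pdf\n' "${1%.md}"
}

# Hypothetical helper: convert every .md file in the current directory.
md_to_pdf_all() {
    for f in *.md; do
        [ -e "$f" ] || continue   # skip when no .md files match the glob
        pandoc "$f" -s -o "$(pdf_name "$f")"
    done
}

# Usage: cd into the directory holding the Markdown files, then run
#   md_to_pdf_all
```

Each PDF is written next to its source file, with only the extension changed.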
@justincbagley
justincbagley / How_I_Got_svg-resizer_Working_on_Mac.md
Created May 3, 2017 18:05
How I Got svg-resizer Working On Mac

How I got svg-resizer working on Mac OSX:

svg-resizer is a command-line JavaScript utility for resizing SVG files (with '.svg' file extensions), either singly or in batches. It works well on OSX, and I am attempting to use it in my latest shell scripts. Below I provide notes on the series of steps that I used to install svg-resizer and get it working on my macOS Sierra machine on February 20, 2017.

1. Install librsvg2 using Homebrew

justincbagley / Notes_on_Analyzing_RADseq_SNPs_in_SNAPP.md
Last active April 10, 2020 14:17
Notes on Analyzing ddRADseq SNP data in SNAPP (BEAST module)

Notes on Analyzing ddRADseq SNP Data in SNAPP (BEAST v2.4++)

This markdown note describes how I used several software programs to process and eventually analyze SNPs from ddRAD tag loci in SNAPP (Bryant et al. 2012). The data were generated using Next-Generation Sequencing on ddRADseq genomic libraries prepped for several species/lineages of Neotropical freshwater fishes from the Brazilian Cerrado (Central Brazil).

My account is given in first person and represents merely one way to analyze data in SNAPP; there are other approaches, and other documents (e.g. the Leaché et al. BFD* tutorial doc) also present a general approach. However, all the brief SNAPP guides and tutorials that are currently available require the user to consult the manual, A Rough Guide to SNAPP, written by Bouckaert and Bryant. Since SNAPP is amply covered by Bryant et al. (2012), Leaché et al. (2014), and other papers, I'll skip the introduction to SNAPP and assume the reader is acquainted with the details of the m