Kevin Lee cyklee

@cyklee
cyklee / coordinates_conversion.R
Last active June 22, 2016 04:11
Quick conversion from degrees-based latitude/longitude format to decimal coordinates.
# Quick conversion from latitude/longitude formats ("degrees decimal minutes" or "degrees minutes seconds") to decimal coordinates.
# This conversion needs to be verified since I know little of geodesy
# Load the list data
LL <- read.delim("~/Dropbox/POLAR GRADIENTS/latitude_longitude.tsv", quote="")
library(sp)
as.numeric(char2dms(as.character(LL$Latitude), chd="d", chm = "'", chs = "\""))
as.numeric(char2dms(as.character(LL$Longitude), chd="d", chm = "'", chs = "\""))
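Since the note above says the conversion still needs verifying, here is a small cross-check outside R. It applies the usual formula, decimal = degrees + minutes/60 + seconds/3600, negated for the S and W hemispheres. `dms_to_decimal` is a hypothetical helper, and the input format (`45d30'15"N`, matching the `d`, `'` and `"` separators passed to `char2dms`) is an assumption about what the TSV contains.

```shell
#!/bin/sh
# dms_to_decimal is a hypothetical helper, not part of the sp package.
# Assumed input: degrees, minutes, optional seconds, then one hemisphere
# letter, e.g. 45d30'15"N or 120d15'W.
dms_to_decimal() {
    printf '%s\n' "$1" | awk '{
        hemi = substr($0, length($0), 1)         # trailing N, S, E or W
        body = substr($0, 1, length($0) - 1)     # everything before it
        split(body, part, "[d\047\"]")           # \047 is the single quote
        dec = part[1] + part[2] / 60 + part[3] / 3600
        if (hemi == "S" || hemi == "W") dec = -dec
        printf "%.6f\n", dec
    }'
}

dms_to_decimal "45d30'15\"N"    # prints 45.504167
dms_to_decimal "120d15'W"       # prints -120.250000
```

Comparing a few rows of this output against the `as.numeric(char2dms(...))` values is a quick sanity check; it does not address datum or projection questions.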
@cyklee
cyklee / no_qiime.R
Created June 22, 2016 04:27
USEARCH to phyloseq without going through QIIME & the biom format
# This is an exercise in 16S rRNA gene sequence processing in R
library(phyloseq)
library(dada2)
library(Biostrings)
# Load OTU table generated by USEARCH with phyloseq
# Load the "classic" OTU table converted from .uc to .txt via uc2otutab.py
otuFile <- read.delim("example_readmap.txt", header=TRUE, row.names=1)
otu <- otu_table(otuFile,taxa_are_rows = TRUE)
@cyklee
cyklee / unpack.sh
Last active June 29, 2016 05:23
Zsh command that moves all files in subdirectories (at any depth) into the current directory
mv ./*/**/*(.D) ./
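In the glob above, `*/**/*` reaches everything below each subdirectory, and the qualifier `(.D)` restricts matches to plain files (`.`) while also including dotfiles (`D`). For shells without zsh glob qualifiers, a rough equivalent can be sketched with `find`. `flatten_here` is a hypothetical name, and `-mindepth` is assumed to be available (it is in GNU and BSD `find`, but is not POSIX).

```shell
#!/bin/sh
# flatten_here is a hypothetical name for this sketch.
# Moves every regular file found at depth 2 or deeper into the current
# directory, dotfiles included. Emptied subdirectories are left in place.
flatten_here() {
    find . -mindepth 2 -type f -exec mv {} . \;
}
```

Files that share a name in different subdirectories will overwrite one another here, so it is worth checking for collisions first.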
@cyklee
cyklee / compare_alpha_plot.sh
Last active July 6, 2016 23:37
Shell script & instruction to format output from QIIME alpha_diversity.py for compare_alpha_diversity.py
# To use the plotting script `compare_alpha_diversity.py`, we need to convert the output of `alpha_diversity.py`
# to resemble the output of `collate_alpha.py`, which involves transposing the columns of sample names and output values into rows.
# In theory, we will have only two rows, one for site names and one for the alpha diversity.
# Here's the transposition workflow for my own reproducibility purposes:
awk '
{
for (i=1; i<=NF; i++) {
a[NR,i] = $i
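The preview above is cut off, so the rest of the author's script is not shown here. As a separate, self-contained illustration of the transposition the comments describe (not the author's script), a minimal sketch could look like this. `transpose_tsv` is a hypothetical name, and tab-separated input is an assumption.

```shell
#!/bin/sh
# transpose_tsv is a hypothetical helper: rows become columns.
# Assumes tab-separated input; reads the named files, or stdin if none.
transpose_tsv() {
    awk -F '\t' -v OFS='\t' '
    {
        for (i = 1; i <= NF; i++) cell[NR, i] = $i   # store every field
        if (NF > ncol) ncol = NF                     # widest row seen so far
    }
    END {
        for (i = 1; i <= ncol; i++) {                # one output row per input column
            line = cell[1, i]
            for (j = 2; j <= NR; j++) line = line OFS cell[j, i]
            print line
        }
    }' "$@"
}
```

The whole table is held in memory, which is fine for a two-column alpha diversity file but not for very large matrices.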
@cyklee
cyklee / tmux.md
Created July 27, 2016 06:18 — forked from andreyvit/tmux.md
tmux cheatsheet

tmux cheat sheet

(C-x means ctrl+x, M-x means alt+x)

Prefix key

The default prefix is C-b. If you (or your muscle memory) prefer C-a, you need to add this to ~/.tmux.conf:

remap prefix to Control + a

@cyklee
cyklee / biom2phyloseq.sh
Last active October 19, 2016 06:09
Load a biom-format OTU table into phyloseq (continues from fastq2biom)
#!/bin/zsh
# Load OTU table generated by fastq2biom into phyloseq
# I'm converting from HDF5 to JSON format because otherwise phyloseq drops low-abundance OTUs for me
biom convert -i ${project}_tax.biom -o ${project}_tax.json --to-json
# I have a problem where the "id" and "type" field values are malformed:
# My conversion output: "b'No Table ID'" & "b'OTU table'"
# Should be: "No Table ID" & "OTU table"
# Fixing this with sed:
sed -i "s/b'//g; s/'//g" ${project}_tax.json
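One caveat with the expression above: it removes every `b'` and every apostrophe in the file, so a taxonomy string that contains either would be altered too. A narrower sketch that only rewrites quoted values of the form `"b'...'"` is shown below. `fix_biom_bytes` is a hypothetical name, and it assumes the malformed values contain no apostrophes themselves.

```shell
#!/bin/sh
# fix_biom_bytes is a hypothetical helper: reads JSON text on stdin and
# rewrites "b'value'" to "value", leaving all other apostrophes untouched.
fix_biom_bytes() {
    sed "s/\"b'\([^']*\)'\"/\"\1\"/g"
}
```

It could be run as `fix_biom_bytes < ${project}_tax.json > ${project}_tax_fixed.json`, where the output file name is only an example.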
@cyklee
cyklee / manjaro_vboxsf.sh
Created March 30, 2017 22:06
Enable shared folders in /media in Manjaro Linux
su
systemctl enable vboxservice
systemctl start vboxservice
groupadd vboxsf
gpasswd -a $USER vboxsf
exit
sudo usermod -aG vboxsf $(whoami)
# Log off & log back in
@cyklee
cyklee / Subset_FASTA.sh
Last active May 2, 2017 05:17
To extract a subset of reads from a multi-FASTA file using a list of header names
# https://www.biostars.org/p/49820/
# https://github.com/mdshw5/pyfaidx can be used as a drop-in replacement
xargs samtools faidx test.fa < names.txt
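Where samtools is not installed, the same kind of subsetting can be sketched in plain awk. This is a different technique from `samtools faidx`, not a reimplementation of it. `subset_fasta` is a hypothetical name, and it assumes `names.txt` holds one name per line without the leading `>`, matched against the first whitespace-delimited token of each header.

```shell
#!/bin/sh
# subset_fasta is a hypothetical helper.
# $1: file with one sequence name per line, $2: multi-FASTA file.
subset_fasta() {
    awk '
    NR == FNR { want[$1] = 1; next }              # first file: remember names
    /^>/      { keep = (substr($1, 2) in want) }  # header line: wanted record?
    keep                                          # print lines of wanted records
    ' "$1" "$2"
}
```

Records come out in the order they appear in the FASTA file rather than the order of the name list, and an empty name list is not handled.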
@cyklee
cyklee / import_biom2.R
Created February 22, 2018 01:47 — forked from jnpaulson/import_biom2.R
This will convert a biom class object into a phyloseq object.
import_biom2 <- function(x,
treefilename=NULL, refseqfilename=NULL, refseqFunction=readDNAStringSet, refseqArgs=NULL,
parseFunction=parse_taxonomy_default, parallel=FALSE, version=1.0, ...){
# initialize the argument-list for phyloseq. Start empty.
argumentlist <- list()
x = read_biom(x)
b_data = biom_data(x)
b_data_mat = as(b_data, "matrix")
@cyklee
cyklee / prokkagff2gtf.sh
Created April 12, 2018 05:35
Converts GFF file from Prokka to GTF for htseq-count
#!/bin/bash
infile=$1
if [ -z "$infile" ] ; then
echo "Usage: prokkagff2gtf.sh <PROKKA gff file>" >&2
exit 1
fi
grep -v "#" "$infile" | grep "ID=" | cut -f1 -d ';' | sed 's/ID=//g' | cut -f1,4,5,7,9 | awk -v OFS='\t' '{print $1,"PROKKA","CDS",$2,$3,".",$4,".","gene_id " $5}'
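To see what the pipeline emits, it can be run on a single feature line. The sketch below wraps the same commands in a function that reads stdin. `gff_to_gtf` is a hypothetical name, and the sample line is invented for illustration, not taken from real Prokka output.

```shell
#!/bin/sh
# gff_to_gtf is a hypothetical wrapper around the same pipeline, reading stdin.
gff_to_gtf() {
    grep -v "#" | grep "ID=" | cut -f1 -d ';' | sed 's/ID=//g' \
        | cut -f1,4,5,7,9 \
        | awk -v OFS='\t' '{print $1,"PROKKA","CDS",$2,$3,".",$4,".","gene_id " $5}'
}

# Invented sample: one comment line and one tab-separated CDS feature.
printf '##gff-version 3\ncontig1\tProdigal:2.6\tCDS\t100\t400\t.\t+\t0\tID=ABC_00001;product=hypothetical protein\n' \
    | gff_to_gtf
```

This prints one tab-separated line: `contig1`, `PROKKA`, `CDS`, `100`, `400`, `.`, `+`, `.`, `gene_id ABC_00001`. Note that the feature type is hard-coded, so every feature with an `ID=` attribute is reported as `CDS`, whatever its type in the GFF.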