Sean Davis seandavi

@seandavi
seandavi / defuse.txt
Created April 30, 2014 17:53
Rule for running the deFuse fusion finder on the NIH Biowulf cluster
rule defuse:
    input: source2fastq,'/usr/local/apps/defuse/current/scripts/config_hg19.txt'
    output: ["bam/{source}/RNA/defuse/"+fname for fname in ['results.classify.tsv','results.filtered.tsv','results.tsv']]
    params: batch="-q ccr -l nodes=1:gpfs",
            outdir="bam/{source}/RNA/defuse",
            scratchdir = '/scratch'
    priority: 10
    threads: 16
    version: "0.6.1"
    log: "bam/{source}/RNA/defuse/defuse_rule.log"
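
To make the `output` expression in the rule above concrete, here is a minimal Python sketch of how the list comprehension expands into three path templates, and what those paths look like once the `{source}` wildcard is filled in. The sample name `sampleA` and the helper `expand_source` are hypothetical, and `str.format` is only a stand-in for Snakemake's own wildcard substitution.

```python
# Same list comprehension as the rule's `output` directive.
RESULT_FILES = ['results.classify.tsv', 'results.filtered.tsv', 'results.tsv']
output_templates = ["bam/{source}/RNA/defuse/" + fname for fname in RESULT_FILES]


def expand_source(templates, source):
    """Fill the {source} wildcard in each template (stand-in for Snakemake)."""
    return [template.format(source=source) for template in templates]


# 'sampleA' is a hypothetical sample name used only for illustration.
paths = expand_source(output_templates, "sampleA")
for path in paths:
    print(path)
```

Every output shares the directory that the rule also passes as `params.outdir`, so the three result files for one sample land side by side.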
@seandavi
seandavi / runCrestParallel.py
Last active August 29, 2015 14:01
Run CREST in parallel using scratch space on NIH Biowulf
#!/usr/bin/env python
import argparse
import shutil
import subprocess
import os
parser = argparse.ArgumentParser()
parser.add_argument('tumorbam',help='full path to the Tumor BAM file')
parser.add_argument('normalbam',help='full path to the Normal BAM file')
parser.add_argument('twobitfile',help='full path to 2bit file for reference genome')
@seandavi
seandavi / runner.sh
Last active August 29, 2015 14:07
Very small script to run scripts in the package directory of an R package. The scripts do not have to be R, but they do have to be executable.
#!/usr/bin/Rscript
# chmod +x runner.sh
# runner.sh "PACKAGENAME:path/to/script" <args to pass to script>
# Example (fake):
# runner.sh "GEOquery:scripts/abc.sh" arg1 arg2 arg3
#
args = commandArgs(trailingOnly=TRUE)
arg1 = strsplit(args[1],':')[[1]]
if(length(arg1)!=2) {
  stop("The first argument must be of the form 'PACKAGENAME:path/to/script.sh'")
@seandavi
seandavi / Instructions.org
Created December 14, 2014 20:00
Set up serpentine on the Dell cluster

Serpentine

To get started, you need to create a virtual environment for snakemake. The following code will do this for you.

module load python/3.4.2
pyvenv snakemake
source snakemake/bin/activate
pip install snakemake
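
The same setup can be sketched from Python with the standard-library `venv` module, which is what the `pyvenv` command wraps. This is a rough equivalent rather than part of the original instructions: the function names `create_env` and `install_snakemake` are hypothetical, and the `bin/pip` path assumes a POSIX layout like the cluster's.

```python
import os
import subprocess
import venv


def create_env(env_dir, with_pip=True):
    """Create a virtual environment, like `pyvenv snakemake` on the command line."""
    venv.create(env_dir, with_pip=with_pip)
    return os.path.abspath(env_dir)


def install_snakemake(env_dir):
    """Run the environment's own pip, so no `source .../activate` is needed."""
    pip = os.path.join(env_dir, "bin", "pip")  # POSIX layout assumed
    subprocess.check_call([pip, "install", "snakemake"])
```

Calling `create_env("snakemake")` followed by `install_snakemake("snakemake")` should mirror the shell commands above. The `module load` step is cluster-specific and still has to happen in the shell first.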
---
title: "Clinomics"
author: "Sean Davis"
date: February 12, 2015
output: html_document
---
```{r style, echo = FALSE, results = 'asis',cache=FALSE}
suppressPackageStartupMessages(library(GenomicRanges))
BiocStyle::markdown()
@seandavi
seandavi / gist:d6fb151e07fc743e1eb4
Created April 20, 2015 20:19
Get all SRA files associated with one or more SRA accessions using SRAdb
# Run this in R
# Supply any SRA accessions that you like.
# The .sra files will be downloaded.
# The actual sequence will need to be extracted
# with the SRA SDK (for example fastq-dump)
source('http://bioconductor.org/biocLite.R')
biocLite('SRAdb')
library(SRAdb)
sradb = getSRAdbFile()
con = dbConnect(SQLite(),sradb)
.require <-
    function(pkg)
{
    withCallingHandlers({
        suppressPackageStartupMessages({
            require(pkg, character.only=TRUE, quietly=TRUE)
        })
    }, warning=function(w) {
        invokeRestart("muffleWarning")
    }) || {
@seandavi
seandavi / AWS.ssh.sh
Last active August 29, 2015 14:23
Command line for executing a command on all EC2 instances that have a public DNS name
#!/bin/bash
aws ec2 describe-instances | \
jq -r '.Reservations[].Instances[].PublicDnsName//empty'| \
parallel -j 8 -I {} ssh -X -o StrictHostKeyChecking=no -i ~/.ssh/MYKEY.pem USERNAME@{} 'sudo apt-get install zip'
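
For readers less familiar with `jq`, the filter in the pipeline above can be sketched in Python as a walk over the JSON that `aws ec2 describe-instances` prints. The function name `public_dns_names` and the sample response are hypothetical. One deliberate difference: `jq`'s `// empty` only drops `null` and `false`, whereas this sketch also drops empty strings, which instances without a public address may report.

```python
import json


def public_dns_names(describe_output):
    """Return the non-empty PublicDnsName of every instance in the response."""
    names = []
    for reservation in describe_output.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            name = instance.get("PublicDnsName")
            if name:  # skips missing, null, and empty-string values
                names.append(name)
    return names


# Hypothetical, heavily trimmed response used only for illustration.
SAMPLE = json.loads("""
{"Reservations": [
  {"Instances": [{"PublicDnsName": "ec2-198-51-100-1.compute-1.amazonaws.com"},
                 {"PublicDnsName": ""}]},
  {"Instances": [{"InstanceId": "i-0123456789abcdef0"}]}
]}
""")

print(public_dns_names(SAMPLE))
```

Each returned name corresponds to one host that `parallel` would hand to `ssh` in the shell version.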
@seandavi
seandavi / xmlparse.Rmd
Last active August 29, 2015 14:27
Playing with XML parsing in R
---
author: Sean Davis
date: 8/12/2015
title: "XML play"
output: html_document
---
Load the XML library and set the filename. You can loop over filenames if you like.
```{r}
@seandavi
seandavi / DAX3.py
Created August 20, 2015 20:25
Python 2 and 3 conversion of Pegasus DAX3.py
# Copyright 2010 University Of Southern California
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,