Skip to content

Instantly share code, notes, and snippets.

View johandahlberg's full-sized avatar

Johan Dahlberg johandahlberg

View GitHub Profile
@johandahlberg
johandahlberg / PlayWithURLs.scala
Created April 12, 2013 08:28
Playing with getting source from URL, to get gene names from refseq-ids and converting them into R if-statements. Mostly an experiment in using Scalas parallel collections. NOTE: This should under no circumstances be used in production, then you need to be using some API instead. You have been warned.
object PlayWithURLs extends App {
// Some ids for the BRCA1 gene
val refseqids = List("uc010whl.1", "uc002icp.3", "uc010whp.1")
def queryRefSeqId(refSeqId: String): String = {
def read(url: String): String = io.Source.fromURL(url).mkString
val document = read("http://genome.ucsc.edu/cgi-bin/hgGene?hgg_gene=" + refSeqId + "&org=human")
val regexp = """.*Genetic Association Database:\s\<A HREF=.*\>(\w+)\<.*""".r
@johandahlberg
johandahlberg / gist:5459321
Last active December 16, 2015 15:58
Ugly code example for slides on Best practices in scientific computing.
object UglyCodeExample extends App {
def m(stringToTransform: String): String = {
def t(string: String): List[String] = {
def th(string: String, si: Int): List[String] = {
if (si == 0) Nil
else {
val (firstString, secondString) = string.splitAt(si); val newString = secondString + firstString
newString :: th(string, si - 1)}}
th(string, string.length)
}
@johandahlberg
johandahlberg / BetterCodeExample.scala
Last active December 16, 2015 15:59
A better code example for slides on best practice in scientific computing.
object BetterCodeExample extends App {
/**
* Burrows-Wheeler transform of string
*
* Perform the Burrows-Wheeler transform on a string. See http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform
* for a introduction to the algorithm.
*
* This particular implementation is recursive.
*
@johandahlberg
johandahlberg / BetterCodeExampleUnitTests.scala
Created April 25, 2013 13:33
Unit test example for the BetterCodeExample burrows wheeler functions
import org.scalatest.FunSuite
import BetterCodeExample._
class BetterCodeExampleUnitTests extends FunSuite {
test("Burrow wheelers transform of ^BANANA|") {
assert(BetterCodeExample.burrowsWheelersTransform("^BANANA|") === """BNN^AA|A""")
}
@johandahlberg
johandahlberg / GCCounter.scala
Last active December 17, 2015 12:39
GCCounter in Scala inspired by http://saml.rilspace.org/moar-languagez-gc-content-in-python-d-fpc-c-and-c. This is no match performance wise for the C/D/C++ etc solutions, but I think that it makes a nice point of showing how Scalas parallel collections can be used to with very little extra effort parallelize operations. Test file available for …
import scala.io.Source
import java.io.File
object GCCounter extends App {
val file = new File("Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa")
// The actual GC counting function
def countGCOnLine(line: String): (Long, Long) = {
if (line.startsWith(">"))
@johandahlberg
johandahlberg / DemoIteratorSkipsRecords.scala
Last active December 25, 2015 12:19
Having problems with the SortingCollection which seems to skip records. Created this gist to use for reference when e-mailing samtools-devel list.
package molmed
import net.sf.samtools.util.SortingCollection
import net.sf.samtools.BAMRecordCodec
import net.sf.picard.fastq.FastqRecord
import java.util.Comparator
import java.io.InputStream
import java.io.PrintStream
import net.sf.picard.fastq.FastqConstants
@johandahlberg
johandahlberg / income_difference.R
Last active December 26, 2015 13:20
Quick plot of income differences between men and women in the period 1991 to 2013 in Sweden.
# Licence:
# https://creativecommons.org/licenses/by/3.0/
library(ggplot2)
library(reshape2)
library(dplyr)
# Data:
# Downloaded at: http://www.statistikdatabasen.scb.se/pxweb/sv/ssd/START__HE__HE0110__HE0110A/SamForvInk2/?rxid=c58583e1-7fc3-418a-9395-0300e138fe7f
# Sammanräknad förvärvsinkomst, medianinkomst för boende i Sverige den 31/12, tkr efter region, kön, ålder och år
@johandahlberg
johandahlberg / prepareReference.sh
Created November 5, 2013 16:18
A script for preparing a reference for running with piper on Uppmax.
#!/bin/bash -l
#SBATCH -A b2010028
#SBATCH -p core
#SBATCH -n 2
#SBATCH -t 10:00:00
#SBATCH -J prepare_ref
#SBATCH -o prepare_ref-%j.out
#SBATCH -e prepare_ref-%j.error
# Author: Johan D
@johandahlberg
johandahlberg / performance_test.sh
Created February 11, 2016 14:04
Quick and dirty performance test of proot at Uppmax
echo "Time in container"
for i in $(seq 1 5)
do
/usr/bin/time -f "time: %e" proot -S debian-sid --bind=/proj/a2009002/webexport/opendata/HiSeqX_CEPH/CEP-13-3/03-BAM/ samtools mpileup -r 22:1-23096112 /proj/a2009002/webexport/opendata/HiSeqX_CEPH/CEP-13-3/03-BAM/CEP-13-3.clean.dedup.recal.bam 2>&1 > test.pileup | grep time
done
echo "Time outside container"
for i in $(seq 1 5)
do
/usr/bin/time -f "time: %e" samtools mpileup -r 22:1-23096112 /proj/a2009002/webexport/opendata/HiSeqX_CEPH/CEP-13-3/03-BAM/CEP-13-3.clean.dedup.recal.bam 2>&1 > test.pileup | grep time
def rawbuffer2JsValue(rawBuffer: RawBuffer): Option[JsValue] = {
for {
bytes <- rawBuffer.asBytes()
} yield {
Json.parse(bytes.utf8String)
}
}
def receiveFacebookMessages = withVerifiedPayload {
Action (parse.raw) {