Skip to content

Instantly share code, notes, and snippets.

View fstrozzi's full-sized avatar

Francesco Strozzi fstrozzi

View GitHub Profile
# Internal method to build argument list for BWA C functions
# @note this method should not be called directly
def self.build_args_for_BWA(args)
cmd_args = args.map do |arg|
FFI::MemoryPointer.from_string(arg.to_s) # convert every parameters into a string and then into a memory pointer
end
exec_args = FFI::MemoryPointer.new(:pointer, cmd_args.length) # creating a pointer to an array of pointers
cmd_args.each_with_index do |arg, i|
exec_args[i].put_pointer(0, arg) # filling in the array of pointers
end
@fstrozzi
fstrozzi / gist:1336062
Created November 3, 2011 08:43
Simple append for FastQ IDs
ARGF.each do |line|
line.chomp!
if line.start_with? "@HISEQ"
a = line.split("\s")
a[0] += "_1"
puts a.join("\s")
else
puts line
end
end
@fstrozzi
fstrozzi / gist:1336449
Created November 3, 2011 13:18
JRuby --profile on a 17Gb FastQ file with this script https://gist.github.com/1336062
jruby 1.6.5 (ruby-1.9.2-p136) (2011-10-25 9dcd388) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_27) [linux-amd64-java]
main thread profile results:
Total time: 3497.12
total self children calls method
----------------------------------------------------------------
3495.72 366.68 3129.03 1 Object(singleton)#each
2431.95 206.55 2225.40 283220008 Kernel#puts
2225.40 2225.40 0.00 566440018 IO#write
def thor_task(klass, task_name, &block)
if program.nil?
warn "WARNING: no program is associated with #{class_name.upcase} task, does not make sense to create a thor task."
return nil
end
if klass
wrapper = self
klass.class_eval do
wrapper.options.each_pair do |name, opt|
method_option name, opt
class Trim
include Bio::Command::Wrapper
set_program Bio::Ngs::Utils.binary("fastx_trimmer")
use_aliases
add_option :first_base, :type => :numeric, :aliases => "-f", :desc => "First base to keep"
add_option :last_base, :type => :numeric, :aliases => "-l", :desc => "Last base to keep"
add_option :compress, :type => :boolean, :aliases => "-z", :desc => "Compress output with GZIP"
add_option :input, :type => :string, :aliases => "-i", :desc => "Input FASTA/Q file", :collapse => true
add_option :output, :type => :string, :aliases => "-o", :desc => "Output FASTA/Q file", :collapse => true
add_option :trim, :type => :numeric, :aliases => "-t", :desc => "Trim N nucleotides from the end of the read"
desc "quant GTF OUTPUTDIR BAM ", "Genes and transcripts quantification"
Bio::Ngs::Cufflinks::Quantification.new.thor_task(self, :quant) do |wrapper, task, gtf, outputdir, bam|
wrapper.params = task.options
wrapper.params = {"num-threads" => 6, "output-dir" => outputdir, "GTF" => gtf }
wrapper.run :arguments=>[bam], :separator => "="
end
@fstrozzi
fstrozzi / gist:a6dd0daba816ea2fbd5d
Last active August 29, 2015 14:10
Using HTSJDK from JRuby to parse a BAM file and filter for number of mismatches per read
#!/usr/bin/env ruby
require "htsjdk-1.125.jar" # https://github.com/broadinstitute/picard/releases/download/1.125/picard-tools-1.125.zip
require "logger"
java_import "htsjdk.samtools.SAMFileReader"
java_import "htsjdk.samtools.SAMFileWriterFactory"
logger = Logger.new(STDOUT)
@fstrozzi
fstrozzi / biomaRt.r
Last active December 14, 2015 14:21
Example BiomaRt with R
library(biomaRt)
## ensembl=useMart("ensembl")
## listDatasets(ensembl) # show all the possible databases on Ensembl
ensembl = useEnsembl(biomart="ensembl",dataset="btaurus_gene_ensembl")
## listAttributes(ensembl) # show the attributes of the database
@fstrozzi
fstrozzi / fastq_reader.scala
Created December 15, 2015 11:15
Example Scala reading FastQ
import net.sf.picard.fastq.FastqReader
import java.io.File
import scala.collection.JavaConversions._
object Reader extends App {
val file = new File(args(0))
val fr = new FastqReader(file)
printToFile(new File(args(1)))(p => {
for (seq <- fr.iterator()) {
val header = seq.getReadHeader.split(" ")
@fstrozzi
fstrozzi / filter_vcf.py
Created February 10, 2016 11:57
PyVCF example using Cython and BGZIP compressed output
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import os
import cyvcf
import Bio.bgzf
vcf_input = sys.argv[1]