Nikhil R Podduturi nikhilRP

  • BI X - The Digital Lab for Boehringer Ingelheim
  • Ingelheim am Rhein, Germany
@nikhilRP
nikhilRP / CMakeLists.txt
Created August 26, 2014 18:14
CMake file for including nvbio as a library
include(ExternalProject)
ExternalProject_Add(
  nvbio
  GIT_REPOSITORY https://github.com/NVlabs/nvbio.git
  GIT_TAG v0.9.7
  BUILD_COMMAND make nvbio
  INSTALL_COMMAND ""
)
# The static library (libnvbio.a) is generated successfully at this point
OVERVIEW: LLVM 'Clang' Compiler: http://clang.llvm.org
USAGE: clang -cc1 [options] <inputs>
OPTIONS:
-### Print the commands to run for this compilation
--analyze Run the static analyzer
--migrate Run the migrator
--relocatable-pch Build a relocatable precompiled header
--serialize-diagnostics <value>
nikhilRP / load.scala
Last active August 29, 2015 14:14
Loading an ADAM file from S3
import org.bdgenomics.formats.avro.AlignmentRecord
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.projections.{ AlignmentRecordField, Projection }
import org.apache.spark.rdd.RDD
// Path to the ADAM (Parquet) file; `sc` is assumed to be the SparkContext provided by the shell
val adamFile = "/user/nikhilrp/encoded-data/ENCFF000YGV.adam"
val reads = sc.loadParquetAlignments(adamFile)
nikhilRP / loadFeatures.scala
Created February 5, 2015 01:21
Loading features (ADAM) from S3
import org.bdgenomics.formats.avro.Feature
import org.bdgenomics.adam.rdd.ADAMContext._
val adamFile = "s3n://encoded-hdfs/ENCFF001VNW.adam"
val reads = sc.adamLoad[Feature, Nothing](adamFile)
nikhilRP / depth.scala
Last active August 29, 2015 14:16
Finding depth of a variant
import org.bdgenomics.formats.avro.{ AlignmentRecord, Contig }
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.rdd.BroadcastRegionJoin
import org.bdgenomics.adam.projections.Projection
import org.bdgenomics.adam.projections.AlignmentRecordField._
import org.bdgenomics.adam.models.{ SequenceDictionary, ReferenceRegion }
import org.apache.spark.rdd.RDD
nikhilRP / transfer.sh
Created March 5, 2015 23:19
Shell script to download and transform ENCODE files to ADAM format
# Files.txt contains URLs for the files
# Make sure you set SPARK_HOME
while IFS= read -r line; do
  url=$line
  filename=$(basename "$url")
  # Download the file, then stage it in HDFS
  wget "$url"
  hadoop fs -put "$filename" /user/mapr/encoded/
  # Convert BAM to ADAM, then remove the staged BAM
  ./../adam/bin/adam-submit transform "/user/mapr/encoded/$filename" "/user/mapr/encoded/${filename/.bam/.adam}"
  hadoop fs -rm "/user/mapr/encoded/$filename"
{
  "title": "Suggest",
  "@graph": [
    {
      "text": "BRCA1",
      "payload": {
        "id": 2386
      },
      "score": 1
    },
nikhilRP / adamConvert.scala
Last active August 29, 2015 14:24
Scala class to convert a BAM file to an ADAM file
import org.bdgenomics.formats.avro.AlignmentRecord
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.projections.{ AlignmentRecordField, Projection }
import org.apache.spark.rdd.RDD
import org.bdgenomics.adam.rdd.ReferencePartitioner
import org.bdgenomics.adam.models.ReferenceRegion
val bamFile = "/user/nikhilrp/ENCFF000QJB.bam"
val reads = sc.loadBam(bamFile)
val sd = reads.adamGetSequenceDictionary()
nikhilRP / filter.scala
Last active October 13, 2015 12:52
Utility Scala class to load and filter alignments
import org.bdgenomics.formats.avro.AlignmentRecord
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.projections.Projection
import org.apache.spark.rdd.RDD
import org.apache.parquet.filter2.dsl.Dsl._
import org.apache.parquet.filter2.predicate.FilterPredicate
import org.bdgenomics.adam.projections.AlignmentRecordField._
val adamFile = "/user/nikhilrp/encoded-data/mm10/chr1/ENCFF891NNX.adam"
val proj = Projection(readName, contig, start, end, qual)
test_data = [(50, 150), (100, 200), (150, 250), (99, 201), (100, 201), (201, 300), (98, 99)]
def main():
    # view region
    start = 100
    end = 200
    print("View region: start - " + str(start) + " end - " + str(end))
    print("Method 1 - Nikhil")
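
The preview above is cut off before the body of "Method 1", so the author's actual approach is not shown. As a rough sketch only, here is one way the test intervals could be checked against the view region. It assumes both the intervals and the view region are closed (inclusive) ranges and that an interval counts as visible if it shares at least one position with the view. The helper names `overlaps` and `regions_in_view` are hypothetical and do not come from the gist.

```python
def overlaps(region, view_start, view_end):
    """Return True if the closed interval `region` shares a position with the view.

    Assumption: coordinates are inclusive on both ends.
    """
    region_start, region_end = region
    return region_start <= view_end and region_end >= view_start


def regions_in_view(regions, view_start, view_end):
    """Keep only the regions that overlap the view, preserving input order."""
    return [r for r in regions if overlaps(r, view_start, view_end)]


test_data = [(50, 150), (100, 200), (150, 250), (99, 201),
             (100, 201), (201, 300), (98, 99)]

visible = regions_in_view(test_data, 100, 200)
print(visible)
```

Under these assumptions, `(201, 300)` and `(98, 99)` are excluded because they lie entirely outside the 100 to 200 view. If the coordinates were half-open instead, the comparisons would need to become strict.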