@hammer
hammer / HelloAvro.scala
Last active October 17, 2022 04:16
Concise example of how to write an Avro record out as JSON in Scala
import java.io.{IOException, File, ByteArrayOutputStream}
import org.apache.avro.file.{DataFileReader, DataFileWriter}
import org.apache.avro.generic.{GenericDatumReader, GenericDatumWriter, GenericRecord, GenericRecordBuilder}
import org.apache.avro.io.EncoderFactory
import org.apache.avro.SchemaBuilder
import org.apache.hadoop.fs.Path
import parquet.avro.{AvroParquetReader, AvroParquetWriter}
import scala.util.control.Breaks.break
object HelloAvro {
  def main(args: Array[String]): Unit = {
    // Preview truncated here; minimal sketch of the JSON path (names illustrative)
    val schema = SchemaBuilder.record("Person").fields().requiredString("name").endRecord()
    val record = new GenericRecordBuilder(schema).set("name", "hammer").build()
    val out = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().jsonEncoder(schema, out) // jsonEncoder, not binaryEncoder
    new GenericDatumWriter[GenericRecord](schema).write(record, encoder)
    encoder.flush()
    println(out.toString)
  }
}
@hammer
hammer / server_side_datatable.py
Created January 21, 2014 15:58
Implementation of a server-side DataTable (cf. http://datatables.net/release-datatables/examples/data_sources/server_side.html) using Flask, Flask-RESTful, and Psycopg. You should only have to edit source_database, source_table, and source_columns to make it work. Of course you'll probably want to edit the resource name and URL as well.
from string import Template
from distutils.util import strtobool
from flask import Flask, request
from flask_restful import Api, Resource  # the flask.ext.* namespace was removed in Flask 1.0
import psycopg2
app = Flask(__name__)
# Flask-RESTful Api object (the preview truncated just before this line)
api = Api(app)
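The preview stops at the app setup. As a rough sketch of the server-side protocol this gist implements: legacy DataTables 1.9 sends paging parameters such as iDisplayStart and iDisplayLength and expects a JSON envelope with sEcho, iTotalRecords, iTotalDisplayRecords, and aaData. The table, columns, database name, and URL below are placeholders, not the gist's actual values.

from flask import Flask, request
from flask_restful import Api, Resource
import psycopg2

source_table = 'my_table'        # placeholder: edit to your table
source_columns = ['id', 'name']  # placeholder: edit to your columns

app = Flask(__name__)
api = Api(app)

class DataTableRows(Resource):
    def get(self):
        start = int(request.args.get('iDisplayStart', 0))
        length = int(request.args.get('iDisplayLength', 10))
        # NOTE: table/column names are interpolated for brevity; never do this
        # with untrusted input (SQL injection)
        conn = psycopg2.connect(dbname='source_database')  # placeholder
        cur = conn.cursor()
        cur.execute('SELECT COUNT(*) FROM %s' % source_table)
        total = cur.fetchone()[0]
        cur.execute('SELECT %s FROM %s LIMIT %%s OFFSET %%s'
                    % (', '.join(source_columns), source_table),
                    (length, start))
        rows = cur.fetchall()
        conn.close()
        return {
            'sEcho': int(request.args.get('sEcho', 1)),
            'iTotalRecords': total,
            'iTotalDisplayRecords': total,
            'aaData': [list(r) for r in rows],
        }

api.add_resource(DataTableRows, '/rows')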
@hammer
hammer / platform-input-support-config.yaml
Last active July 8, 2019 20:28
Configuration organized by ES index; note that this will not parse correctly! It's for purely pedagogical purposes.
# [invalid-]evidence-data
evidences:
  gs_output_dir: evidence-files
  downloads:
    - bucket: otar000-evidence_input/CRISPR/json
      output_filename: crispr-{suffix}.json.gz
      resource: input-file
      subset_key:
        - target
        - id
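A minimal sketch of consuming a config shaped like this snippet, assuming PyYAML; the iteration logic is illustrative, not platform-input-support's actual code.

import yaml

config_text = """
evidences:
  gs_output_dir: evidence-files
  downloads:
    - bucket: otar000-evidence_input/CRISPR/json
      output_filename: crispr-{suffix}.json.gz
      resource: input-file
      subset_key: [target, id]
"""

config = yaml.safe_load(config_text)
for download in config['evidences']['downloads']:
    print(download['bucket'], '->', download['output_filename'])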
@hammer
hammer / mrtarget.data.19.06.yml
Last active July 7, 2019 19:46
Configuration organized by ES index
# relation-data
ddr:
  evidence-count: 3
  score-threshold: 0.1

# association-data
scoring_weights:
  crisp: 1
  europepmc: 0.2
  expression_atlas: 0.2
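As a toy illustration of how per-datasource weights like these might be applied (this is not mrtarget's implementation; Open Targets' actual association scoring is more involved):

# Hypothetical sketch: weights scale evidence scores before a harmonic-sum
# style aggregation; names mirror the snippet above
scoring_weights = {'crisp': 1.0, 'europepmc': 0.2, 'expression_atlas': 0.2}

def weighted_harmonic_sum(scores_by_source):
    # Sort weighted scores descending; damp the i-th term by 1/(i+1)^2
    weighted = sorted((scoring_weights.get(src, 1.0) * s
                       for src, s in scores_by_source), reverse=True)
    return sum(s / (i + 1) ** 2 for i, s in enumerate(weighted))

print(weighted_harmonic_sum([('europepmc', 0.9), ('crisp', 0.8)]))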
@hammer
hammer / Dockerfile
Last active February 24, 2018 20:39 — forked from flying-sheep/Dockerfile
scanpy-locale-setup
FROM ubuntu:17.10
ENV LANG C.UTF-8
RUN apt-get update && \
    apt-get install -y python3 python3-pip libxml2-dev zlib1g-dev wget git cmake && \
    apt-get clean
RUN apt-get install -y python3-numpy # not necessary in scanpy 0.2.9.2/0.2.10
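The ENV LANG C.UTF-8 line is the point of this fork: presumably scanpy's installation fails under a non-UTF-8 locale. A quick, illustrative way to verify the locale inside the container (not part of the gist):

# Illustrative check only: confirm the container sees a UTF-8 locale
import locale
import sys
print(sys.getdefaultencoding())       # 'utf-8' on Python 3
print(locale.getpreferredencoding())  # expect 'UTF-8' when LANG=C.UTF-8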
@hammer
hammer / cosmic_sigs.R
Last active January 9, 2017 21:19
Exploring COSMIC signatures
# deconstructSigs signatures.cosmic
library("deconstructSigs")
library("tibble")
signatures.cosmic.tidy <- tibble::rownames_to_column(as.data.frame(t(signatures.cosmic)))
# COSMIC http://cancer.sanger.ac.uk/cancergenome/assets/signatures_probabilities.txt
cosmic.current <- read.delim("http://cancer.sanger.ac.uk/cancergenome/assets/signatures_probabilities.txt", check.names = FALSE)
cosmic.current.clean <- cosmic.current[, colnames(cosmic.current) != ""]
# R has no "not in" operator; negate %in% instead
cosmic.current.tidy <- cosmic.current.clean[, !(names(cosmic.current.clean) %in% c("Substitution Type", "Trinucleotide"))]
@hammer
hammer / seqs2bed.py
Created January 11, 2016 22:20
Convert DeepSEA training sequences to a BED file
import h5py
# HDF5 file with two arrays: 'trainxdata' (samples) and 'traindata' (labels)
INFILE_SAMPLES = ''
INFILE_REFERENCE_FASTA = ''
OUTFILE_FASTA = 'deepsea_train10k.fa'
OUTFILE_BED = 'deepsea_train10k.bed'
def onehot2base(onehot):
    # Assumed channel order A/G/C/T; the preview truncated after the first branch
    bases = {(1, 0, 0, 0): 'A', (0, 1, 0, 0): 'G', (0, 0, 1, 0): 'C', (0, 0, 0, 1): 'T'}
    return bases.get(tuple(onehot), 'N')
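A toy usage of onehot2base on a hand-made sample; the real gist reads samples from the HDF5 arrays named above, whose axis layout isn't shown in the preview.

sample = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
seq = ''.join(onehot2base(v) for v in sample)
print(seq)  # 'ACGT' under the assumed A/G/C/T channel order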
$ bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 4
10/06/11 03:29:15 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
10/06/11 03:29:15 INFO zookeeper.ZooKeeper: Client environment:host.name=172.28.172.2
10/06/11 03:29:15 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_17
10/06/11 03:29:15 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Apple Inc.
10/06/11 03:29:15 INFO zookeeper.ZooKeeper: Client environment:java.home=/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
10/06/11 03:29:15 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/Users/hammer/codebox/hadoop-0.20.2+228/bin/../conf:/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home/lib/tools.jar:/Users/hammer/codebox/hadoop-0.20.2+228/bin/..:/Users/hammer/codebox/hadoop-0.20.2+228/bin/../hadoop-0.20.2+228-core.jar:/Users/hammer/codebox/hadoop-0.20.2+228/bin/../lib/commons-cli-1.2.jar:/Users/hammer/codebox/hadoo
$ ./bin/start-hbase.sh
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfTool
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfTool
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:315)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
at java.lang.ClassLoader.loadClass(ClassLoader.java:250)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:398)
@hammer
hammer / pysoftlayer.py
Created December 18, 2008 09:07
A more usable version of my Python wrapper for the SoftLayer API
"""
A simple Python wrapper for the SoftLayer API as exposed via XML-RPC
"""
import xmlrpclib
# TODO: Make this object-based for services, data types, and methods instead of dictionary-based
# TODO: Read authentication information from a secure location
# TODO: Demonstrate how to call a method that requires a parameter
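Per the last TODO, a rough sketch of calling a method that requires a parameter; the endpoint, service name, and credential handling are assumptions based on the SoftLayer XML-RPC conventions of that era, with credentials inlined only for illustration.

import xmlrpclib

# Hypothetical usage sketch: SoftLayer's XML-RPC API takes a struct of headers
# (including authentication) as the first positional argument of every call
proxy = xmlrpclib.ServerProxy('https://api.softlayer.com/xmlrpc/v3/SoftLayer_Account')
params = {
    'headers': {
        'authenticate': {'username': 'SL_USERNAME', 'apiKey': 'SL_API_KEY'},
    }
}
print(proxy.getObject(params))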