geoHeil
@geoHeil
geoHeil / gist:f52bb118303157cafd77e0c49db3de71
Created April 18, 2016 10:42
Install sbt and Scala on the Ubuntu-based phusion base image
FROM phusion/baseimage
MAINTAINER name
# Install basic packages; steps are chained with && so a failed step aborts the build
RUN \
apt-get update && apt-get upgrade -y -qq && \
apt-get install -y -qq wget curl && \
apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
@geoHeil
geoHeil / gist:2feb74f303b0cd97cb7a42918efc90c3
Created May 1, 2016 18:52
spark-jobserver 2.11 exception
LOG_DIR empty; logging will go to /tmp/job-server
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
[2016-05-01 18:35:28,375] INFO spark.jobserver.JobServer$ [] [] - Starting JobServer with config {
# system properties
"app" : {
# system properties
"name" : "spark.jobserver.JobServer"
},
# merge of /app/docker.conf: 39,application.conf: 101
# universal context configuration. These settings can be overridden, see README.md
@geoHeil
geoHeil / spark exception
Created May 11, 2016 11:12
sbt run spark exception
16/05/11 13:01:28 INFO DAGScheduler: ResultStage 29 (map at DrilldownArtist.scala:337) finished in 2,172 s
16/05/11 13:01:28 INFO DAGScheduler: Job 13 finished: map at DrilldownArtist.scala:337, took 38,416625 s
[error] (run-main-0) org.apache.spark.SparkException: Job aborted due to stage failure: Task 697 in stage 1.0 failed 1 times, most recent failure: Lost task 697.0 in stage 1.0 (TID 705, localhost): ExecutorLostFailure (executor driver exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 156451 ms
[error] Driver stacktrace:
16/05/11 13:01:28 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 12 is 14186 bytes
org.apache.spark.SparkException: Job aborted due to stage failure: Task 697 in stage 1.0 failed 1 times, most recent failure: Lost task 697.0 in stage 1.0 (TID 705, localhost): ExecutorLostFailure (executor driver exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 156451 ms
Driver stacktrace:
at org.apache.sp
@geoHeil
geoHeil / gist:943a18d43279762ad4cdfe9aa2e40770
Last active May 21, 2016 09:32
Very strange Scala error with Future. Description: http://stackoverflow.com/questions/37352685/scala-future-null-pointer-match-error. When run via sbt console and executed manually it works fine; when run via sbt run I get a NullPointerException.
package jobs.outlier
import java.sql.Timestamp
import java.text.SimpleDateFormat
import play.api.libs.json._
import play.api.libs.ws.WSResponse
import play.api.libs.ws.ning.NingWSClient
import scala.concurrent.duration._
@geoHeil
geoHeil / someAwesomeSchema.xsd
Last active June 2, 2016 16:48
someAwesomeSchema.xsd
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xsd:schema version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" vc:minVersion="1.1"
targetNamespace="http://some/awesome/schema"
xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">
<xsd:element name="myElem" type="myElem"/>
<xsd:complexType name="myElem">
<xsd:sequence>
<xsd:element name="geburtsdatum" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
@geoHeil
geoHeil / gist:48cd94bf4d748fd8fd6d25de4c272157
Last active July 5, 2016 10:23
fail to execute test --> docker run -it --privileged --rm --net=host -v /dev/shm:/dev/shm thisImage:latest bash
FROM jenkinsci/jenkins:2.11
MAINTAINER geoHeil
USER root
RUN \
apt-get update; apt-get upgrade -y -qq; \
apt-get install -y -qq wget; \
apt-get install -y -qq git; \
apt-get install -y -qq tar; \
@geoHeil
geoHeil / Dockerfile
Created October 16, 2016 16:04
Prediction.IO v0.10.0-incubating
FROM openjdk:8-jdk
MAINTAINER geoHeil
# Environment variables
ENV PIO_VERSION 0.10.0-incubating
ENV SPARK_VERSION 1.6.2
ENV ELASTICSEARCH_VERSION 1.7.5
ENV HBASE_VERSION 1.0.3
# Base paths
package foo
import java.sql.Date
import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.sql.expressions.WindowSpec
import org.apache.spark.sql.functions._
import org.apache.spark.sql.{Column, SparkSession}
@geoHeil
geoHeil / minimalExample.scala
Created November 22, 2016 07:06
spark memory problem
package at.ac.tuwien.thesis.problem
import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
case class FooBar(city: String, postcode: String)
object Foo extends App {
@geoHeil
geoHeil / nearestHoliday.scala
Last active November 24, 2016 08:05
find nearest holiday +- as separate columns
import java.sql.Date
import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
object Foo extends App {