Grega Kespret gregakespret

## ProbabilityHelpers.scala
package com.celtra.experimentation

import org.apache.commons.math3.distribution.{BetaDistribution,NormalDistribution}
import org.apache.commons.math3.random.Well19937c

object ProbabilityHelpers {
  val PRIOR_ALPHA = 1
  val PRIOR_BETA = 1
  val DEFAULTNSIM = 10000

## sublime-text-macos-context-menu.md

      
Created December 30, 2017 05:30, forked from idleberg/sublime-text-macos-context-menu.md

“Open in Sublime Text” in macOS context-menu

    Open in Sublime Text


1. Open Automator
2. Create a new Service
3. Set “Service receives selected” to files or folders in any application
4. Add a Run Shell Script action
5. Set the script action to `/Applications/Sublime\ Text.app/Contents/SharedSupport/bin/subl -n "$@"` (the quotes keep paths that contain spaces intact)
6. Set “Pass input” to as arguments
7. Save as Open in Sublime Text


## Celtra Data Engineering Challenge.md

      
Last active September 26, 2018 05:51

Celtra Data Engineering Challenge

    Celtra Data Engineer Challenge

First of all, thank you for taking the time to do this challenge. There are many possible ways to solve this task. Your solution will help us gain insight into how you think, how much you care about technical aspects of software development and deployment, what architectural decisions you make, what standards of quality you adhere to and what tools and technologies you like to use and how you use them. Hopefully, we may be able to learn something from you, as well :)
Description

You are creating a low-latency reporting service that lets you generate ad hoc reports. The primary use case for this service is a user-facing dashboard.

The service should support drill-downs and roll-ups

- on dimensions: date, campaignId, campaignName, adId, adName and
- on metrics: impressions, clicks, interactions, swipes, pinches, touches, uniqueUsers
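
As a rough illustration of what a roll-up over these dimensions could look like, here is a minimal in-memory sketch. The `Row` and `Totals` case classes, the `rollUp` helper, and the sample values are all hypothetical and not part of the challenge; a real low-latency service would more likely push this aggregation into a database or a pre-aggregated store.

```scala
object RollupSketch {
  // Hypothetical row shape using a subset of the dimensions and metrics listed above.
  final case class Row(date: String, campaignId: Int, adId: Int, impressions: Long, clicks: Long)

  // Summed additive metrics for one group.
  final case class Totals(impressions: Long, clicks: Long)

  // Group rows by an arbitrary dimension key and sum the additive metrics per group.
  def rollUp[K](rows: Seq[Row])(key: Row => K): Map[K, Totals] =
    rows.groupBy(key).map { case (k, group) =>
      k -> Totals(group.map(_.impressions).sum, group.map(_.clicks).sum)
    }

  // Made-up sample data, for illustration only.
  val rows: Seq[Row] = Seq(
    Row("2018-09-01", 1, 10, 100L, 5L),
    Row("2018-09-01", 1, 11, 50L, 2L),
    Row("2018-09-02", 2, 20, 70L, 1L)
  )

  // Roll-up: one total per campaign.
  val byCampaign: Map[Int, Totals] = rollUp(rows)(_.campaignId)

  // Drill-down: split each campaign by ad.
  val byCampaignAndAd: Map[(Int, Int), Totals] = rollUp(rows)(r => (r.campaignId, r.adId))
}
```

Drilling down is the same operation with a finer key. A metric such as uniqueUsers is a distinct count and cannot simply be summed when rolling up, which is why it is left out of this sketch.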


## gist:9483f93f7e473bccee14e45c17e86d50

      
Created November 22, 2016 16:17

Celtra single-page HTML5 app assignment

    Celtra Programming Assignment

First of all, we wish to thank you for taking the time to do this assignment.
There are many possible ways to develop this rather simple application. Your solution will help us gain insight into how you think, what architectural decisions you make, what standards of quality you adhere to and what tools and technologies you like to use and how you use them. Hopefully, we may be able to learn something from you, as well :)
As you will notice, not every detail is clearly defined. You have the freedom to make your own choices where you see fit. But you can also ask questions, of course: feel free to e-mail the one who gave you the task.
Please e-mail your solution to Gregor within one week. After we have reviewed it, we will invite you over for a chat.

  
## gist:e2bfd4eccaf60c1d9c3d

      
Last active June 27, 2018 10:09

Data Scientist Assignment

    Celtra Data Scientist Assignment

First of all, thank you for taking the time to do this assignment.
There are many possible ways to solve this data problem. Your solution will help us gain insight into how you think, what tools and technologies you like to use and how you use them. Hopefully, we may be able to learn something from you, as well :)
As you will notice, not every detail is clearly defined. You have the freedom to make your own choices where you see fit. But you can also ask questions, of course.
Please e-mail your solution to the person who gave it to you within the agreed time.

  
## gist:570998fccd6ca6e24ad4
import java.io.File
import java.nio.charset.Charset

import com.tdunning.math.stats.{ArrayDigest, TDigest}
import scala.collection.JavaConversions._ // needed for java Collection -> scala Seq
import scala.io.Source
import com.google.common.io.Files

object Histogram extends App {
  val distribution: TDigest = TDigest.createArrayDigest(35, 1000)

## gist:95e74f28551edc8a90c6
import com.fasterxml.jackson.annotation.JsonSubTypes.Type
import com.fasterxml.jackson.annotation.{JsonTypeName, JsonSubTypes, JsonTypeInfo, JsonProperty}
import com.fasterxml.jackson.databind.{ObjectMapper,DeserializationFeature}
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper

@JsonTypeInfo(
  use = JsonTypeInfo.Id.NAME,
  include = JsonTypeInfo.As.PROPERTY,
  property = "clazz",

## gist:813b540faca678413ad4
14/05/21 21:44:45 ERROR SparkHadoopWriter: Error committing the output of task: attempt_201405212144_0000_m_000000_3432
java.io.IOException: Failed to save output of task: attempt_201405212144_0000_m_000000_3432
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:160)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
        at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
        at org.apache.hadoop.mapred.SparkHadoopWriter.commit(SparkHadoopWriter.scala:110)
        at org.apache.spark.rdd.PairRDDFunctions.org$apache$spark$rdd$PairRDDFunctions$$writeToFile$1(PairRDDFunctions.scala:731)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$2.apply(PairRDDFunctions.scala:734)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$2.apply(PairRDDFunctions.scala:734)
        at org.apache.spark.scheduler.ResultTask.runTask(Result

## Cyclic reference immutable objects.scala
package org.example

abstract class Person(val name: String)

// Cannot be a case class: case class parameters are vals, and a val parameter cannot be by-name (=> T), so it could not be instantiated lazily
class Girl(val name2: String, _boyfriend: => Boy) extends Person(name2) {
    lazy val boyfriend = _boyfriend
}

class Boy(val name2: String, _girlfriend: => Girl) extends Person(name2) {
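
The preview above is cut off, so the following is only a self-contained sketch of the same idea: a by-name constructor parameter combined with a `lazy val` lets two immutable objects refer to each other. The names `Node`, `partner`, `a`, and `b` are hypothetical and do not come from the gist.

```scala
object CyclicSketch {
  // The by-name parameter (=> Node) is not evaluated at construction time;
  // the lazy val evaluates it once, on first access, and caches the result.
  class Node(val name: String, _partner: => Node) {
    lazy val partner: Node = _partner
  }

  // Each definition mentions the other, but neither argument is evaluated
  // until `partner` is first read, so construction does not recurse forever.
  lazy val a: Node = new Node("a", b)
  lazy val b: Node = new Node("b", a)
}
```

Following the reference twice leads back to the starting object, so `CyclicSketch.a.partner.partner` is the same instance as `CyclicSketch.a`.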

## gist:7874908
Connected to jdbc:vertica://vertica7.aws.celtra-test.com:5433/aws7 (DirectBatchInsert: false)
13/12/09 15:45:36,704 INFO slf4j.Slf4jEventHandler: Slf4jEventHandler started
13/12/09 15:45:36,888 INFO spark.SparkEnv: Registering BlockManagerMaster
13/12/09 15:45:36,925 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20131209154536-5135
13/12/09 15:45:36,933 INFO storage.MemoryStore: MemoryStore started with capacity 2.2 GB.
13/12/09 15:45:36,969 INFO network.ConnectionManager: Bound socket to port 45383 with id = ConnectionManagerId(ip-10-170-8-11.ec2.internal,45383)
13/12/09 15:45:36,977 INFO storage.BlockManagerMaster: Trying to register BlockManager
13/12/09 15:45:36,988 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager ip-10-170-8-11.ec2.internal:45383 with 2.2 GB RAM
13/12/09 15:45:36,989 INFO storage.BlockManagerMaster: Registered BlockManager
13/12/09 15:45:37,078 INFO server.Server: jetty-7.x.y-SNAPSHOT