
Jonas Jarutis jarutis

* PhD
Functional analysis ++
Statistics ++
Econometrics
Writing research papers
Teaching
* Data Analysis
Data wrangling ++++++++
Regression analysis ++++
@jarutis
jarutis / udadist-tests.cc
Last active August 29, 2015 14:08
Failed Impala UDAF
#include <iostream>
#include <math.h>
#include <impala_udf/uda-test-harness.h>
#include "uda-dist.h"
using namespace impala;
using namespace impala_udf;
using namespace std;
jarutis / uda-dist-test.cc
Created November 10, 2014 06:09
First successful Impala UDA!
#include <iostream>
#include <math.h>
#include <impala_udf/uda-test-harness.h>
#include "uda-dist.h"
using namespace impala;
using namespace impala_udf;
using namespace std;
{
"metadata": {
"kernelspec": {
"codemirror_mode": {
"name": "python",
"singleOperators": {},
"version": 2
},
"display_name": "IPython (Python 2)",
"language": "python",
install.packages('rJava', type='source')
# Set up RMySQL to use Percona Server
Sys.setenv(PKG_CPPFLAGS = "-I/usr/local/Cellar/percona-server/5.6.22-72.0/include/mysql/")
Sys.setenv(PKG_LIBS = "-L/usr/local/Cellar/percona-server/5.6.22-72.0/lib -lperconaserverclient")
install.packages('RMySQL', type='source')
# Set up the connection to Hadoop
install.packages('RImpala')
install.packages('RJDBC')
import com.twitter.scalding.typed._
case class NextEvent(id: String, time: Long)
val events1 = List(
NextEvent("user1", 1421280000498L),
NextEvent("user1", 1421280000769L),
NextEvent("user2", 1421280000819L)
)
./bin/spark-shell --master yarn-client --num-executors 50 --driver-memory 7g --executor-memory 7g --executor-cores 1 --jars /home/jjarutis/ini/joda-time-2.7/joda-time-2.7.jar,/home/jjarutis/ini/joda-time-2.7/joda-convert-1.7.jar,/home/jjarutis/math_2.10-0.1.0.jar
import org.joda.time.{LocalDate, Days}
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.Row
import org.apache.spark.rdd.RDD
import vinted.math.{RolingSum, TotalSum}
// Helpers
val context = new org.apache.spark.sql.SQLContext(sc)
jarutis / pres.org
Created May 10, 2015 20:47
Presentation

Possibilities for applying FDA on the Vinted platform

from 'batch_views.items i'
period_field 'SUBSTRING(i.created_at, 1, 10)'
period_type Report::DatePeriod
dimensions portal: 'i.country',
platform: 'i.app_id',
price_group: 'i.price_groups',
detailed_price_group: 'i.price_groups_detailed',
paid_organic: 'u.paid_organic',
jarutis / Word2wec.ipynb
Created July 10, 2015 22:59
Word2vec error on Spark