@haitaoyao
haitaoyao / start.erl
Created August 23, 2012 02:25
Start an Erlang application and its dependencies
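%% Start App as a permanent application, starting any missing dependency first.
start(App) ->
    start_ok(App, application:start(App, permanent)).

start_ok(_App, ok) ->
    ok;
start_ok(_App, {error, {already_started, _App}}) ->
    ok;
start_ok(App, {error, {not_started, Dep}}) ->
    %% A dependency is not running yet: start it, then retry App.
    ok = start(Dep),
    start(App);
%% Assumed final clause (the preview cuts off after the clause above):
%% fail loudly on any other error.
start_ok(App, {error, Reason}) ->
    erlang:error({app_start_failed, App, Reason}).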
haitaoyao / erl_release
Created August 25, 2012 08:13
Get the Erlang/OTP release from the shell
erl -eval 'erlang:display(erlang:system_info(otp_release)), halt().' -noshell
haitaoyao / tools_pom
Created January 19, 2013 08:19
Hold the tools.jar file location in different profile constants
<profiles>
  <profile>
    <id>windows_profile</id>
    <activation>
      <os>
        <family>Windows</family>
      </os>
    </activation>
    <properties>
haitaoyao / ThreadPoolExecutorTest
Created March 14, 2013 03:33
Test how ThreadPoolExecutor behaves differently with a SynchronousQueue versus an ArrayBlockingQueue.
import static org.junit.Assert.assertEquals;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import org.junit.Test;
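
The gist body is cut off after its imports, so the following is only a minimal sketch of the difference the description refers to, not the author's test. The class name QueueBehaviorSketch, the helper observe, and the pool sizes (core 1, max 2) are assumptions chosen for illustration. Every task blocks on a latch, so the pool's state after submission is stable when it is read.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueBehaviorSketch {

    /**
     * Submits `tasks` blocking tasks to a pool with corePoolSize 1 and
     * maximumPoolSize 2 (assumed sizes) and returns
     * {poolSize, queuedTasks, rejectedTasks}.
     */
    static int[] observe(BlockingQueue<Runnable> queue, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 2, 1, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // keep the worker busy until released
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // default AbortPolicy: no free thread and no queue slot
                }
            }
            return new int[] {pool.getPoolSize(), pool.getQueue().size(), rejected};
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        // SynchronousQueue holds nothing: the pool grows to max, then rejects.
        int[] sync = observe(new SynchronousQueue<>(), 3);
        System.out.println("SynchronousQueue:   threads=" + sync[0]
                + " queued=" + sync[1] + " rejected=" + sync[2]); // threads=2 queued=0 rejected=1

        // A bounded queue fills up first: the pool stays at core size.
        int[] bounded = observe(new ArrayBlockingQueue<>(2), 3);
        System.out.println("ArrayBlockingQueue: threads=" + bounded[0]
                + " queued=" + bounded[1] + " rejected=" + bounded[2]); // threads=1 queued=2 rejected=0
    }
}
```

With a SynchronousQueue a hand-off only succeeds when an idle worker is waiting, so extra tasks force new threads immediately. With an ArrayBlockingQueue the executor adds threads beyond the core size only once the queue is full.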

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
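
Some of these figures can be sanity-checked with simple arithmetic. The sketch below is a rough illustration, assuming decimal units (2K = 2,000 bytes, 1 MB = 1,000,000 bytes); the class and method names are hypothetical. It shows that the raw wire time for 2K bytes at 1 Gbps is 16,000 ns, which the table rounds to 20,000 ns, and that 1 MB in 250,000 ns implies about 4,000 MB/s.

```java
public class LatencySketch {

    /** Nanoseconds needed to push `bytes` through a link of `bitsPerSecond`. */
    static double wireTimeNs(long bytes, double bitsPerSecond) {
        return bytes * 8 * 1e9 / bitsPerSecond;
    }

    /** Throughput in MB/s (decimal megabytes) implied by moving `bytes` in `nanos`. */
    static double throughputMBps(long bytes, double nanos) {
        // (bytes / 1e6 MB) / (nanos / 1e9 s) simplifies to bytes * 1e3 / nanos
        return bytes * 1e3 / nanos;
    }

    /** How many times slower one latency is than another. */
    static double ratio(double slowerNs, double fasterNs) {
        return slowerNs / fasterNs;
    }

    public static void main(String[] args) {
        System.out.printf("2K bytes over 1 Gbps: %.0f ns%n", wireTimeNs(2_000, 1e9));
        System.out.printf("1 MB in 250,000 ns:   %.0f MB/s%n", throughputMBps(1_000_000, 250_000));
        System.out.printf("Main memory vs L1:    %.0fx%n", ratio(100, 0.5));
    }
}
```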

haitaoyao / HTTPTimeoutSuite.scala
Last active December 29, 2015 10:18
Add a read timeout to HTTP requests so that Spark HTTP broadcast does not hang
import org.scalatest.{Assertions, FunSuite}
import org.eclipse.jetty.server.handler._
import org.eclipse.jetty.server._
import org.eclipse.jetty.server.bio.SocketConnector
import javax.servlet.http.{HttpServletResponse, HttpServletRequest}
import java.util.concurrent.{Executors, TimeUnit}
import java.net.{URLConnection, URL}
import org.apache.commons.io.IOUtils
import java.io.ByteArrayOutputStream
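
The Scala suite itself is cut off after its imports. As a rough Java illustration of the idea in the description (not the gist's code), the sketch below sets connect and read timeouts on a java.net.URLConnection so that a server that accepts a connection but never answers produces a SocketTimeoutException instead of a hang. The class name, the fetch helper, and the silent local server are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

public class ReadTimeoutSketch {

    /** Fetches `url`, failing with SocketTimeoutException rather than blocking forever. */
    static byte[] fetch(URL url, int connectTimeoutMs, int readTimeoutMs) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // limit on establishing the TCP connection
        conn.setReadTimeout(readTimeoutMs);       // limit on each blocking read of the response
        try (InputStream in = conn.getInputStream()) {
            return in.readAllBytes();
        }
    }

    /** Returns true if a fetch against a server that never responds times out. */
    static boolean timesOutAgainstSilentServer(int readTimeoutMs) {
        // The socket listens but never calls accept(), so the client connects
        // through the backlog and then waits for a response that never comes.
        try (ServerSocket silent = new ServerSocket(0)) {
            URL url = URI.create("http://127.0.0.1:" + silent.getLocalPort() + "/").toURL();
            fetch(url, 1_000, readTimeoutMs);
            return false;
        } catch (SocketTimeoutException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + timesOutAgainstSilentServer(200));
    }
}
```

Without the setReadTimeout call the default read timeout is 0, which means wait indefinitely; that is the kind of hang the description is guarding against.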
haitaoyao / install_sublime_sql_format
Created August 5, 2014 02:20
Install the Sublime Text 3 FormatSQL plugin
Add the repository https://github.com/freewizard/SublimeFormatSQL,
then install the sublimeformatsql package.
For a detailed walkthrough of adding a repository, see http://www.macdrifter.com/2012/08/insta ... ithub.html
Use Command+K, then Command+S to format the selected code.
haitaoyao / gist:f5984eee85f3724bed50
Last active August 29, 2015 14:23
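The Eve of Spark Summit 2015

This year is my second time attending the Bay Area Spark Summit. I'm getting over jet lag right now, so here are a few notes.

Last year in review

At last year's Spark Summit, what I paid most attention to were Intel's StreamSQL and Databricks Cloud.

Overview

  • Built by Intel
  • Write SQL directly, which simplifies developing real-time computation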
haitaoyao / gist:654338075c16cd502b13
Last active August 29, 2015 14:23
Data Pipeline Scheduler [0]

Data Pipeline Scheduler [0]

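  • Jet-lagged on the eve of Spark Summit 2015, so the rambling continues

A good beginning

Stories always start out well. A freshly launched computation job usually looks about this simple:

  1. Fetch log data / DB data
  2. Do some simple ETL
  3. Compute the reports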
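
The three steps above can be sketched as a tiny in-memory pipeline. Everything here is hypothetical and only meant to make the shape concrete: the "user,action" record format, the sample data, and the method names fetch, etl, and report are assumptions, not part of the original notes.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class SimplePipelineSketch {

    /** Step 1: stand-in for reading log lines or DB rows (hard-coded sample data). */
    static List<String> fetch() {
        return List.of("alice,click", "bob,view", "", "alice,view", "malformed");
    }

    /** Step 2: simple ETL. Drop lines that are not "user,action" and keep the action. */
    static List<String> etl(List<String> raw) {
        return raw.stream()
                .map(line -> line.split(","))
                .filter(parts -> parts.length == 2)
                .map(parts -> parts[1].trim())
                .collect(Collectors.toList());
    }

    /** Step 3: the report, a count of events per action, sorted by action name. */
    static Map<String, Long> report(List<String> actions) {
        return actions.stream()
                .collect(Collectors.groupingBy(a -> a, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(report(etl(fetch()))); // {click=1, view=2}
    }
}
```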
haitaoyao / gist:0862e1ce6060080afafb
Last active August 29, 2015 14:23
Spark Summit 2015 - Day 1

Spark Summit 2015 Day 1

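Today was the first day of Spark Summit 2015. Overall impression: a sea of people, with data everywhere (it was crowded, and everyone was talking about data this and data that). The full schedule is at https://spark-summit.org/2015/schedule/. I mainly attended the Developer Track. Here are some notes on what I saw and heard.

Morning: ads and more ads, from the organizers and the sponsors

Conferences like this basically always open with advertising. What interested me most were the cloud product released by Databricks and Timeful's talk, A Tale of a Data-Driven Culture.

Databricks takes the stage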