Shixiong Zhu zsxwing

  • Databricks, Inc.
  • San Francisco
/**
* Copyright 2013 Netflix, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
@zsxwing
zsxwing / test.scala
Created August 14, 2014 14:44
This example reports that Foo cannot be serialized.
scala> class Foo { def foo() = Array(1.0) }
defined class Foo
scala> val t = new Foo
t: Foo = $iwC$$iwC$$iwC$$iwC$Foo@5ef6a5b6
scala> val m = t.foo
m: Array[Double] = Array(1.0)
scala> val r1 = sc.parallelize(List(1, 2, 3))
@zsxwing
zsxwing / test2.scala
Last active August 29, 2015 14:05
This example works.
scala> class Foo { def foo() = Array(1.0) }
defined class Foo
scala> var m: Array[Double] = null
m: Array[Double] = null
scala> {
| val t = new Foo
| m = t.foo
| }
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.scheduler.*;
import scala.Tuple2;
final JavaStreamingContext jssc = new JavaStreamingContext(...);
final Time exitTime = new Time(12345L); // Need to set the correct exit time
jssc.addStreamingListener(new StreamingListener(){
@Override
public void onBatchCompleted(StreamingListenerBatchCompleted batchCompleted) {
if (batchCompleted.batchInfo().batchTime().greaterEq(exitTime)) {
new Thread() {
@Override
public void run() {
jssc.stop(true, true);
@zsxwing
zsxwing / hbase-spark.scala
Created August 11, 2014 08:29
hbase-spark.scala
import java.io.{DataOutputStream, ByteArrayOutputStream}
import java.lang.String
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Base64
def convertScanToString(scan: Scan): String = {
@zsxwing
zsxwing / argument.sh
Last active December 15, 2015 21:58
Set $1, $2, $3, ..., and $# in the current terminal. This avoids opening a new file when you just want to test a small function.
set -- a b c
echo "$1 $2 $3"
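A small sketch of how `set --` can be used to try out a function interactively. The `show_args` helper is hypothetical and exists only for this illustration; the rest is standard shell behavior.

```shell
# Hypothetical helper, used only to show what a function receives.
show_args() {
  echo "count=$# first=$1 last=${3:-none}"
}

# Replace the positional parameters of the current shell.
set -- a b c
echo "$# args: $1 $2 $3"    # prints "3 args: a b c"

# "$@" forwards the parameters unchanged, one word per parameter.
show_args "$@"              # prints "count=3 first=a last=c"

# Calling set -- with no arguments clears the positional parameters.
set --
echo "$# args"              # prints "0 args"
```

The `--` matters: it stops option parsing, so values that begin with `-` are treated as parameters rather than as options to `set`.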
@zsxwing
zsxwing / iter_lines.sh
Last active December 15, 2015 21:58
How to correctly iterate over the lines of a file or the lines generated by a command.
# IFS= preserves leading and trailing whitespace; -r keeps backslashes literal.
while IFS= read -r line; do
echo "$line"
done < <(cat test.txt)
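One reason to prefer `done < <(command)` over `command | while ...` is that, in bash, the loop then runs in the current shell, so variables set inside it are still visible afterwards. The sketch below assumes bash (process substitution is not POSIX sh); `count_nonempty` is a hypothetical helper introduced for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: runs the command given as arguments and counts
# the non-empty lines it prints.
count_nonempty() {
  local count=0 line
  # IFS= preserves surrounding whitespace; -r keeps backslashes literal.
  while IFS= read -r line; do
    if [ -n "$line" ]; then
      count=$((count + 1))
    fi
  done < <("$@")
  # The loop ran in this shell, so the updated count is still available here.
  echo "$count"
}

count_nonempty printf 'a\n\nb c\n'    # prints 2
```

With `printf ... | while read ...` instead, bash would by default run the loop in a subshell and `count` would still be 0 after the loop.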
@zsxwing
zsxwing / error_log.sh
Last active December 15, 2015 21:58
An error function that writes a message with a timestamp to stderr.
err() {
# "$*" joins all arguments into one string inside the quotes.
echo "[$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*" >&2
}
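A possible way to use such a function, as a sketch: `err` is repeated here so the example is self-contained, and `require_file` is a hypothetical caller that is not part of the original gist.

```shell
# Timestamped error output on stderr.
err() {
  echo "[$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*" >&2
}

# Hypothetical caller: report a missing file and signal failure.
require_file() {
  if [ ! -f "$1" ]; then
    err "missing file: $1"
    return 1
  fi
}

# The message goes to stderr, so stdout stays clean for real output.
require_file /nonexistent/example || echo "check failed"
```

Because the message is written to stderr, redirecting stdout (for example `script.sh > out.txt`) still shows errors on the terminal.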
@zsxwing
zsxwing / long_string.sh
Last active December 15, 2015 21:58
How to create a long string in the bash shell.
# output "I am an exceptionally long string."
long_string="I am an exceptionally
long string."
echo ${long_string}
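The snippet above prints a single line only because `${long_string}` is unquoted: the shell splits the value on whitespace, including the embedded newline, and `echo` joins the words with single spaces. A short sketch of the difference follows; the `one_line` variable is an illustrative alternative, not part of the original gist.

```shell
long_string="I am an exceptionally
long string."

# Unquoted: word splitting drops the newline; echo rejoins with spaces.
echo ${long_string}      # prints "I am an exceptionally long string."

# Quoted: the embedded newline is preserved, so two lines are printed.
echo "${long_string}"

# Alternative that does not rely on word splitting: build the value in pieces.
one_line="I am an exceptionally "
one_line="${one_line}long string."
echo "${one_line}"       # prints "I am an exceptionally long string."
```

The unquoted form also collapses runs of spaces and expands glob characters such as `*`, so the piecewise form is safer when the text is not fully under your control.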