Joey Echeverria joey

joey / spark-task.log
Created October 11, 2016 16:50
ClassNotFound when trying to deserialize checkpoint state
stateMap (org.apache.spark.streaming.rdd.MapWithStateRDDRecord)
at com.esotericsoftware.kryo.serializers.JavaSerializer.read(JavaSerializer.java:42)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:181)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.streaming.rdd.MapWithStateRDD.compute(MapWithStateRDD.scala:153)
joey / spark-task.log
Created October 10, 2016 18:27
Serialization error with mapWithState
Serialization trace:
stateMap (org.apache.spark.streaming.rdd.MapWithStateRDDRecord)
at com.esotericsoftware.kryo.serializers.JavaSerializer.write(JavaSerializer.java:34)
at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:501)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:194)
at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:153)
at org.apache.spark.rdd.ReliableCheckpointRDD$$anonfun$writePartitionToCheckpointFile$2.apply(ReliableCheckpointRDD.scala:182)
joey / gist:cb4bba71b02f40107c6c
Created April 7, 2015 17:51
Kafka consumer behavior with a rebalance
Consumer C1 reads offsets 0-10 from partition 1
Consumer C2 reads offsets 0-10 from partition 2
C2 fails
Partition 2 is reassigned to C1
C1 reads offsets 0-10 from partition 2
C2 rejoins
Partition 2 is reassigned to C2
C2 reads offsets 0-10 from partition 2
C1 commits, creating a file with offsets 0-10 from partition 1 and 0-10 from partition 2
C2 commits, creating a file with offsets 0-10 from partition 2 <---- This introduces the duplicates
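The sequence above can be replayed with a small in-memory simulation. This is only a sketch under stated assumptions, not real Kafka client code: the `Consumer` class, `run_scenario`, and `find_duplicates` are hypothetical names, each commit is modeled as writing one "file" (a list of `(partition, offset)` pairs), and a failed consumer is assumed to lose whatever it had buffered.

```python
from collections import Counter


class Consumer:
    """Hypothetical consumer that buffers records and writes them out on commit."""

    def __init__(self, name):
        self.name = name
        self.buffer = []  # (partition, offset) pairs read but not yet committed

    def read(self, partition, first, last):
        # Offsets are inclusive, matching "offsets 0-10" in the notes.
        self.buffer.extend((partition, offset) for offset in range(first, last + 1))

    def fail(self):
        # Assumption: a crash loses the uncommitted buffer.
        self.buffer = []

    def commit(self):
        # Write everything buffered so far into one output "file".
        written, self.buffer = self.buffer, []
        return written


def run_scenario():
    c1, c2 = Consumer("C1"), Consumer("C2")
    c1.read(1, 0, 10)
    c2.read(2, 0, 10)
    c2.fail()
    c1.read(2, 0, 10)  # partition 2 was reassigned to C1
    c2.read(2, 0, 10)  # C2 rejoined and partition 2 was reassigned back
    # C1 never dropped its partition 2 records, so both commits contain them.
    return [c1.commit(), c2.commit()]


def find_duplicates(files):
    counts = Counter(record for f in files for record in f)
    return sorted(record for record, n in counts.items() if n > 1)


files = run_scenario()
duplicates = find_duplicates(files)
print(len(files[0]), len(files[1]), len(duplicates))  # prints: 22 11 11
```

In this model, every offset of partition 2 lands in both files because C1 still holds records for a partition it no longer owns when it commits.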
joey / kite-test.pig
Created August 26, 2014 14:42
Read a Kite dataset with Pig
REGISTER /home/cloudera/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.3.0/jackson-annotations-2.3.0.jar
REGISTER /home/cloudera/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.3.1/jackson-core-2.3.1.jar
REGISTER /home/cloudera/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.3.1/jackson-databind-2.3.1.jar
REGISTER /home/cloudera/.m2/repository/org/kitesdk/kite-data-core/0.16.1-SNAPSHOT/kite-data-core-0.16.1-SNAPSHOT.jar
REGISTER /home/cloudera/.m2/repository/org/kitesdk/kite-data-hbase/0.16.1-SNAPSHOT/kite-data-hbase-0.16.1-SNAPSHOT.jar
REGISTER /home/cloudera/.m2/repository/org/kitesdk/kite-data-hcatalog/0.16.1-SNAPSHOT/kite-data-hcatalog-0.16.1-SNAPSHOT.jar
REGISTER /home/cloudera/.m2/repository/org/kitesdk/kite-data-mapreduce/0.16.1-SNAPSHOT/kite-data-mapreduce-0.16.1-SNAPSHOT.jar
REGISTER /home/cloudera/.m2/repository/org/kitesdk/kite-data-pig/0.16.1-SNAPSHOT/kite-data-pig-0.16.1-SNAPSHOT.jar
REGISTER /home/cloudera/.m2/repository/org/kitesdk/kite-hadoop-compatibilit
match_names function took 94.869 ms
['a', 'ai', 'an', 'as', 'ast', 'b', 'c', 'd', 'di', 'dit', 'e', 'er', 'f', 'fu', 'g', 'h', 'i', 'id', 'ie', 'it', 'k', 'l', 'la', 'las', 'last', 'li', 'liquid', 'liquidity', 'm', 'mo', 'mor', 'morph', 'n', 'o', 'om', 'or', 'p', 'physic', 'physicomorph', 'q', 'quid', 'r', 's', 'si', 'sic', 'st', 'stanza', 'stanzaic', 'sub', 'subfusk', 't', 'ta', 'tan', 'ti', 'tie', 'tier', 'u', 'ug', 'us', 'y', 'z', 'za', 'zugtierlast']
match_names2 function took 36.213 ms
['a', 'ai', 'an', 'as', 'ast', 'b', 'c', 'd', 'di', 'dit', 'e', 'er', 'f', 'fu', 'g', 'h', 'i', 'id', 'ie', 'it', 'k', 'l', 'la', 'las', 'last', 'li', 'liquid', 'liquidity', 'm', 'mo', 'mor', 'morph', 'n', 'o', 'om', 'or', 'p', 'physic', 'physicomorph', 'q', 'quid', 'r', 's', 'si', 'sic', 'st', 'stanza', 'stanzaic', 'sub', 'subfusk', 't', 'ta', 'tan', 'ti', 'tie', 'tier', 'u', 'ug', 'us', 'y', 'z', 'za', 'zugtierlast']
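#!/usr/bin/env python
import time

def timing(f):
    """Decorator that prints how long each call to f takes, in milliseconds."""
    def wrap(*args):
        time1 = time.time()
        ret = f(*args)
        time2 = time.time()
        # print() with one argument and __name__ work on both Python 2.6+ and 3.
        print('%s function took %0.3f ms' % (f.__name__, (time2 - time1) * 1000.0))
        return ret
    return wrap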
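A minimal usage sketch for a timing decorator of this kind, written for Python 3. The names `timed` and `count_vowels` are hypothetical and do not come from the gist; `functools.wraps` and `time.perf_counter` are substitutions of mine for the plain `time.time()` approach, chosen so the wrapped function keeps its name and the measurement uses a monotonic clock.

```python
import functools
import time


def timed(f):
    """Print the wall-clock duration of each call to f, in milliseconds."""
    @functools.wraps(f)  # keep f's name and docstring on the wrapper
    def wrap(*args, **kwargs):
        start = time.perf_counter()
        ret = f(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        print('%s function took %0.3f ms' % (f.__name__, elapsed_ms))
        return ret
    return wrap


@timed
def count_vowels(text):
    # Hypothetical workload, used only to have something to time.
    return sum(1 for ch in text if ch in 'aeiou')


result = count_vowels('zugtierlast')  # prints a timing line, then returns 4
```

The decorator passes the wrapped function's return value through unchanged, so it can be added to or removed from a function without touching its callers.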
Running org.apache.accumulo.start.classloader.vfs.providers.VfsClassLoaderTest
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.763 sec <<< FAILURE! - in org.apache.accumulo.start.classloader.vfs.providers.VfsClassLoaderTest
org.apache.accumulo.start.classloader.vfs.providers.VfsClassLoaderTest Time elapsed: 8.763 sec <<< ERROR!
java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.MiniDFSCluster.getFileSystem()Lorg/apache/hadoop/hdfs/DistributedFileSystem;
at org.apache.accumulo.test.AccumuloDFSBase.miniDfsClusterSetup(AccumuloDFSBase.java:72)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)