a new messaging-based log aggregator
a distributed messaging system
horizontally scalable messaging system.
Memory Mapped Files
Kernel Space processing
# See https://docs.oracle.com/javase/8/docs/technotes/tools/windows/java.html
# See https://docs.oracle.com/javase/8/docs/technotes/guides/vm/performance-enhancements-7.html
# See https://docs.oracle.com/javase/8/embedded/develop-apps-platforms/codecache.htm
# See http://normanmaurer.me/blog_in_progress/2013/11/07/Inline-all-the-Things/
# See http://stas-blogspot.blogspot.com.br/2011/07/most-complete-list-of-xx-options-for.html
# -XX:+LogCompilation
# -XX:+PrintInlining
-Dfile.encoding=UTF-8
# Enable the Graphite sink for all instances
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=<graphite host>
*.sink.graphite.port=<graphite port>
*.sink.graphite.period=10
# Enable JvmSource for the master, worker, driver and executor instances
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
// Problem: create a Spark UDF that takes an extra parameter at invocation time.
// Solution: use currying.
// http://stackoverflow.com/questions/35546576/how-can-i-pass-extra-parameters-to-udfs-in-sparksql
// We want to create hideTabooValues, a Spark UDF that sets to -1 every field containing any of the given taboo values.
// E.g. forbiddenValues = [1, 2, 3]
// dataframe = [1, 2, 3, 4, 5, 6]
// dataframe.select(hideTabooValues(forbiddenValues)) :> [-1, -1, -1, 4, 5, 6]
//
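// The intended behaviour can be sketched on a plain Scala collection first.
// This is only an illustration under assumptions: no Spark is involved, and
// the names hideTabooValuesLocal, hideLocal and hiddenLocal are hypothetical
// helpers, not the final UDF.
// Currying idea: the outer call fixes the taboo values and returns a
// one-argument function that is then applied to each field value.
def hideTabooValuesLocal(forbiddenValues: Set[Int]): Int => Int =
  value => if (forbiddenValues.contains(value)) -1 else value

// Fix the extra parameter once, then reuse the resulting function.
val hideLocal: Int => Int = hideTabooValuesLocal(Set(1, 2, 3))
val hiddenLocal: Seq[Int] = Seq(1, 2, 3, 4, 5, 6).map(hideLocal)
//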
// Implementing this in Spark, we find two major issues: