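import org.parboiled2._

// Parses space-separated integers and sums them.
class AddingParser(val input: ParserInput) extends Parser {
  // One integer: capture a run of digits and convert it to Int.
  def num = rule {
    capture(oneOrMore(CharPredicate.Digit)) ~> (_.toInt)
  }
  // Start with 0 on the value stack and add each parsed number to it.
  def root = rule {
    push(0) ~ zeroOrMore(num ~> ((_: Int) + _)).separatedBy(" ") ~ EOI
  }
}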
@themodernlife
themodernlife / spark-bug-deprecatedparquetoutputformat-multipleoutputs.md
Created April 29, 2016 14:25
Errors using Spark, Parquet and MultipleOutputs

Using MultipleOutputs with DeprecatedParquetOutputFormat in Spark fails with the following error:

java.lang.NullPointerException
  at org.apache.hadoop.fs.Path.<init>(Path.java:105)
  at org.apache.hadoop.fs.Path.<init>(Path.java:94)
  at org.apache.parquet.hadoop.mapred.DeprecatedParquetOutputFormat.getDefaultWorkFile(DeprecatedParquetOutputFormat.java:69)
  at org.apache.parquet.hadoop.mapred.DeprecatedParquetOutputFormat.access$100(DeprecatedParquetOutputFormat.java:36)
  at org.apache.parquet.hadoop.mapred.DeprecatedParquetOutputFormat$RecordWriterWrapper.<init>(DeprecatedParquetOutputFormat.java:89)
  at org.apache.parquet.hadoop.mapred.DeprecatedParquetOutputFormat.getRecordWriter(DeprecatedParquetOutputFormat.java:77)
#!/bin/bash
#
# This script attempts to detect any ephemeral drives on an EC2 node and create a RAID-0 stripe
# mounted at /mnt. It should be run early on the first boot of the system.
#
# Beware: this script is NOT fully idempotent.
#
METADATA_URL_BASE="http://169.254.169.254/2012-01-12"
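
The header comments above describe detecting ephemeral drives through the EC2 instance metadata service. The following is a minimal sketch of that detection step only, assuming the service lists mapping names under `meta-data/block-device-mapping/` beneath the base URL shown above. The function names `filter_ephemeral` and `list_ephemeral_devices` are hypothetical and not taken from the original script, and the RAID-0 creation and mounting are not shown.

```shell
#!/bin/bash
# Sketch only: list EC2 ephemeral block-device mappings via instance metadata.
# Assumption: this runs on an EC2 node where the metadata service is reachable.
METADATA_URL_BASE="http://169.254.169.254/2012-01-12"

# Keep only mapping names that start with "ephemeral" (for example ephemeral0).
# Reads one mapping name per line on stdin; prints nothing if none match.
filter_ephemeral() {
  grep '^ephemeral' || true
}

# Hypothetical helper: print "<mapping> <device>" for each ephemeral mapping.
# The device name is printed exactly as the metadata service reports it; the
# name the kernel actually exposes may differ (for example xvdb instead of sdb).
list_ephemeral_devices() {
  local mapping device
  curl --silent --fail "$METADATA_URL_BASE/meta-data/block-device-mapping/" |
    filter_ephemeral |
    while read -r mapping; do
      device=$(curl --silent --fail \
        "$METADATA_URL_BASE/meta-data/block-device-mapping/$mapping")
      echo "$mapping $device"
    done
}
```

Calling `list_ephemeral_devices` on an instance would print one line per ephemeral mapping. Keeping the filter separate from the network calls lets the name matching be checked offline with sample input.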