@tovbinm
Last active June 29, 2017 21:33
Codegen dies (Spark 2.0.2 and 2.1.1) - no udf nesting
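// Run in spark-shell, where `spark` (a SparkSession) is already defined.
import spark.implicits._
import org.apache.spark.sql._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.execution.debug._

// Identity UDF on Int.
val u = udf((a: Int) => a)

// Single-row DataFrame with one Int column named "0".
val df = spark.sparkContext.parallelize(Seq(0)).toDF("0")

// Append 19 columns; each applies the UDF to the most recently added column.
val res = (1 until 20).foldLeft(df) { case (d, i) =>
  val inputs = d.columns.toSeq.takeRight(1).map(col(_))
  d.select(col("*"), u(inputs: _*).as(i.toString))
}

res.debugCodegen() // print the whole-stage generated code
res.debug()        // run the query with per-operator debug output
res.show(false)    // show the result without truncating values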