@github-louis-fruleux
Created September 6, 2022 13:29
Spark UDF change
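// Run in spark-shell, which provides `spark` and imports spark.implicits._ (needed for toDF and $).
import org.apache.spark.sql.types._
import org.apache.spark.sql.functions._
// Untyped UDF: only the return type is declared, so Spark cannot see that the argument is a primitive Int.
val f = udf((x: Int) => x, IntegerType)
// One null row and one non-null row.
val df = Seq(None, Some(1)).toDF("value")
df.show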
/* In both Spark versions 2.X and 3.X
+-----+
|value|
+-----+
| null|
|    1|
+-----+
*/
df.select(f($"value").as("incremented")).show
/* Spark 3 with spark.sql.legacy.allowUntypedScalaUDF=true (without this option, Spark 3 rejects the untyped udf call above)
+-----------+
|incremented|
+-----------+
|          0|
|          1|
+-----------+
*/
/* Spark 2
+-----------+
|incremented|
+-----------+
|       null|
|          1|
+-----------+
*/
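/* Sketch, not part of the original gist: two possible ways to get null for null
   input on Spark 3 without the legacy option. `typedF` and `boxedF` are
   hypothetical names, and the behaviour described here is an expectation to
   verify on your Spark version, not a captured output.

   The 0 above is the default value of a primitive Int; in plain Scala,
   null.asInstanceOf[Int] is also 0. */

// Typed overload: Spark knows the argument is a primitive Int and is expected
// to return null for a null input instead of calling the closure.
val typedF = udf((x: Int) => x)
df.select(typedF($"value").as("typed")).show

// Boxed argument: the closure receives the null itself and decides what to return.
val boxedF = udf((x: java.lang.Integer) => x)
df.select(boxedF($"value").as("boxed")).show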