@rchamarthi
Created July 16, 2017 20:33
Spark - calling a udf from spark sql
scala> import org.apache.spark.sql.functions.{input_file_name, udf}
import org.apache.spark.sql.functions.{input_file_name, udf}
scala> def extract_file_name(path: String): String =
     |   path.split("/").last
extract_file_name: (path: String)String
scala> spark.sqlContext.udf.register("extract_file_name", extract_file_name _);
res4: org.apache.spark.sql.expressions.UserDefinedFunction = UserDefinedFunction(<function1>,StringType,Some(List(StringType)))
scala> spark.sql("select extract_file_name('tmp/test.txt') file_name").show()
+---------+
|file_name|
+---------+
| test.txt|
+---------+
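
The transcript above registers the function by name so it can be called from a SQL string. As a rough sketch of the same idea outside the REPL, the standalone program below registers the UDF for SQL and also wraps it with `udf(...)` for use in the DataFrame API. The object name `ExtractFileNameExample`, the camel-case `extractFileName`, the `paths` view, and the sample path values are illustrative assumptions, not part of the original gist; a local Spark 2.x or later dependency is assumed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical wrapper object; only the path-splitting logic comes from the gist.
object ExtractFileNameExample {

  // Same logic as the gist's extract_file_name: keep the text after the last "/".
  // Note: a null path would throw here; this sketch does not guard against it.
  def extractFileName(path: String): String = path.split("/").last

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("extract-file-name-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // SQL route: register under a name that SQL strings can reference.
    spark.udf.register("extract_file_name", extractFileName _)

    // DataFrame route: wrap the same function as a column expression.
    val extractFileNameUdf = udf(extractFileName _)

    // Made-up sample paths, exposed both as a DataFrame and as a temp view.
    val paths = Seq("tmp/test.txt", "data/2017/07/part-0000.csv").toDF("path")
    paths.createOrReplaceTempView("paths")

    // Both calls should produce the same file_name column.
    spark.sql("select path, extract_file_name(path) as file_name from paths").show(false)
    paths.withColumn("file_name", extractFileNameUdf(col("path"))).show(false)

    spark.stop()
  }
}
```

The gist imports `input_file_name` without using it. One plausible use, on a DataFrame read from files, would be `df.withColumn("file_name", extractFileNameUdf(input_file_name()))`, which tags each row with the base name of its source file.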