Skip to content

Instantly share code, notes, and snippets.

View balazsreho's full-sized avatar

Balazs Reho balazsreho

  • Budapest, Hungary
View GitHub Profile
@zoltanctoth
zoltanctoth / pyspark-udf.py
Last active July 15, 2023 13:23
Writing an UDF for withColumn in PySpark
from pyspark.sql.types import StringType
from pyspark.sql.functions import udf
maturity_udf = udf(lambda age: "adult" if age >=18 else "child", StringType())
df = spark.createDataFrame([{'name': 'Alice', 'age': 1}])
df.withColumn("maturity", maturity_udf(df.age))
df.show()