@gcsfred
Created November 15, 2018 12:42
pandas_udf fill missing and empty string
from pyspark.sql.functions import pandas_udf
#...
# Use pandas_udf to define a Pandas UDF
@pandas_udf('string')
# Input/output are both a pandas.Series of string
def pandas_not_null(s):
    return s.fillna("_NO_₦Ӑ_").replace('', '_NO_ӖӍΡṬΫ_')
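
The body of the function above can be tried without a Spark session. What follows is a minimal sketch on a plain pandas Series, not part of the original gist: the helper name `fill_missing_and_empty`, the constant names, and the sample values are all hypothetical, while the two sentinel strings are copied from the gist.

```python
import pandas as pd

# Sentinel strings copied verbatim from the gist.
NO_NA = "_NO_₦Ӑ_"
NO_EMPTY = "_NO_ӖӍΡṬΫ_"


def fill_missing_and_empty(s: pd.Series) -> pd.Series:
    """Hypothetical plain-pandas twin of pandas_not_null (no Spark decorator)."""
    # fillna replaces None/NaN entries; replace('') only touches values that
    # are exactly the empty string, because regex matching is off by default.
    return s.fillna(NO_NA).replace('', NO_EMPTY)


# Hypothetical sample data: one missing value and one empty string.
sample = pd.Series(["a", None, "", "b c"])
result = fill_missing_and_empty(sample)
print(result.tolist())
```

In Spark, the decorated function would typically be applied to a string column, for example `df.withColumn("name", pandas_not_null(df["name"]))`, where `df` and `name` are assumed names rather than anything from the gist.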