@afranzi
Created January 31, 2019 15:23
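from pyspark.sql.functions import lit, udf
from pyspark.sql.types import ArrayType, StringType


@udf(returnType=ArrayType(StringType()))
def to_upper_list(s):
    # Upper-case every string in the input list
    return [i.upper() for i in s]
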
# Case 1 - Calling the UDF-decorated function directly with a plain Python list
to_upper_list(['potato', 'carrot', 'tomato'])
"""
TypeError: Invalid argument, not a string or column: ['potato', 'carrot', 'tomato'] of type <class 'list'>.
For column literals, use 'lit', 'array', 'struct' or 'create_map' function
"""

# Case 2 - Calling the UDF-decorated function with the list wrapped in lit()
# (the traceback shows that no SparkContext is active, so sc is None)
to_upper_list(lit(['potato', 'carrot', 'tomato']))
"""
col = ['potato', 'carrot', 'tomato']

    def _(col):
        sc = SparkContext._active_spark_context
>       jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
E       AttributeError: 'NoneType' object has no attribute '_jvm'
"""
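
Both cases fail because the decorated name now refers to a Spark UDF, which expects columns and an active SparkContext rather than a Python list. One possible workaround, sketched below, is to keep the logic in an undecorated function and wrap it with `udf` separately. The name `to_upper_list_plain` is hypothetical and not part of the original gist, and the `udf(...)` wiring is left as a comment so the snippet runs without Spark installed.

```python
def to_upper_list_plain(s):
    """Plain-Python logic, kept separate so it can be called without Spark."""
    return [i.upper() for i in s]


# Assumed wiring for Spark jobs (requires pyspark, so it is commented out here):
# to_upper_list = udf(to_upper_list_plain, ArrayType(StringType()))

# The plain function accepts an ordinary list; no SparkContext is involved.
result = to_upper_list_plain(['potato', 'carrot', 'tomato'])
print(result)  # ['POTATO', 'CARROT', 'TOMATO']
```

With this split, unit tests can target the plain function, while DataFrame code uses the wrapped UDF on columns.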