@sudar
Last active December 22, 2015 21:39
Writing a Pig UDF in Python. Details at http://sudarmuthu.com/blog/writing-pig-udf-functions-using-python
# Basic version: no output schema is declared for the return value.
def get_length(data):
    return len(data)
# Declare the return type so Pig sees the result as a long named "num".
@outputSchema("num:long")
def get_length(data):
    return len(data)
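The `outputSchema` decorator is supplied by Pig when the script is registered, so the file above raises a `NameError` if it is run with a plain Python interpreter. A minimal sketch for testing the function locally, assuming a stand-in decorator that just records the schema string on the function (the fallback is hypothetical and is not Pig's own code):

```python
# Use Pig's decorator when it exists; otherwise define a local stand-in.
try:
    outputSchema  # provided by Pig when the script is registered
except NameError:
    def outputSchema(schema):
        # Hypothetical fallback: attach the schema string and return
        # the function unchanged so it can be called directly.
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("num:long")
def get_length(data):
    return len(data)


if __name__ == "__main__":
    print(get_length("hello"))
```

With the guard in place, the same file can still be registered in Pig, where the real decorator is expected to take precedence.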
REGISTER 'udf.py' USING jython as pyudf;
A = LOAD 'data.txt' USING PigStorage();
B = FOREACH A GENERATE $0, pyudf.get_length($0);
DUMP B;
# The field arrives as a sequence of byte values, so build a string first.
@outputSchema("num:long")
def get_length(data):
    str_data = ''.join([chr(x) for x in data])
    return len(str_data)
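One caveat with the conversion above: if the byte values reach the UDF as signed numbers (-128 to 127, as Java bytes are), `chr()` raises a `ValueError` for any negative value, which would happen with non-ASCII input. The sketch below assumes that signed representation and masks each value into the 0 to 255 range first; `bytes_to_str` is a hypothetical helper name, not part of the original gist.

```python
def bytes_to_str(data):
    # Mask each value to 0..255 so chr() also accepts negative
    # (signed) byte values.
    return ''.join([chr(x & 0xFF) for x in data])


def get_length(data):
    # Same logic as the UDF above, with the masked conversion.
    # Add @outputSchema("num:long") when registering this in Pig.
    return len(bytes_to_str(data))


if __name__ == "__main__":
    print(get_length([104, 105]))   # two ASCII bytes
    print(get_length([-61, -87]))   # two signed, non-ASCII bytes
```

Note that this counts bytes rather than characters, so a multi-byte UTF-8 character contributes more than one to the result.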