@tillawy
Created March 9, 2016 15:46
Calculate the per-key average of a dataset: records are keyed by field 1, and the values in field 2 are averaged for each key.
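def average(self, dataset):
    # Key each record by field 1, with field 2 as the value.
    output = dataset.map(lambda x: (x[1], x[2]))
    # Collect all values that share a key.
    output = output.groupByKey()
    # Mean per key; float() avoids integer division on Python 2.
    output = output.map(lambda kv: (kv[0], sum(kv[1]) / float(len(kv[1]))))
    return output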
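The same computation can be checked without a cluster. The block below is a minimal sketch in plain Python, not the gist author's code: `average_by_key` and the sample `rows` are hypothetical names introduced for illustration. It assumes each record is an indexable row with the key in field 1 and a numeric value in field 2, as in the snippet above. Instead of collecting every value per key, it keeps a running sum and count, which is the same idea a combiner-style aggregation (for example via `reduceByKey`) would use in Spark.

```python
from collections import defaultdict


def average_by_key(records):
    """Return {key: mean of values}, with key = field 1 and value = field 2."""
    # key -> [running sum, count]; the float sum keeps the division exact
    # on both Python 2 and Python 3.
    totals = defaultdict(lambda: [0.0, 0])
    for record in records:
        key, value = record[1], record[2]
        totals[key][0] += value
        totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


# Hypothetical sample rows shaped as (id, key, value).
rows = [(1, "a", 2), (2, "a", 4), (3, "b", 10)]
result = average_by_key(rows)
print(result)
```

Keeping only a sum and a count per key means memory grows with the number of distinct keys and not with the number of records, which matters when a single key has many values.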