@AdroitAnandAI
Created June 12, 2021 09:14
Reduce Operation Example
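# Simple reduceByKey example in Python, sourced from:
# https://backtobazics.com/big-data/spark/apache-spark-reducebykey-example/
from pyspark import SparkContext

# Reuse the running SparkContext (e.g. in the pyspark shell) or create one
sc = SparkContext.getOrCreate()

# Create a pair RDD x of (key, value) tuples, split across 3 partitions
x = sc.parallelize([("a", 1), ("b", 1), ("a", 1), ("a", 1),
                    ("b", 1), ("b", 1), ("b", 1), ("b", 1)], 3)

# Sum the values for each key with reduceByKey
y = x.reduceByKey(lambda accum, n: accum + n)
print(y.collect())
# [('b', 5), ('a', 3)]  (the order of keys may vary)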
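
To see what reduceByKey does with the function it is given, here is a minimal pure-Python sketch that needs no Spark installation. The helper `reduce_by_key_local` and its round-robin partitioning are assumptions made for illustration; they are not part of the PySpark API, and Spark's actual partitioning and shuffle work differently. The sketch only mirrors the general idea: values are first combined per key inside each partition, then the partial results are merged across partitions.

```python
def reduce_by_key_local(pairs, func, num_partitions=3):
    """Hypothetical helper: mimics reduceByKey on a plain list, without Spark."""
    # Split the input into slices (round-robin here; Spark's partitioning differs).
    partitions = [pairs[i::num_partitions] for i in range(num_partitions)]

    # Step 1: combine values per key inside each partition.
    partials = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = func(local[key], value) if key in local else value
        partials.append(local)

    # Step 2: merge the per-partition results for each key.
    merged = {}
    for local in partials:
        for key, value in local.items():
            merged[key] = func(merged[key], value) if key in merged else value
    return merged


pairs = [("a", 1), ("b", 1), ("a", 1), ("a", 1),
         ("b", 1), ("b", 1), ("b", 1), ("b", 1)]
result = reduce_by_key_local(pairs, lambda accum, n: accum + n)
print(sorted(result.items()))
# [('a', 3), ('b', 5)]
```

Because the function is applied both within and across partitions, it should be associative and commutative, as addition is here; otherwise the result could depend on how the data happens to be split.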