Created
November 13, 2017 01:04
/*
spark-shell --master yarn \
  --conf spark.ui.port=12456 \
  --num-executors 10 \
  --executor-memory 3G \
  --executor-cores 2 \
  --packages com.databricks:spark-avro_2.10:2.0.1
*/

// Read the input as an RDD of lines, then split each line into words.
val lines = sc.textFile("/public/randomtextwriter")
val words = lines.flatMap(line => line.split(" "))

// Pair each word with 1 and sum the counts per word.
// The second argument to reduceByKey sets the number of output partitions (8).
val tuples = words.map(word => (word, 1))
val wordCount = tuples.reduceByKey((total, value) => total + value, 8)

// toDF relies on the implicits that spark-shell imports automatically.
val wordCountDF = wordCount.toDF("word", "count")

// The spark-avro import adds the avro method to the DataFrame writer.
import com.databricks.spark.avro._
wordCountDF.write.avro("/user/dgadiraju/solutions/solution05/wordcount")
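
The flatMap, map, and reduceByKey steps above can be tried without a cluster. The following is a minimal sketch using plain Scala collections instead of RDDs, meant to be pasted into the Scala REPL or spark-shell. `sampleLines`, `localTuples`, and `localWordCount` are hypothetical names, and the sample data is made up for illustration; it is not the contents of /public/randomtextwriter. `groupBy` followed by a sum stands in for `reduceByKey`, which exists only on pair RDDs.

```scala
// Hypothetical stand-in data (assumption), not the real input file.
val sampleLines = Seq("spark avro spark", "avro yarn spark")

// Same shape as the RDD pipeline: split lines into words, pair each word with 1.
val localTuples = sampleLines
  .flatMap(line => line.split(" "))
  .map(word => (word, 1))

// groupBy + sum plays the role of reduceByKey on a local collection.
val localWordCount: Map[String, Int] =
  localTuples
    .groupBy { case (word, _) => word }
    .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

// Print one tab-separated word and count per line, sorted by word.
localWordCount.toSeq.sortBy(_._1).foreach { case (word, count) =>
  println(s"$word\t$count")
}
```

Unlike the local version, `reduceByKey` combines values within each partition before shuffling, so only partial sums move across the network.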