@dgadiraju
Created November 23, 2017 11:33
// Row level transformations using map
val orders = sc.textFile("/public/retail_db/orders")
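// Goal: from a record such as 21,2013-07-25 00:00:00.0,11599,CLOSED, extract the order date as the Int 20130725
// Try the expression on a single record first
val str = orders.first
str.split(",")(1).substring(0, 10).replace("-", "").toInt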
// Apply the same expression to every record with map
val orderDates = orders.map((str: String) => {
  str.split(",")(1).substring(0, 10).replace("-", "").toInt
})
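// A minimal sketch, not part of the original gist: the same date extraction as a
// plain Scala function, so the logic can be checked without a SparkContext.
// `orderDateAsInt` and `sample` are hypothetical names introduced for illustration.
def orderDateAsInt(order: String): Int =
  order.split(",")(1).substring(0, 10).replace("-", "").toInt

// Sample record copied from the comment near the top of this script
val sample = "21,2013-07-25 00:00:00.0,11599,CLOSED"
orderDateAsInt(sample) // 20130725
// The function could then be passed to map directly, e.g. orders.map(orderDateAsInt)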
// Pair each order as (order_id, order date as Int)
val ordersPairedRDD = orders.map(order => {
  val o = order.split(",")
  (o(0).toInt, o(1).substring(0, 10).replace("-", "").toInt)
})
val orderItems = sc.textFile("/public/retail_db/order_items")
// Pair each order item as (order_id, full record); order_id is the second field
val orderItemsPairedRDD = orderItems.map(orderItem => {
  (orderItem.split(",")(1).toInt, orderItem)
})