@cesar1091
Created October 2, 2022 03:00
from pyspark.sql.types import StructType, StructField, IntegerType, FloatType

# Explicit schema for the header-less order_items CSV; every column is nullable.
itemsSchema = StructType([StructField("order_item_id", IntegerType(), True),
                          StructField("order_item_order_id", IntegerType(), True),
                          StructField("order_item_product_id", IntegerType(), True),
                          StructField("order_item_quantity", IntegerType(), True),
                          StructField("order_item_subtotal", FloatType(), True),
                          StructField("order_item_productprice", FloatType(), True)])
# `spark` is the SparkSession provided by the pyspark shell. The schema is given explicitly, so inferSchema is not needed.
order_items = spark.read.format("csv").schema(itemsSchema).load("/public/retail_db/order_items/part-00000")
# Write the data as ORC with compression disabled.
order_items.write.format("orc").option("compression", "uncompressed").save("/user/vagrant/lab1/pregunta9/resultado")