#!/bin/bash
hive -e "drop table if exists csv_dump;
create table csv_dump ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
LOCATION '/temp/storage/path' as
select * from my_data_table;"
hadoop fs -getmerge /temp/storage/path/ /local/path/my.csv
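
One caveat with the dump above: `ROW FORMAT DELIMITED` writes field values as-is, without quoting, so a value that itself contains a comma produces a row with extra columns. Below is a rough shell sketch for spotting such rows after the merge. The `check_columns` helper and the expected column count of 3 are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every line of a comma-delimited file
# splits into exactly the expected number of fields.
check_columns() {
  local file="$1" expected="$2"
  # NF is the number of comma-separated fields on the current line
  awk -F',' -v n="$expected" 'NF != n { bad++ } END { exit (bad > 0) }' "$file"
}

# Example (assumes my_data_table has 3 columns, which the original does not state):
# check_columns /local/path/my.csv 3 || echo "some rows have an unexpected column count"
```

The check is deliberately naive: it counts every comma as a separator, which matches how a plain delimited file will be read back.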
import org.apache.spark.sql.SaveMode

val mydataframe = ... //put some data in your dataframe, friend
mydataframe
  .write
  .option("orc.compress", "snappy")
  .mode(SaveMode.Append)
  .orc("/this/is/an/hdfs/directory/")
val mydataframe = ... //put some data in your dataframe, friend
mydataframe
  .write
  .partitionBy("year", "month", "day", "hour")
  .option("orc.compress", "snappy")
  .mode(SaveMode.Append)
  .orc("/this/is/another/hdfs/directory")
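
`partitionBy` writes each distinct combination of the partition columns under nested `column=value` directories below the target path. A minimal shell sketch of that layout follows. The `partition_dir` helper and the sample values are hypothetical, for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: build the Hive-style directory that a write partitioned
# by year/month/day/hour would use for one combination of values.
partition_dir() {
  local base="${1%/}"   # drop one trailing slash from the base path, if present
  local year="$2" month="$3" day="$4" hour="$5"
  printf '%s/year=%s/month=%s/day=%s/hour=%s\n' "$base" "$year" "$month" "$day" "$hour"
}

partition_dir /this/is/another/hdfs/directory 2017 11 30 5
```

Listing such a directory with `hadoop fs -ls` is a quick way to confirm that a write landed in the partition you expected.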
val mydstream = ... // these usually come from Spark Streaming apps
// they basically contain a chain of RDDs that you can convert to DFs
mydstream.foreachRDD(rdd => {
  hiveContext.createDataFrame(rdd)
    .write
    .option("orc.compress", "snappy")
    .mode(SaveMode.Append)
    .orc("/this/is/an/hdfs/directory/too")
})
// import this guy
import org.apache.spark.sql.hive.HiveContext
// and these, for the conf and context below
import org.apache.spark.{SparkConf, SparkContext}
// this should look familiar
val conf = new SparkConf()
val sc = new SparkContext(conf)
// setup this fella
val hiveContext = new HiveContext(sc)
CREATE TABLE my_database.my_table
(
  column_1 string,
  column_2 int,
  column_3 double
)
STORED AS ORC
TBLPROPERTIES('ORC.COMPRESS'='SNAPPY'); -- keep SNAPPY uppercase: a lowercase value triggers a nasty bug in Hive (fixed in later versions)
CREATE TABLE my_database.my_table
(
  column_1 string,
  column_2 int,
  column_3 double
)
PARTITIONED BY
(
  year int,
  month smallint,
CREATE TABLE my_database.my_table
(
  column_1 string,
  column_2 int,
  column_3 double
)
PARTITIONED BY
(
  year int,
  month smallint,
ANALYZE TABLE my_database.my_table COMPUTE STATISTICS;
ANALYZE TABLE my_database.my_table PARTITION (YEAR=2017, MONTH=11, DAY=30) COMPUTE STATISTICS;
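
When statistics need refreshing for many partitions, typing the statement by hand gets tedious. Below is a small shell sketch that builds the per-partition statement from its parts. The `analyze_stmt` helper is hypothetical, and the year/month/day partition columns are assumed from the statement above.

```shell
#!/bin/bash
# Hypothetical helper: print an ANALYZE statement for a single partition.
analyze_stmt() {
  local table="$1" year="$2" month="$3" day="$4"
  printf 'ANALYZE TABLE %s PARTITION (YEAR=%s, MONTH=%s, DAY=%s) COMPUTE STATISTICS;\n' \
    "$table" "$year" "$month" "$day"
}

stmt="$(analyze_stmt my_database.my_table 2017 11 30)"
echo "$stmt"

# To run it for real, hand the statement to the Hive CLI:
# hive -e "$stmt"
```

Printing the statement first and keeping the `hive -e` call commented out makes it easy to review what would run before touching the cluster.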