Andrea Tommaso Bonanno (andreatbonanno)

View GitHub Profile
### Keybase proof
I hereby claim:
* I am andreatbonanno on github.
* I am andreatbonanno (https://keybase.io/andreatbonanno) on keybase.
* I have a public key ASDRv-rtemwDfZNawiKL284ghOZwv7mFuvEny_kClxtO5go
To claim this, I am signing this object:
andreatbonanno / CreateHiveTableWithPartitions.scala
Last active November 21, 2017 19:22
Spark: create a Hive external table with partitions (from a partitioned Parquet file in HDFS)
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

// Assumes a SparkSession with Hive support (in spark-shell, `spark` is already defined).
val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

// List the immediate children of an HDFS directory, optionally skipping paths
// that sort lexicographically at or below `excludeFilesFrom`.
def listHdpFiles(dirPath: String, excludeFilesFrom: String = ""): Array[String] = {
  FileSystem
    .get(spark.sparkContext.hadoopConfiguration)
    .listStatus(new Path(dirPath))
    .map(_.getPath.toString)
    .filter(_ > excludeFilesFrom)
}
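
The gist's description promises the external table itself, while the snippet above only shows the HDFS listing helper. A minimal sketch of how the two might fit together, assuming a hypothetical `events` table with a single `dt` partition column stored as Parquet under a hypothetical base path; the table name, schema, and path are illustrative and not part of the original gist.

// Hypothetical base path of the partitioned Parquet data (subdirectories like dt=2017-11-21).
val basePath = "hdfs:///data/events"

// Register an external Hive table over the existing Parquet directory.
spark.sql(
  s"""CREATE EXTERNAL TABLE IF NOT EXISTS events (id BIGINT, payload STRING)
     |PARTITIONED BY (dt STRING)
     |STORED AS PARQUET
     |LOCATION '$basePath'""".stripMargin)

// Add each partition directory to the metastore,
// e.g. .../dt=2017-11-21 becomes PARTITION (dt='2017-11-21').
listHdpFiles(basePath).foreach { dir =>
  val Array(col, value) = dir.split("/").last.split("=")
  spark.sql(s"ALTER TABLE events ADD IF NOT EXISTS PARTITION ($col='$value') LOCATION '$dir'")
}

Alternatively, spark.sql("MSCK REPAIR TABLE events") discovers all partitions in one pass; the explicit loop is mainly useful together with `excludeFilesFrom` to register only newly arrived partition directories.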