Gui Braccialli gbraccialli

gbraccialli / create_random_data.scala
Last active September 20, 2018 13:51
spark_scala_python_udf_battle
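// Scala: create the datasets (assumes a spark-shell session where `spark` is defined)
import scala.util.Random
import org.apache.spark.sql.functions.udf
import spark.implicits._ // needed for .toDF on a local collection

// Return a random alphanumeric string of the given length
def randomStr(size: Int): String =
  Random.alphanumeric.take(size).mkString

val udfRandomStr = udf(randomStr _)
// 30,000 rows spread across 3,000 partitions
val dfRnd = (1 to 30000).toDF.repartition(3000)
// 10 rows, with the default "value" column renamed to "value2"
val dfRnd2 = (1 to 10).toDF.withColumnRenamed("value", "value2")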