@lukleh
Last active July 29, 2019 16:23
Create a sample Spark DataFrame
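import pyspark.sql.functions as F
# Assumes an active SparkSession named `spark`, as provided by the pyspark shell.
# ids 1..999999 in column `id`, spread over 2000 partitions
df = spark.range(1, 1000000, numPartitions=2000)
# n1 and n2: independent random integers in [0, 100000)
df = df.withColumn('n1', (F.rand() * 100000).cast('integer')).withColumn('n2', (F.rand() * 100000).cast('integer'))
# Register the DataFrame as a temp view so it can be queried with Spark SQL
df.createOrReplaceTempView('tempsampledata')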