@karpanGit
Created May 3, 2022 19:18
pyspark, local vs global views
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName('learn')
    # .config('spark.sql.shuffle.partitions', 10)
    # .config('spark.default.parallelism', 10)
    # .config('spark.executor.memory', '1g')
    # .config('spark.driver.memory', '1g')
    # .config('spark.executor.instances', 1)
    # .config('spark.executor.cores', 2)
    .getOrCreate()
)
# second session on the same SparkContext, with its own temp view namespace
spark2 = spark.newSession()
# experiment with session-scoped (local) and global temp views
df = spark.createDataFrame([[1,2], [2,4], [3,9]], ['one', 'square'])
df2 = spark.createDataFrame([[10,20], [20,40], [30,90]], ['one', 'square'])
df.createOrReplaceTempView('df_spark_local')
df2.createOrReplaceGlobalTempView('df2_spark_global')
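# df_spark_local is listed for spark only; the global temp view is registered in the global_temp database
spark.catalog.listTables()
spark2.catalog.listTables()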
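# both work: a global temp view is visible from every session of the application via global_temp
spark.sql("select * from global_temp.df2_spark_global").show()
spark2.sql("select * from global_temp.df2_spark_global").show()
# works: the view was registered in this session
spark.sql("select * from df_spark_local").show()
# does not work: temp views are session-scoped, so spark2 raises AnalysisException
spark2.sql("select * from df_spark_local").show()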
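
The scoping rule the snippet demonstrates can be sketched without a Spark cluster. The block below is a toy model in plain Python, not Spark code: `ToySession` and all of its methods are hypothetical names made up for illustration, and a `KeyError` stands in for Spark's `AnalysisException`. It only mirrors the idea that local temp views live in a per-session registry while global temp views live in one registry shared by all sessions.

```python
# Toy model of temp view scoping. NOT Spark code; every name here is hypothetical.
class ToySession:
    # One registry shared by all sessions, like the global_temp database.
    _global_temp = {}

    def __init__(self):
        # Per-session registry, like views made with createOrReplaceTempView.
        self._local = {}

    def new_session(self):
        # A sibling session: fresh local registry, same shared global registry.
        return ToySession()

    def create_temp_view(self, name, rows):
        self._local[name] = rows

    def create_global_temp_view(self, name, rows):
        ToySession._global_temp[name] = rows

    def table(self, name):
        prefix = "global_temp."
        if name.startswith(prefix):
            return ToySession._global_temp[name[len(prefix):]]
        # KeyError here plays the role of AnalysisException in Spark.
        return self._local[name]


s1 = ToySession()
s2 = s1.new_session()
s1.create_temp_view("df_spark_local", [(1, 2), (2, 4), (3, 9)])
s1.create_global_temp_view("df2_spark_global", [(10, 20), (20, 40), (30, 90)])
```

In this model `s2.table("global_temp.df2_spark_global")` succeeds while `s2.table("df_spark_local")` raises, matching the "both work" and "does not work" cases above.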