@1ambda
Created December 23, 2021 23:21
# dfListingParquet.rdd.getNumPartitions()
197
# dfListingParquet.select(spark_partition_id().alias("partitionId")).groupBy("partitionId").count().show()
+-----------+-----+
|partitionId|count|
+-----------+-----+
|        148|   21|
|         31|   29|
|         85|   24|
|        137|   21|
|         65|   24|
|         53|   24|
|        133|   21|
|         78|   23|
|        108|   24|
|        155|   24|
|         34|   29|
|        193|   22|
|        101|   25|
|        115|   22|
|        126|   23|
|         81|   22|
|         28|   30|
|        183|   25|
|         76|   22|
|         26|   30|
+-----------+-----+
only showing top 20 rows
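
The per-partition counts above can be condensed into a few numbers to judge how evenly rows are spread. The following is a minimal sketch in plain Python that uses only the 20 rows shown, not all 197 partitions, so it says nothing about the partitions that were cut off. The names `shown_counts` and `summarize` are hypothetical and are not part of the original session.

```python
from statistics import mean

# Row counts copied from the 20 partitions shown above (partitionId -> count).
# The remaining 177 partitions are not in the output and are not included here.
shown_counts = {
    148: 21, 31: 29, 85: 24, 137: 21, 65: 24,
    53: 24, 133: 21, 78: 23, 108: 24, 155: 24,
    34: 29, 193: 22, 101: 25, 115: 22, 126: 23,
    81: 22, 28: 30, 183: 25, 76: 22, 26: 30,
}


def summarize(counts):
    """Return simple spread statistics for a partitionId -> row count mapping."""
    values = list(counts.values())
    avg = mean(values)
    return {
        "partitions": len(values),
        "rows": sum(values),
        "min": min(values),
        "max": max(values),
        "mean": avg,
        # Ratio of the largest partition to the average; 1.0 means perfectly even.
        "max_over_mean": max(values) / avg,
    }


summary = summarize(shown_counts)
print(summary)
```

For the rows shown, the counts range from 21 to 30 with a mean of 24.25, so the largest of these partitions holds roughly 1.24 times the average.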