@tushar-rishav
Created September 30, 2019 13:23
Spark Job Scheduling

Factors pertaining to Spark job scheduling

Resource allocation

Speculation

spark.speculation: false by default. When enabled, Spark checks at every spark.speculation.interval for straggler tasks, i.e. tasks running significantly slower than the rest of their stage, and relaunches speculative copies of them on other executors.
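A minimal sketch of turning speculation on when building a session; the app name is hypothetical and the threshold values shown are only illustrative tuning knobs:

```scala
import org.apache.spark.sql.SparkSession

// Enable speculative execution and tune how stragglers are detected.
val spark = SparkSession.builder()
  .appName("speculation-demo")                    // hypothetical app name
  .config("spark.speculation", "true")            // off by default
  .config("spark.speculation.interval", "100ms")  // how often the scheduler checks for stragglers
  .config("spark.speculation.multiplier", "1.5")  // how much slower than the median a task must be
  .config("spark.speculation.quantile", "0.75")   // fraction of tasks that must finish before checking
  .getOrCreate()
```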

Data locality

Spark tries to maintain a list of preferred locations for each partition. A partition’s preferred location is a list of hostnames or executors where the partition’s data resides so that computation can be moved closer to the data. This information is obtained for RDDs that are based on HDFS data. If Spark obtains a list of preferred locations, the Spark scheduler tries to run tasks on the executors where the data is physically present so that no data transfer is required. This can have a big impact on performance. There are five levels of data locality:

  • PROCESS_LOCAL — Execute a task on the executor that cached the partition.
  • NODE_LOCAL — Execute a task on the node where the partition is available.
  • RACK_LOCAL — Execute the task on the same rack as the partition if rack information is available in the cluster (currently only on YARN).
  • NO_PREF — No preferred locations are associated with the task.
  • ANY — Default if everything else fails.

If a task slot with the best locality for the task can’t be obtained (that is, all the matching task slots are taken), the scheduler waits for the period specified by spark.locality.wait and then tries a slot with the second-best locality, and so on.

Important configs: spark.locality.wait (the default wait applied between all levels), plus the per-level overrides spark.locality.wait.process, spark.locality.wait.node, and spark.locality.wait.rack.

By setting a high wait value (say, 10 minutes), we can effectively force the scheduler to honor the desired locality level, making it hold out until a matching slot becomes available. Use a high value only when data locality is of critical importance, as shown in the sketch below.
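A sketch of raising the locality waits for such a locality-critical job; the app name is hypothetical and the uniform 10-minute value simply mirrors the example above:

```scala
import org.apache.spark.sql.SparkSession

// Hold out longer for better locality before falling back to a worse level.
val spark = SparkSession.builder()
  .appName("locality-demo")                        // hypothetical app name
  .config("spark.locality.wait", "10min")          // default wait between locality levels
  .config("spark.locality.wait.process", "10min")  // PROCESS_LOCAL -> NODE_LOCAL fallback
  .config("spark.locality.wait.node", "10min")     // NODE_LOCAL -> RACK_LOCAL fallback
  .config("spark.locality.wait.rack", "10min")     // RACK_LOCAL -> ANY fallback
  .getOrCreate()
```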

Memory constraints:

  • spark.storage.memoryFraction -- default 0.6 -- fraction of the executor heap reserved for Spark's cached (storage) data.
  • spark.shuffle.memoryFraction -- default 0.2 -- fraction reserved for temporary shuffle data.
  • spark.storage.safetyFraction (default 0.9) and spark.shuffle.safetyFraction (default 0.8) -- safety margins applied on top of the fractions above to leave headroom and avoid OOM errors.
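A sketch of setting these fractions on a SparkConf; the app name is hypothetical. Note that these are the legacy memory-manager settings, while recent Spark releases use a unified memory model (spark.memory.fraction / spark.memory.storageFraction) instead:

```scala
import org.apache.spark.SparkConf

// Legacy memory-manager fractions, as listed above.
val conf = new SparkConf()
  .setAppName("memory-fractions-demo")         // hypothetical app name
  .set("spark.storage.memoryFraction", "0.6")  // heap share for cached data
  .set("spark.shuffle.memoryFraction", "0.2")  // heap share for shuffle buffers
  .set("spark.storage.safetyFraction", "0.9")  // safety margin on the storage share
  .set("spark.shuffle.safetyFraction", "0.8")  // safety margin on the shuffle share
```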