Factors pertaining to Spark job scheduling
spark.scheduler.mode
: FIFO (default) or FAIR job scheduling. Dynamic resource allocation is covered separately: https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
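As a sketch, enabling the FAIR scheduler together with dynamic allocation in spark-defaults.conf might look like this (the allocation-file path and executor bounds are illustrative assumptions, not defaults):

```
spark.scheduler.mode                   FAIR
# Hypothetical path to a fair-scheduler pool definition file
spark.scheduler.allocation.file        /etc/spark/fairscheduler.xml
spark.dynamicAllocation.enabled        true
# The external shuffle service is required by dynamic allocation
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   2
spark.dynamicAllocation.maxExecutors   20
```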
spark.speculation
: false (default). When enabled, Spark checks every spark.speculation.interval (default 100ms) for tasks running significantly slower than the rest of the stage (stragglers) and launches speculative copies of them.
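For illustration, here is speculation turned on with its tuning knobs spelled out (the values shown are the documented defaults for the related settings):

```
# Re-launch slow-running tasks speculatively
spark.speculation             true
# How often to check for stragglers
spark.speculation.interval    100ms
# A task is a straggler if it runs this many times slower than the median
spark.speculation.multiplier  1.5
# Fraction of tasks that must finish before speculation starts
spark.speculation.quantile    0.75
```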
Spark tries to maintain a list of preferred locations for each partition. A partition’s preferred location is a list of hostnames or executors where the partition’s data resides so that computation can be moved closer to the data. This information is obtained for RDDs that are based on HDFS data. If Spark obtains a list of preferred locations, the Spark scheduler tries to run tasks on the executors where the data is physically present so that no data transfer is required. This can have a big impact on performance. There are five levels of data locality:
PROCESS_LOCAL — execute the task on the executor that cached the partition.
NODE_LOCAL — execute the task on the node where the partition is available.
RACK_LOCAL — execute the task on the same rack as the partition, if rack information is available in the cluster (currently only on YARN).
NO_PREF — no preferred locations are associated with the task.
ANY — the default if everything else fails.
If a task slot with the best locality for the task can't be obtained (that is, all the matching task slots are taken), the scheduler waits for
spark.locality.wait
(default 3s) and then tries a location with the second-best locality, and so on.
Important configs:
spark.locality.wait
, plus per-level overrides such as spark.locality.wait.process, spark.locality.wait.node, and spark.locality.wait.rack
for tuning the wait at a specific locality level
We can force the Spark scheduler to honor the desired locality level by making it wait until that level becomes available: set a high value (say, 10 minutes) in situations where data locality is of critical importance.
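A sketch of such a strict-locality setup in spark-defaults.conf (the config keys are real; the 10-minute values are illustrative, matching the example above):

```
# Wait up to 10 minutes for a task slot at the desired locality level
spark.locality.wait           10m
# Per-level overrides (these fall back to spark.locality.wait when unset)
spark.locality.wait.process   10m
spark.locality.wait.node      10m
spark.locality.wait.rack      10m
```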
spark.storage.memoryFraction
-- default 0.6 -- memory reserved for Spark's cached data.
spark.shuffle.memoryFraction
-- default 0.2 -- memory reserved for temporary shuffle data.
spark.storage.safetyFraction and spark.shuffle.safetyFraction
-- defaults 0.9 and 0.8 respectively -- scale the fractions above down to avoid OOM kills. (These legacy settings apply to Spark 1.5 and earlier; Spark 1.6 replaced them with unified memory management via spark.memory.fraction.)
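To make the arithmetic concrete, a short worked example under the legacy (pre-1.6) memory model, where usable space = heap size × memoryFraction × safetyFraction (the 10 GiB heap is a hypothetical value; note the shuffle safety fraction defaults to 0.8, not 0.9):

```python
# Legacy Spark (pre-1.6) memory sizing: usable = heap * memoryFraction * safetyFraction
heap_bytes = 10 * 1024**3  # hypothetical 10 GiB executor heap

# spark.storage.memoryFraction (0.6) * spark.storage.safetyFraction (0.9)
storage_memory = heap_bytes * 0.6 * 0.9
# spark.shuffle.memoryFraction (0.2) * spark.shuffle.safetyFraction (0.8)
shuffle_memory = heap_bytes * 0.2 * 0.8

print(f"storage: {storage_memory / 1024**3:.2f} GiB")  # 5.40 GiB
print(f"shuffle: {shuffle_memory / 1024**3:.2f} GiB")  # 1.60 GiB
```

So with defaults, only a little over half the heap is actually available for cached partitions, and far less for shuffle buffers, which is why these fractions were common tuning targets.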