@markito
Created September 15, 2017 14:14
Spark GC/memory settings
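How much of the heap is kept for cached (persisted) data vs how much is used for transformations (ratio):
spark.storage.memoryFraction
(Deprecated since Spark 1.6 and only read in legacy mode; spark.memory.fraction and spark.memory.storageFraction control this split in newer versions.)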
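To get a feel for what that ratio means on a large heap, here is a rough sketch of the arithmetic. It assumes the unified memory manager (Spark 1.6+), a 300 MiB reserve, and the Spark 2.x defaults of 0.6 for spark.memory.fraction and 0.5 for spark.memory.storageFraction; treat those numbers as assumptions and check them against your Spark version. The function name is made up for illustration, and the boundary is soft because storage and execution can borrow from each other.

```python
# Rough split of an executor heap under Spark's unified memory manager.
# The 300 MiB reserve and the 0.6 / 0.5 defaults are assumptions based on
# Spark 2.x; check the configuration docs for your version.
RESERVED_MIB = 300


def memory_split_mib(heap_mib, memory_fraction=0.6, storage_fraction=0.5):
    """Return (storage_mib, execution_mib) for a given executor heap in MiB."""
    usable = (heap_mib - RESERVED_MIB) * memory_fraction
    storage = usable * storage_fraction
    execution = usable - storage
    return storage, execution


if __name__ == "__main__":
    # 88g heap, matching the -Xmx value suggested in these notes.
    storage, execution = memory_split_mib(88 * 1024)
    print(f"storage ~{storage / 1024:.1f} GiB, execution ~{execution / 1024:.1f} GiB")
```

Raising storage_fraction protects more cached data from eviction at the cost of memory for shuffles and joins.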
Suggested settings (the GC logs still need to be checked after applying these):
-XX:+UseG1GC -XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
-Xms88g -Xmx88g -XX:InitiatingHeapOccupancyPercent=35
-XX:ConcGCThreads=15 -XX:+AlwaysPreTouch
Reasoning....
-XX:+UseG1GC - Enabling G1GC
--- This entire block enables GC debug output in the Spark executor logs (these are Java 8 flag names; JDK 9+ moved most of this to -Xlog:gc*)
-XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
--- Min and max heap always the same size, so the JVM never resizes the heap at runtime (I don't know how much RAM you have on those nodes, so adjust 88g to fit)
-Xms88g -Xmx88g
--- In G1 this sets the heap occupancy at which a concurrent marking cycle starts. Starting it a bit sooner helps avoid falling back to a full GC (default is 45%; this tunes it down to 35%)
-XX:InitiatingHeapOccupancyPercent=35
--- If you do have enough CPU (how is CPU % utilization?), this increases the number of concurrent GC (marking) threads
-XX:ConcGCThreads=15
--- May take more time to start, but for long-running JVMs this touches (commits) every heap page at startup rather than on the fly. Good for streaming.
-XX:+AlwaysPreTouch
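The two tuning flags in the reasoning above can be sanity-checked with a little arithmetic. This sketch shows where the marking threshold lands on an 88g heap and what the JVM would likely pick for ConcGCThreads by default. The thread-count formulas mirror HotSpot's JDK 8 heuristics as I understand them, so treat them as an assumption and confirm the real values with -XX:+PrintFlagsFinal. All function names are made up for illustration.

```python
# Back-of-the-envelope checks for the two G1 tuning flags.
# The thread-count formulas follow HotSpot's JDK 8 heuristics as I understand
# them (an assumption); confirm real values with -XX:+PrintFlagsFinal.


def ihop_threshold_gib(heap_gib, percent):
    """Heap occupancy (GiB) at which G1 starts a concurrent marking cycle."""
    return heap_gib * percent / 100


def default_parallel_gc_threads(cpus):
    """ParallelGCThreads: every CPU up to 8, then 5/8 of each extra CPU."""
    if cpus <= 8:
        return cpus
    return 8 + (cpus - 8) * 5 // 8


def default_conc_gc_threads(cpus):
    """ConcGCThreads: roughly a quarter of ParallelGCThreads, at least 1."""
    return max((default_parallel_gc_threads(cpus) + 2) // 4, 1)


if __name__ == "__main__":
    for pct in (45, 35):
        gib = ihop_threshold_gib(88, pct)
        print(f"IHOP={pct}% on an 88g heap: marking starts near {gib:.1f} GiB")
    for cpus in (16, 32, 64):
        print(f"{cpus} CPUs: default ConcGCThreads about {default_conc_gc_threads(cpus)}")
```

If the defaults come out well below 15 on your nodes, setting ConcGCThreads=15 takes noticeably more CPU away from the application threads during marking, which is why the CPU utilization question matters.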
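If you don't have enough memory and would prefer a heap smaller than 32GB, compressed object pointers apply. 64-bit HotSpot already enables them by default at that heap size, so adding the flag only makes it explicit: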
-XX:+UseCompressedOops
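Some of these settings can/need to be set in spark-env.sh and others in spark.executor.extraJavaOptions. Note that Spark does not allow -Xmx inside spark.executor.extraJavaOptions: the executor heap size is set with spark.executor.memory (or --executor-memory), and the GC tuning params go in extraJavaOptions.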
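A minimal sketch of how the suggested flags could be split between the two Spark settings. The helper names are made up for illustration. The sketch assumes the heap size is passed through spark.executor.memory and everything else through spark.executor.extraJavaOptions. Dropping -Xms along with -Xmx is a conservative choice of this sketch, not something these notes state.

```python
# Sketch: turn the suggested JVM flags into spark-submit --conf arguments.
# Assumption: heap size goes through spark.executor.memory; the remaining
# flags go through spark.executor.extraJavaOptions. Dropping -Xms as well as
# -Xmx is a conservative choice made here for illustration.
import shlex

SUGGESTED_FLAGS = """
-XX:+UseG1GC -XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
-Xms88g -Xmx88g -XX:InitiatingHeapOccupancyPercent=35
-XX:ConcGCThreads=15 -XX:+AlwaysPreTouch
"""


def split_flags(flags):
    """Return (heap_size, other_flags) from a whitespace-separated flag string."""
    heap = None
    others = []
    for flag in flags.split():
        if flag.startswith("-Xmx"):
            heap = flag[len("-Xmx"):]  # e.g. "88g"
        elif flag.startswith("-Xms"):
            continue  # heap is sized via spark.executor.memory instead
        else:
            others.append(flag)
    return heap, others


def spark_submit_args(flags):
    """Build the --conf arguments for spark-submit as a list of strings."""
    heap, others = split_flags(flags)
    args = []
    if heap is not None:
        args += ["--conf", f"spark.executor.memory={heap}"]
    args += ["--conf", "spark.executor.extraJavaOptions=" + " ".join(others)]
    return args


if __name__ == "__main__":
    # Print a shell-quoted argument string that can be pasted after spark-submit.
    print(" ".join(shlex.quote(a) for a in spark_submit_args(SUGGESTED_FLAGS)))
```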