Created
September 15, 2017 14:14
Spark GC/memory settings
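Ratio of heap kept for cached (persisted) data vs. heap used for transformations:
spark.storage.memoryFraction
(Since Spark 1.6 this legacy setting is only read when spark.memory.useLegacyMode=true; the unified equivalents are spark.memory.fraction and spark.memory.storageFraction.)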
Suggested settings (the GC logs still need to be reviewed after applying these):
-XX:+UseG1GC -XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
-Xms88g -Xmx88g -XX:InitiatingHeapOccupancyPercent=35
-XX:ConcGCThreads=15 -XX:+AlwaysPreTouch
Reasoning:
-XX:+UseG1GC - Enables the G1 garbage collector
--- This entire block enables GC debug information in the Spark logs
-XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
--- Min and max heap always the same size (I don't know how much RAM you have on those nodes)
-Xms88g -Xmx88g
--- G1 starts a concurrent marking cycle once heap occupancy crosses this threshold; starting it a bit sooner helps avoid falling back to a full GC (default is 45%, this lowers it to 35%)
-XX:InitiatingHeapOccupancyPercent=35
--- If you have CPU to spare (what is the CPU % utilization?), this increases the number of concurrent GC threads
-XX:ConcGCThreads=15
--- Startup may take longer, but for long-running JVMs the heap is touched up front rather than on first use. Good for streaming.
-XX:+AlwaysPreTouch
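To make the effect of the InitiatingHeapOccupancyPercent change concrete, here is a rough back-of-the-envelope sketch. It assumes the percentage is applied to the whole 88g heap as a static threshold, which is a simplification; the helper name ihop_threshold_gib is made up for illustration.

```python
# Heap size taken from the -Xms88g/-Xmx88g setting above, in GiB.
HEAP_GIB = 88


def ihop_threshold_gib(heap_gib, ihop_percent):
    """Approximate heap occupancy (GiB) at which G1 starts concurrent marking.

    Assumption: the percentage applies to the total heap as a fixed threshold.
    """
    return heap_gib * ihop_percent / 100


default_threshold = ihop_threshold_gib(HEAP_GIB, 45)  # G1 default
tuned_threshold = ihop_threshold_gib(HEAP_GIB, 35)    # value suggested above

print(f"IHOP=45 -> marking starts near {default_threshold:.1f} GiB")
print(f"IHOP=35 -> marking starts near {tuned_threshold:.1f} GiB")
print(f"Earlier start by about {default_threshold - tuned_threshold:.1f} GiB")
```

With these numbers the marking cycle would begin roughly 9 GiB of occupancy earlier, which is the extra headroom the lower setting is meant to buy.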
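If you don't have enough memory and would prefer to keep the heap below 32GB, then also add:
-XX:+UseCompressedOops
(64-bit HotSpot normally turns this on by default for heaps under about 32GB, so the flag mostly makes the intent explicit.)
Some of these settings can/need to be set in spark-env.sh and others in spark.executor.extraJavaOptions. I believe the memory size goes in spark-env.sh and the tuning params in extraJavaOptions. Note that Spark does not allow -Xmx inside spark.executor.extraJavaOptions; the executor heap size is set with spark.executor.memory instead.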
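One possible way to keep the heap size and the GC tuning flags apart is sketched below. It builds a spark-submit command line with the heap passed through spark.executor.memory and only the GC flags in spark.executor.extraJavaOptions. The helper build_submit_command and the jar name my-streaming-app.jar are hypothetical placeholders, and the flag list is a subset of the settings above with the thread flag spelled -XX:ConcGCThreads.

```python
import shlex

# GC tuning and logging flags from the notes above (deliberately no -Xms/-Xmx).
GC_FLAGS = [
    "-XX:+UseG1GC",
    "-XX:InitiatingHeapOccupancyPercent=35",
    "-XX:ConcGCThreads=15",
    "-XX:+AlwaysPreTouch",
    "-verbose:gc",
    "-XX:+PrintGCDetails",
    "-XX:+PrintGCTimeStamps",
]


def build_submit_command(executor_memory, gc_flags, app_jar):
    """Return a spark-submit argv with heap size and GC flags kept separate."""
    # Spark refuses -Xmx in extraJavaOptions, so catch it before submitting.
    if any(flag.startswith("-Xmx") for flag in gc_flags):
        raise ValueError("set the heap with spark.executor.memory, not -Xmx")
    return [
        "spark-submit",
        "--conf", f"spark.executor.memory={executor_memory}",
        "--conf", "spark.executor.extraJavaOptions=" + " ".join(gc_flags),
        app_jar,
    ]


# "my-streaming-app.jar" is a hypothetical application jar.
command = build_submit_command("88g", GC_FLAGS, "my-streaming-app.jar")
print(shlex.join(command))
```

The same two properties could also go into spark-defaults.conf; the command-line form is used here only because it is easy to try on a single job before changing cluster-wide defaults.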