lenards/jvm_logging.md Secret

## jvm_logging.md

      
Quick thoughts that escaped my head ...

A Cassandra service might have the following included in its "shell initialization" script:
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-Xloggc:/var/log/cassandra/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M

source: cassandra/jvm.options
So, to enable logging, make it rotate, and give that rotation some options:
...
-Xloggc:/var/log/<discovery-environment-service-name>/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M
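
As a rough sketch, the rotation flags above could be appended to a `JVM_OPTS` variable in a service's start script. The service name, the log directory, and the `JVM_OPTS` variable name are assumptions for illustration, and these flags apply to HotSpot JVMs up to Java 8 (later JDKs replaced them with `-Xlog`).

```shell
#!/bin/sh
# Hypothetical service name and log directory; adjust for your deployment.
SERVICE_NAME="my-service"
GC_LOG_DIR="/var/log/${SERVICE_NAME}"

# Keep whatever options were already set, then append the GC log settings.
JVM_OPTS="${JVM_OPTS:-}"
JVM_OPTS="$JVM_OPTS -Xloggc:${GC_LOG_DIR}/gc.log"   # where the GC log goes
JVM_OPTS="$JVM_OPTS -XX:+UseGCLogFileRotation"      # turn rotation on
JVM_OPTS="$JVM_OPTS -XX:NumberOfGCLogFiles=10"      # keep at most 10 files
JVM_OPTS="$JVM_OPTS -XX:GCLogFileSize=10M"          # rotate at roughly 10 MB

echo "$JVM_OPTS"
# The service would then be started with something like:
#   java $JVM_OPTS -jar "${SERVICE_NAME}.jar"
```

With these values the log directory holds at most about 100 MB of GC logs (10 files of roughly 10 MB each).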

The important information from the JVM to log...
-XX:+PrintGCDetails    # more verbose information regarding the collection
-XX:+PrintGCDateStamps # prefixes each entry with a calendar date and time, not just seconds since JVM start

The next options are more related to what you'd like to know when you start tuning:
-XX:+PrintTenuringDistribution     # age distribution of objects in the survivor spaces as they move toward the old generation
-XX:+PrintGCApplicationStoppedTime # important ** it will help find the worst stop-the-world pauses
-XX:+PrintPromotionFailure         # helps identify promotion failures that lead to full garbage collections

The -XX:+PrintGCApplicationStoppedTime option helped me zero in on GCs that did not end in an OutOfMemoryError but were causing disturbingly long pauses.
Once you have a problem, consider getting a "heap dump" of an active service. Or, go harvest any dumps that were created when a service failed with an OutOfMemoryError:
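
One way to do that zeroing in, sketched under assumptions: pull the "Total time for which application threads were stopped" lines out of the GC log and sort them by duration. The `worst_pauses` function name is hypothetical, and the sample log lines are made up to mimic the Java 8 HotSpot format.

```shell
#!/bin/sh
# worst_pauses LOGFILE [N]
# Prints the N longest stopped-time entries (default 5), longest first,
# as "<seconds> <timestamp prefix>".
worst_pauses() {
  grep 'Total time for which application threads were stopped' "$1" \
    | sed -E 's/^(.*)Total time for which application threads were stopped: ([0-9.]+) seconds.*$/\2 \1/' \
    | LC_ALL=C sort -rn \
    | head -n "${2:-5}"
}

# Made-up sample data, only to demonstrate the function.
SAMPLE_LOG="$(mktemp)"
cat > "$SAMPLE_LOG" <<'EOF'
2015-05-04T10:00:01.100+0000: 1.100: Total time for which application threads were stopped: 0.0021000 seconds
2015-05-04T10:00:07.300+0000: 7.300: Total time for which application threads were stopped: 1.2500000 seconds, Stopping threads took: 0.0000450 seconds
2015-05-04T10:00:09.900+0000: 9.900: Total time for which application threads were stopped: 0.0300000 seconds
EOF

# Show the two longest pauses; the 1.25 second entry comes first.
worst_pauses "$SAMPLE_LOG" 2
```

The timestamp printed next to each duration is what you would use to find the matching collection in the rest of the log.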
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=<path>
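
A minimal sketch of both cases, assuming a hypothetical dump directory and a hypothetical `dump_heap` helper: the two flags cover the dump-on-failure case, and the JDK's `jmap` tool covers the "active service" case. Note that the `live` option makes the JVM run a full GC before dumping, so expect a pause on a busy service.

```shell
#!/bin/sh
# Hypothetical dump directory; it must exist and be writable by the service user.
HEAP_DUMP_DIR="/var/tmp/heapdumps"

# Dump automatically when the JVM throws an OutOfMemoryError.
JVM_OPTS="${JVM_OPTS:-}"
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=${HEAP_DUMP_DIR}"

# dump_heap PID
# Takes an on-demand dump of a running JVM with jmap (ships with the JDK).
dump_heap() {
  pid="$1"
  out="${HEAP_DUMP_DIR}/heap-${pid}-$(date +%Y%m%dT%H%M%S).hprof"
  jmap -dump:live,format=b,file="$out" "$pid"
}

# Usage (run as the same user as the target JVM):
#   dump_heap 12345
```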

## JVM Logging

A quick reference (somewhat old) for the different JVM logging options can be found here (in the "GC logging options" section).

## Tuning ...

The amount of effort that goes into tuning ParNew/CMS can be seen in CASSANDRA-8150. The new default for Cassandra is G1GC.
A nice outline of the process of tuning the JVM can be found here.

## Including "agents"

You can add something like jHiccup to help measure pauses.
There is an example of the jamm agent being added in cassandra.sh.
Also, you might want to include a Metrics jar, such as metrics-core, for taking measurements.
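
Agents like the ones mentioned above are attached with the JVM's `-javaagent:` option. The sketch below is one possible way to wire that into a start script; the jar paths and the `add_agent` helper are hypothetical, and any agent-specific arguments should be taken from each project's own documentation.

```shell
#!/bin/sh
JVM_OPTS="${JVM_OPTS:-}"

# Hypothetical install locations for the agent jars.
JHICCUP_JAR="/opt/agents/jHiccup.jar"
JAMM_JAR="/opt/agents/jamm.jar"

# add_agent JAR
# Appends a -javaagent option only when the jar is readable, because the
# JVM refuses to start if an agent jar cannot be opened.
add_agent() {
  if [ -r "$1" ]; then
    JVM_OPTS="$JVM_OPTS -javaagent:$1"
  else
    echo "agent jar not found, skipping: $1" >&2
  fi
}

add_agent "$JHICCUP_JAR"
add_agent "$JAMM_JAR"

echo "$JVM_OPTS"
```

A library such as metrics-core is not an agent, so it would go on the service's classpath instead.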