*** Figuring out what "no vm operation" means here in -XX:+PrintSafepointStatistics:

vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.320: Deoptimize [ 7 0 0 ] [ 0 0 0 0 0 ] 0
1.320: no vm operation [ 8 1 1 ] [ 0 0 0 0 0 ] 1
4.061: EnableBiasedLocking [ 8 1 1 ] [ 0 0 0 0 0 ] 1
5.222: RevokeBias [ 8 0 1 ] [ 0 0 0 0 0 ] 0
5.831: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
6.440: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
7.049: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
7.659: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
8.267: RevokeBias [ 8 0 1 ] [ 0 0 0 0 0 ] 0
8.875: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
9.482: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
10.091: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
10.700: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
11.308: RevokeBias [ 8 0 0 ] [ 0 0 0 0 0 ] 0
11.354: no vm operation [ 6 1 1 ] [ 0 0 0 0 0 ] 0
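The log above came from a JMH run, but any small workload that briefly contends on a lock and then idles should produce similar lines. As a hedged illustration (the class name, iteration counts, and sleep are mine, not from the gist), running something like this with -XX:+PrintSafepointStatistics should show the periodic "no vm operation" safepoints once the program goes idle past GuaranteedSafepointInterval:

```java
// Hypothetical tiny workload, not from the gist. Run with:
//   java -XX:+PrintSafepointStatistics SafepointLoad
public class SafepointLoad {
    // CPU-burning helper; returns the sum 0..n-1 so the loop is not dead code.
    static long work(long n) {
        long acc = 0;
        for (long i = 0; i < n; i++) acc += i;
        return acc;
    }

    public static void main(String[] args) throws Exception {
        final Object lock = new Object();
        Thread t = new Thread(() -> {
            synchronized (lock) { work(10_000_000); } // second locker; may trigger bias revocation
        });
        synchronized (lock) { work(10_000_000); }     // bias the lock toward main first
        t.start();
        t.join();
        Thread.sleep(2000); // idle: only periodic "no vm operation" safepoints remain
        System.out.println("done: " + work(10));
    }
}
```

Whether RevokeBias actually shows up depends on biased-locking heuristics and the startup delay; the "no vm operation" entries during the idle tail are the reliable part.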
1) Getting our hands dirty! Let's patch the VM to figure out what is going on.
First off, we figure out that "no vm operation" is printed when vm_operation() is NULL.
So, let's proactively crash the VM on that condition. Second, let's find all the call
sites of the safepoint enter sequence, and mark them up to figure out which one leads to NULL:
diff -r 5832cdaf89c6 src/share/vm/runtime/safepoint.cpp
--- a/src/share/vm/runtime/safepoint.cpp Mon Dec 16 08:24:33 2013 -0500
+++ b/src/share/vm/runtime/safepoint.cpp Thu Dec 19 16:38:40 2013 +0400
@@ -382,6 +382,10 @@
   // Update the count of active JNI critical regions
   GC_locker::set_jni_lock_count(_current_jni_active_count);

+  if (VMThread::vm_operation() == NULL) {
+    ShouldNotReachHere();
+  }
+
   if (TraceSafepoint) {
     VM_Operation *op = VMThread::vm_operation();
     tty->print_cr("Entering safepoint region: %s", (op != NULL) ? op->name() : "no vm operation");
diff -r 5832cdaf89c6 src/share/vm/runtime/vmThread.cpp
--- a/src/share/vm/runtime/vmThread.cpp Mon Dec 16 08:24:33 2013 -0500
+++ b/src/share/vm/runtime/vmThread.cpp Thu Dec 19 16:38:40 2013 +0400
@@ -285,6 +285,7 @@
   }

   // 4526887 let VM thread exit at Safepoint
+  tty->print("MARK 1: %s\n", vm_operation() == NULL ? "null" : vm_operation()->name());
   SafepointSynchronize::begin();

   if (VerifyBeforeExit) {
@@ -453,6 +454,7 @@
       // 'GuaranteedSafepointInterval' milliseconds. This will run all
       // the clean-up processing that needs to be done regularly at a
       // safepoint
+      tty->print("MARK 2: %s\n", vm_operation() == NULL ? "null" : vm_operation()->name());
       SafepointSynchronize::begin();
 #ifdef ASSERT
       if (GCALotAtAllSafepoints) InterfaceSupport::check_gc_alot();
@@ -493,6 +495,7 @@
         _vm_queue->set_drain_list(safepoint_ops); // ensure ops can be scanned

+        tty->print("MARK 3: %s\n", vm_operation() == NULL ? "null" : vm_operation()->name());
         SafepointSynchronize::begin();
         evaluate_operation(_cur_vm_operation);

         // now process all queued safepoint ops, iteratively draining
@@ -574,6 +577,7 @@
   bool max_time_exceeded = GuaranteedSafepointInterval != 0 && (interval > GuaranteedSafepointInterval);
   if (SafepointALot || max_time_exceeded) {
     HandleMark hm(VMThread::vm_thread());
+    tty->print("MARK 4: %s\n", vm_operation() == NULL ? "null" : vm_operation()->name());
     SafepointSynchronize::begin();
     SafepointSynchronize::end();
   }
@@ -662,6 +666,7 @@
     _cur_vm_operation = op;

     if (op->evaluate_at_safepoint() && !SafepointSynchronize::is_at_safepoint()) {
+      tty->print("MARK 5: %s\n", vm_operation() == NULL ? "null" : vm_operation()->name());
       SafepointSynchronize::begin();
       op->evaluate();
       SafepointSynchronize::end();
2) Build the VM, run any long workload:

# Fork: N/A, test runs in the existing VM
# Warmup: 1 iterations, 100 ms each
# Measurement: 5 iterations, 100 ms each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Running: org.openjdk.jmh.benchmarks.BlackholeConsumeCPUBench.consume_00001
# Warmup Iteration 1: 2.304 ns/op
Iteration 1: 1.641 ns/op
Iteration 2: MARK 2: null
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/safepoint.cpp:386
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/home/shade/trunks/hotspot-rt/hotspot/src/share/vm/runtime/safepoint.cpp:386), pid=8456, tid=140425471211264
# Error: ShouldNotReachHere()
#
# JRE version: OpenJDK Runtime Environment (8.0) (build 1.8.0-internal-fastdebug-shade_2013_12_19_15_53-b00)
# Java VM: OpenJDK 64-Bit Server VM (25.0-b64-fastdebug mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/shade/projects/jmh/hs_err_pid8456.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
Current thread is 140425471211264
Dumping core ...
Aborted (core dumped)
3) Got it, this is our assert firing. The last mark printed before the crash was "MARK 2",
and it printed "null". Looking back at the HotSpot code, we stumble upon this block:

      if (timedout && (SafepointALot ||
                       SafepointSynchronize::is_cleanup_needed())) {
        MutexUnlockerEx mul(VMOperationQueue_lock,
                            Mutex::_no_safepoint_check_flag);
        // Force a safepoint since we have not had one for at least
        // 'GuaranteedSafepointInterval' milliseconds. This will run all
        // the clean-up processing that needs to be done regularly at a
        // safepoint
        tty->print("MARK 2: %s\n", vm_operation() == NULL ? "null" : vm_operation()->name());
        SafepointSynchronize::begin();
#ifdef ASSERT
        if (GCALotAtAllSafepoints) InterfaceSupport::check_gc_alot();
#endif
        SafepointSynchronize::end();
      }
4) Reading the comment, we understand this is the collateral cleanup safepoint, forced
when no VM operation has run for GuaranteedSafepointInterval milliseconds: there really
is no VM operation, only cleanup work.

5) Let's confirm with -XX:GuaranteedSafepointInterval=90000000.
Done: no "no vm operation" entries in the log.
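The confirmation run can be sketched roughly like this (the jar name and benchmark selector are placeholders, not from the gist; -XX:+UnlockDiagnosticVMOptions may or may not be required depending on the JDK build, since GuaranteedSafepointInterval is a diagnostic flag in some versions):

```shell
# Placeholder jar/benchmark names; substitute your own long-running workload.
java -XX:+UnlockDiagnosticVMOptions \
     -XX:+PrintSafepointStatistics \
     -XX:GuaranteedSafepointInterval=90000000 \
     -jar benchmarks.jar BlackholeConsumeCPUBench.consume_00001 \
  | tee safepoint.log

# With the interval stretched to ~25 hours, the periodic cleanup safepoint
# never fires during the run, so this should match nothing:
grep "no vm operation" safepoint.log
```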
Go ahead and submit a patch, Gleb.
Actually, the real cleanup work is in:
void SafepointSynchronize::do_cleanup_tasks() {
  jlong cleanup_time;

  // Update fat-monitor pool, since this is a safepoint.
  if (TraceSafepoint) {
    cleanup_time = os::javaTimeNanos();
  }

  ObjectSynchronizer::deflate_idle_monitors();
  InlineCacheBuffer::update_inline_caches();

  if (UseCounterDecay && CounterDecay::is_decay_needed()) {
    CounterDecay::decay();
  }

  NMethodSweeper::sweep();

  if (TraceSafepoint) {
    tty->print_cr("do_cleanup_tasks takes "INT64_FORMAT_W(6) "ms",
                  (os::javaTimeNanos() - cleanup_time) / MICROUNITS);
  }
}
I see monitor deflation -- ok, understandable.
Clearing "inline caches" (http://code.metager.de/source/xref/openjdk/jdk6/hotspot/src/share/vm/code/icBuffer.cpp) and NMethodSweeper::sweep (http://cr.openjdk.java.net/~stefank/6989984.1/raw_files/old/src/share/vm/runtime/sweeper.cpp) seem to manage JIT-ed method blobs.
In any case, GuaranteedSafepointInterval=1000 by default -- do we really need monitor deflation and JIT cleanup this often?
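For context on why idle-monitor deflation exists at all, here is a hedged sketch (class and method names are mine, not from the gist): wait()/notify() forces HotSpot to inflate the thin lock into a full ObjectMonitor, and once contention goes away that monitor sits idle until a cleanup safepoint runs ObjectSynchronizer::deflate_idle_monitors().

```java
public class InflateDemo {
    // Performs one wait/notify handshake on a fresh lock object.
    // wait() forces HotSpot to inflate the lock into an ObjectMonitor;
    // after this method returns, that monitor is idle and is only
    // reclaimed by deflation at a later cleanup safepoint.
    static boolean inflateOnce() {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(1000); // inflates the monitor; times out if notify is missed
                } catch (InterruptedException ignored) { }
            }
        });
        waiter.start();
        try {
            Thread.sleep(50);                         // let the waiter park in wait()
            synchronized (lock) { lock.notifyAll(); } // wake it up
            waiter.join();
        } catch (InterruptedException e) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("inflated and released: " + inflateOnce());
    }
}
```

A workload that churns through many such short-lived monitors is exactly what the periodic cleanup is guarding against; whether every-second deflation is the right default is the question being raised above.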
Go ahead and submit a patch, @cheremin.
Gosh! Shouldn't it be reflected in the logs properly?
(Go ahead and submit a patch, Gleb)