Hypothesis: a recent code change introduced a bug that shows up under load as dropped spans.
Validation approach: generate heavy load and check whether any spans are dropped.
Conclusion: The hypothesis isn't supported. There could be a different explanation for the dropped spans, possibly related to the data itself.
Changed brave-webmvc-example to send many small, separate requests to Zipkin instead of buffering spans, by capping the reporter's message size at 512 bytes.
diff --git a/webmvc4-boot/src/main/java/brave/webmvc/TracingConfiguration.java b/webmvc4-boot/src/main/java/brave/webmvc/TracingConfiguration.java
index a3e914c..6c4a979 100644
--- a/webmvc4-boot/src/main/java/brave/webmvc/TracingConfiguration.java
+++ b/webmvc4-boot/src/main/java/brave/webmvc/TracingConfiguration.java
@@ -43,7 +43,9 @@ public class TracingConfiguration extends WebMvcConfigurerAdapter {
   /** Configuration for how to buffer spans into messages for Zipkin */
   @Bean AsyncReporter<Span> spanReporter() {
-    return AsyncReporter.create(sender());
+    return AsyncReporter.builder(sender())
+        .messageMaxBytes(512)
+        .build();
   }
   /** Controls aspects of tracing such as the service name that shows up in the UI */
Run the Zipkin server with autocomplete indexing enabled:
$ AUTOCOMPLETE_KEYS=http.method,environment STORAGE_TYPE=cassandra3 java -jar zipkin.jar
Once the services are running, apply heavy load to help flush out any concurrency problems:
$ wrk -t4 -c128 -d1m http://localhost:8081 --latency
Running 1m test @ http://localhost:8081
4 threads and 128 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 68.50ms 77.77ms 1.38s 88.76%
Req/Sec 622.08 243.02 1.27k 63.06%
Latency Distribution
50% 46.64ms
75% 90.21ms
90% 155.38ms
99% 369.99ms
147480 requests in 1.00m, 20.00MB read
Socket errors: connect 0, read 78, write 2, timeout 0
Requests/sec: 2456.91
Transfer/sec: 341.14KB
$ wrk -t4 -c128 -d1m http://localhost:8081 --latency
Running 1m test @ http://localhost:8081
4 threads and 128 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 36.35ms 33.26ms 432.87ms 80.94%
Req/Sec 1.04k 166.28 1.59k 71.44%
Latency Distribution
50% 28.25ms
75% 50.29ms
90% 78.84ms
99% 154.21ms
248995 requests in 1.00m, 33.76MB read
Requests/sec: 4149.26
Transfer/sec: 576.14KB
Verify that the collector metrics report no dropped spans or messages:
$ curl -s localhost:9411/metrics|jq .
{
"counter.zipkin_collector.messages.http": 102682,
"counter.zipkin_collector.spans_dropped.http": 0,
"gauge.zipkin_collector.message_bytes.http": 253,
"counter.zipkin_collector.bytes.http": 38760897,
"gauge.zipkin_collector.message_spans.http": 1,
"counter.zipkin_collector.spans.http": 117899,
"counter.zipkin_collector.messages_dropped.http": 0
}
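As a rough cross-check of the numbers above, the sketch below parses the flat JSON returned by the metrics endpoint, sums every metric whose name contains "dropped", and derives the average spans and bytes per message. The class name CollectorMetricsCheck and its helper methods are hypothetical and not part of Zipkin. The regex-based parsing assumes the flat, integer-valued JSON shown here rather than arbitrary JSON.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Zipkin.
class CollectorMetricsCheck {
  // Matches flat entries such as "counter.zipkin_collector.spans.http": 117899
  private static final Pattern ENTRY =
      Pattern.compile("\"([^\"]+)\"\\s*:\\s*(\\d+)");

  /** Parses a flat JSON object of non-negative integer metrics into a name-to-value map. */
  static Map<String, Long> parse(String json) {
    Map<String, Long> metrics = new LinkedHashMap<>();
    Matcher m = ENTRY.matcher(json);
    while (m.find()) {
      metrics.put(m.group(1), Long.parseLong(m.group(2)));
    }
    return metrics;
  }

  /** Sums every metric whose name contains "dropped". */
  static long droppedTotal(Map<String, Long> metrics) {
    long total = 0;
    for (Map.Entry<String, Long> e : metrics.entrySet()) {
      if (e.getKey().contains("dropped")) {
        total += e.getValue();
      }
    }
    return total;
  }

  /** Divides one metric by another, failing loudly if either is missing or the divisor is zero. */
  static double ratio(Map<String, Long> metrics, String numerator, String denominator) {
    Long n = metrics.get(numerator);
    Long d = metrics.get(denominator);
    if (n == null || d == null || d == 0) {
      throw new IllegalArgumentException("missing or zero metric");
    }
    return (double) n / d;
  }

  public static void main(String[] args) {
    // Values copied from the /metrics output above.
    String json = "{"
        + "\"counter.zipkin_collector.messages.http\": 102682,"
        + "\"counter.zipkin_collector.spans_dropped.http\": 0,"
        + "\"gauge.zipkin_collector.message_bytes.http\": 253,"
        + "\"counter.zipkin_collector.bytes.http\": 38760897,"
        + "\"gauge.zipkin_collector.message_spans.http\": 1,"
        + "\"counter.zipkin_collector.spans.http\": 117899,"
        + "\"counter.zipkin_collector.messages_dropped.http\": 0"
        + "}";
    Map<String, Long> metrics = parse(json);

    System.out.println("dropped total: " + droppedTotal(metrics));
    System.out.println(String.format("avg spans per message: %.2f",
        ratio(metrics, "counter.zipkin_collector.spans.http",
            "counter.zipkin_collector.messages.http")));
    System.out.println(String.format("avg bytes per message: %.2f",
        ratio(metrics, "counter.zipkin_collector.bytes.http",
            "counter.zipkin_collector.messages.http")));
  }
}
```

With the figures above this works out to roughly 1.15 spans and about 377 bytes per message, which is consistent with the 512-byte cap producing mostly single-span messages.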