@danielmitterdorfer
Last active August 2, 2016 12:10
JSON Parser Benchmark
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.elasticsearch.benchmark.json;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.TimeUnit;
@Fork(3)
@Warmup(iterations = 20)
@Measurement(iterations = 20)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@SuppressWarnings("unused") //invoked by benchmarking framework
public class JsonParserBenchmark {
    public String smallJson = "{\n" +
        "    \"query\": {\n" +
        "        \"match_all\": {}\n" +
        "    }\n" +
        "}\n";
    public byte[] smallJsonBytes = smallJson.getBytes(StandardCharsets.UTF_8);
public String largeJson = "{\n" +
" \"meta\": {\n" +
" \"short-description\": \"Standard benchmark in Rally (8.6M POIs from Geonames)\",\n" +
" \"description\": \"This test indexes 8.6M documents (POIs from Geonames, total 2.8 GB json) using 8 client threads and 5000 " +
"docs per bulk request against Elasticsearch\",\n" +
" \"data-url\": \"http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames\"\n" +
" },\n" +
" \"indices\": [\n" +
" {\n" +
" \"name\": \"geonames\",\n" +
" \"types\": [\n" +
" {\n" +
" \"name\": \"type\",\n" +
" \"mapping\": \"mappings.json\",\n" +
" \"documents\": \"documents.json.bz2\",\n" +
" \"document-count\": 8647880,\n" +
" \"compressed-bytes\": 197857614,\n" +
" \"uncompressed-bytes\": 2790927196\n" +
" }\n" +
" ]\n" +
" }\n" +
" ],\n" +
" \"operations\": [\n" +
" {\n" +
" \"name\": \"index-append-default-settings\",\n" +
" \"type\": \"index\",\n" +
" \"index-settings\": {\n" +
" \"index.number_of_replicas\": 0\n" +
" },\n" +
" \"bulk-size\": 5000,\n" +
" \"force-merge\": true,\n" +
" \"clients\": {\n" +
" \"count\": 8\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"index-append-fast-settings\",\n" +
" \"type\": \"index\",\n" +
" \"index-settings\": {\n" +
" \"index.number_of_replicas\": 0,\n" +
" \"index.refresh_interval\": \"30s\",\n" +
" \"index.number_of_shards\": 6,\n" +
" \"index.translog.flush_threshold_size\": \"4g\"\n" +
" },\n" +
" \"force-merge\": true,\n" +
" \"bulk-size\": 5000,\n" +
" \"clients\": {\n" +
" \"count\": 8\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"index-append-update-fast-settings\",\n" +
" \"type\": \"index\",\n" +
" \"index-settings\": {\n" +
" \"index.number_of_replicas\": 0,\n" +
" \"index.refresh_interval\": \"30s\",\n" +
" \"index.number_of_shards\": 6,\n" +
" \"index.translog.flush_threshold_size\": \"4g\"\n" +
" },\n" +
" \"bulk-size\": 5000,\n" +
" \"force-merge\": true,\n" +
" \"conflicts\": \"sequential\",\n" +
" \"clients\": {\n" +
" \"count\": 8\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"stats\",\n" +
" \"type\": \"stats\",\n" +
" \"warmup-iterations\": 100,\n" +
" \"iterations\": 100,\n" +
" \"clients\": {\n" +
" \"count\": 1\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"search\",\n" +
" \"type\": \"search\",\n" +
" \"target-throughput\": 1,\n" +
" \"warmup-iterations\": 100,\n" +
" \"iterations\": 100,\n" +
" \"clients\": {\n" +
" \"count\": 1\n" +
" },\n" +
" \"queries\": [\n" +
" {\n" +
" \"name\": \"default\",\n" +
" \"body\": {\n" +
" \"query\": {\n" +
" \"match_all\": {}\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"term\",\n" +
" \"body\": {\n" +
" \"query\": {\n" +
" \"term\": {\n" +
" \"country_code\": \"AT\"\n" +
" }\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"phrase\",\n" +
" \"body\": {\n" +
" \"query\": {\n" +
" \"match_phrase\": {\n" +
" \"name\": \"Sankt Georgen\"\n" +
" }\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"country_agg_uncached\",\n" +
" \"cache\": false,\n" +
" \"body\": {\n" +
" \"size\": 0,\n" +
" \"aggs\": {\n" +
" \"country_population\": {\n" +
" \"terms\": {\n" +
" \"field\": \"country_code\"\n" +
" },\n" +
" \"aggs\": {\n" +
" \"sum_population\": {\n" +
" \"sum\": {\n" +
" \"field\": \"population\"\n" +
" }\n" +
" }\n" +
" }\n" +
" }\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"country_agg_cached\",\n" +
" \"cache\": true,\n" +
" \"body\": {\n" +
" \"size\": 0,\n" +
" \"aggs\": {\n" +
" \"country_population\": {\n" +
" \"terms\": {\n" +
" \"field\": \"country_code\"\n" +
" },\n" +
" \"aggs\": {\n" +
" \"sum_population\": {\n" +
" \"sum\": {\n" +
" \"field\": \"population\"\n" +
" }\n" +
" }\n" +
" }\n" +
" }\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"scroll\",\n" +
" \"query-type\": \"scroll\",\n" +
" \"pages\": 25,\n" +
" \"results-per-page\": 1000,\n" +
" \"body\": {\n" +
" \"query\": {\n" +
" \"match_all\": {}\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"expression\",\n" +
" \"body\": {\n" +
" \"query\": {\n" +
" \"function_score\": {\n" +
" \"query\": {\n" +
" \"match_all\": {}\n" +
" },\n" +
" \"functions\": [\n" +
" {\n" +
" \"script_score\": {\n" +
" \"script\": {\n" +
" \"inline\": \"(ln(abs(doc['population'])) + doc['elevation'] + doc['latitude']) * _score\",\n" +
" \"lang\": \"expression\"\n" +
" }\n" +
" }\n" +
" }\n" +
" ]\n" +
" }\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"painless_static\",\n" +
" \"body\": {\n" +
" \"query\": {\n" +
" \"function_score\": {\n" +
" \"query\": {\n" +
" \"match_all\": {}\n" +
" },\n" +
" \"functions\": [\n" +
" {\n" +
" \"script_score\": {\n" +
" \"script\": {\n" +
" \"inline\": \"(Math.log(Math.abs((int)((List)doc.population).get(0))) + (double)((List)doc.elevation)" +
".get(0) * (double)((List)doc.latitude).get(0))/_score\",\n" +
" \"lang\" : \"painless\"\n" +
" }\n" +
" }\n" +
" }\n" +
" ]\n" +
" }\n" +
" }\n" +
" }\n" +
" },\n" +
" {\n" +
" \"name\": \"painless_dynamic\",\n" +
" \"body\": {\n" +
" \"query\": {\n" +
" \"function_score\": {\n" +
" \"query\": {\n" +
" \"match_all\": {}\n" +
" },\n" +
" \"functions\": [\n" +
" {\n" +
" \"script_score\": {\n" +
" \"script\": {\n" +
" \"inline\": \"(Math.log(Math.abs(doc['population'].value)) + doc['elevation'].value * doc['latitude']" +
".value)/_score\",\n" +
" \"lang\" : \"painless\"\n" +
" }\n" +
" }\n" +
" }\n" +
" ]\n" +
" }\n" +
" }\n" +
" }\n" +
" } \n" +
" ]\n" +
" }\n" +
" ],\n" +
" \"challenges\": [\n" +
" {\n" +
" \"name\": \"append-no-conflicts\",\n" +
" \"description\": \"\",\n" +
" \"schedule\": [\n" +
" \"index-append-default-settings\",\n" +
" \"stats\",\n" +
" \"search\"\n" +
" ]\n" +
" },\n" +
" {\n" +
" \"name\": \"append-fast-no-conflicts\",\n" +
" \"description\": \"\",\n" +
" \"schedule\": [\n" +
" \"index-append-fast-settings\"\n" +
" ]\n" +
" },\n" +
" {\n" +
" \"name\": \"append-fast-with-conflicts\",\n" +
" \"description\": \"\",\n" +
" \"schedule\": [\n" +
" \"index-append-update-fast-settings\"\n" +
" ]\n" +
" }\n" +
"\n" +
" ]\n" +
"}\n" +
"\n";
    public byte[] largeJsonBytes = largeJson.getBytes(StandardCharsets.UTF_8);

    @Benchmark
    public Map<String, Object> smallJson() throws IOException {
        XContentParser xContentParser = JsonXContent.jsonXContent.createParser(smallJsonBytes);
        return xContentParser.map();
    }

    @Benchmark
    public Map<String, Object> largeJson() throws IOException {
        XContentParser xContentParser = JsonXContent.jsonXContent.createParser(largeJsonBytes);
        return xContentParser.map();
    }

    public static void main(String[] args) {
        System.out.println(new JsonParserBenchmark().largeJsonBytes.length);
    }
}

Benchmark Setup

  • System: Linux 4.6.4-1-ARCH
  • JVM: Oracle Java 1.8.0_92-b14
  • CPU: Intel(R) Xeon(R) CPU E3-1270 v5 @ 3.60GHz (CPU frequency for all cores locked at 3.4 GHz, performance CPU governor)

Invocation

taskset -c 0 java -jar elasticsearch-benchmarks-5.0.0-alpha5-SNAPSHOT-benchmarks.jar .*Json.*

Results

Below are the results for both configurations. Scores are the average time per operation, that is, per invocation of the benchmark method (smaller is better).

JsonParser.Feature.STRICT_DUPLICATE_DETECTION: false:

Benchmark                      Mode  Cnt   Score   Error  Units
JsonParserBenchmark.largeJson  avgt   60  19.414 ± 0.044  us/op
JsonParserBenchmark.smallJson  avgt   60   0.479 ± 0.001  us/op

JsonParser.Feature.STRICT_DUPLICATE_DETECTION: true:

Benchmark                      Mode  Cnt   Score   Error  Units
JsonParserBenchmark.largeJson  avgt   60  20.642 ± 0.064  us/op
JsonParserBenchmark.smallJson  avgt   60   0.487 ± 0.001  us/op
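
To give a feel for where the extra time may come from: rejecting a repeated key requires the parser to remember the field names it has already seen in the current object. The following is a conceptual sketch only, assuming the field names of one object are already available as a list. It is not Jackson's implementation, and `DuplicateKeyCheck` and `firstDuplicate` are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Conceptual illustration only; this is NOT how Jackson implements
// STRICT_DUPLICATE_DETECTION. All names here are hypothetical.
final class DuplicateKeyCheck {

    /** Returns the first field name that occurs twice, or null if all names are unique. */
    static String firstDuplicate(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            // Set.add returns false when the name was already present.
            if (!seen.add(name)) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Field names of a single (made-up) JSON object with a repeated key.
        List<String> names = Arrays.asList("name", "type", "name");
        System.out.println("first duplicate: " + firstDuplicate(names));
    }
}
```

The bookkeeping grows with the number of fields per object, which would be consistent with the larger document showing a larger relative overhead, although the benchmark itself does not isolate this effect.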

Interpretation

For the small JSON object (49 bytes), the overhead of the duplicate check is 8 ns, or about 1.7%. For the large JSON object (6440 bytes), the overhead lies between 1.12 us [1] and 1.34 us [2], or between 5.8% and 6.9%.

[1] Best case with the duplicate check enabled: 20.578 us; worst case with the duplicate check disabled: 19.458 us.

[2] Worst case with the duplicate check enabled: 20.706 us; best case with the duplicate check disabled: 19.370 us.
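
The range for the large document can be reproduced from the reported scores and error margins. The sketch below is a hypothetical helper, not part of the benchmark: it subtracts the extremes of the two intervals and divides by the corresponding "disabled" time. It works without intermediate rounding, so its upper bound may differ slightly from a hand-rounded figure.

```java
import java.util.Locale;

// Hypothetical helper, not part of the benchmark: derives the overhead range
// of the duplicate check from JMH scores and their error margins.
final class OverheadRange {

    /** Smallest overhead: lowest "enabled" time minus highest "disabled" time. */
    static double bestCase(double disabled, double disabledErr, double enabled, double enabledErr) {
        return (enabled - enabledErr) - (disabled + disabledErr);
    }

    /** Largest overhead: highest "enabled" time minus lowest "disabled" time. */
    static double worstCase(double disabled, double disabledErr, double enabled, double enabledErr) {
        return (enabled + enabledErr) - (disabled - disabledErr);
    }

    public static void main(String[] args) {
        // largeJson results in us/op, taken from the tables above.
        double disabled = 19.414, disabledErr = 0.044;
        double enabled = 20.642, enabledErr = 0.064;

        double best = bestCase(disabled, disabledErr, enabled, enabledErr);
        double worst = worstCase(disabled, disabledErr, enabled, enabledErr);

        System.out.printf(Locale.ROOT, "absolute overhead: %.2f us to %.2f us%n", best, worst);
        System.out.printf(Locale.ROOT, "relative overhead: %.1f%% to %.1f%%%n",
                100 * best / (disabled + disabledErr),
                100 * worst / (disabled - disabledErr));
    }
}
```

Using the interval extremes rather than the two mean scores gives a conservative range; the difference of the means alone is 20.642 - 19.414 = 1.228 us.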
