@RoyBellingan
Last active November 5, 2022 12:48
Jsonic vs the world
#include <boost/json.hpp>
#include <QDebug>
#include <QElapsedTimer>
#include <QFile>
#include "jsonic.c"
#include "jsonic.h"
#include "fmt/format.h"
#include "simdjson.h"
#include <iostream>
using namespace simdjson;
namespace bj = boost::json;
using namespace std;
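int main(int argc, char* argv[]) {
    QFile canada("canada.json");
    // Bail out early instead of benchmarking an empty buffer.
    if (!canada.open(QFile::ReadOnly)) {
        qCritical() << "Cannot open canada.json";
        return 1;
    }
    auto buffer = canada.readAll().toStdString();
    string path = "/features/0/geometry/coordinates";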
    {
        // Boost.JSON: parses the whole document into a DOM first.
        // All timings are cumulative since timer.start().
        bj::monotonic_resource mr;
        QElapsedTimer timer;
        timer.start();
        auto jv = bj::parse(buffer, &mr);
        auto t1 = timer.nsecsElapsed();
        auto& coordinates = jv.at_pointer(path).as_array();
        auto size = coordinates.size();
        auto t2 = timer.nsecsElapsed();
        uint total = 0;
        for (auto& inner : coordinates) {
            total += inner.as_array().size();
        }
        auto t3 = timer.nsecsElapsed();
        fmt::print(R"(
Boost.Json
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}
{:<33}:{:>10}
)",
                   "Initial Read", t1,
                   "feature/0/geometry length", t2,
                   "feature/0/geometry unrolling", t3,
                   "external length", size,
                   "total number of element", total);
    }
    {
        // JSONIC: walks the raw buffer on demand, one lookup per path step.
        // All timings are cumulative since timer.start().
        QElapsedTimer timer;
        char* copy = buffer.data();
        timer.start();
        jsonic_node_t* root = jsonic_get_root(copy);
        auto t1 = timer.nsecsElapsed();
        jsonic_node_t* features = jsonic_object_get(copy, root, "features");
        auto t2 = timer.nsecsElapsed();
        jsonic_node_t* feature = jsonic_array_get(copy, features, 0);
        auto t3 = timer.nsecsElapsed();
        jsonic_node_t* geometry = jsonic_object_get(copy, feature, "geometry");
        auto t4 = timer.nsecsElapsed();
        jsonic_node_t* coordinates = jsonic_object_get(copy, geometry, "coordinates");
        auto t5 = timer.nsecsElapsed();
        int coords_length = jsonic_array_length(copy, coordinates);
        auto t6 = timer.nsecsElapsed();
        uint total = 0;
        jsonic_node_t* coord = NULL;
        // Iterate the outer array; a node of type JSONIC_NONE marks the end.
        // Nodes returned here are never freed in this benchmark.
        for (;;) {
            coord = jsonic_array_iter(copy, coordinates, coord, 0);
            //coord = jsonic_array_iter_free(copy, coordinates, coord, 0);
            if (coord->type == JSONIC_NONE)
                break;
            total += jsonic_array_length(copy, coord);
        }
        auto t7 = timer.nsecsElapsed();
        fmt::print(R"(
JSONIC
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}
{:<33}:{:>10}
)",
                   "Initial Read", t1,
                   "Feature", t2,
                   "feature/0", t3,
                   "feature/0/geometry", t4,
                   "feature/0/geometry/coordinates", t5,
                   "feature/0/geometry length", t6,
                   "feature/0/geometry unrolling", t7,
                   "external length", coords_length,
                   "total number of element", total);
    }
    {
        // simdjson On Demand: values are parsed lazily while iterating.
        // The file is loaded into a padded buffer before the timer starts.
        ondemand::parser parser;
        padded_string json = padded_string::load("canada.json");
        QElapsedTimer timer;
        timer.start();
        ondemand::document canada = parser.iterate(json);
        auto t1 = timer.nsecsElapsed();
        uint size = 0;
        uint total = 0;
        for (auto&& v : canada.at_pointer(path).get_array()) {
            size++;
            for (auto&& v2 : v.get_array()) {
                (void)v2; // only counting, the value itself is not needed
                total++;
            }
        }
        auto t2 = timer.nsecsElapsed();
        fmt::print(R"(
SIMD Json
{:<33}:{:>10}ns
{:<33}:{:>10}ns
{:<33}:{:>10}
{:<33}:{:>10}
)",
                   "Initial Read", t1,
                   "feature/0/geometry length", t2,
                   "external length", size,
                   "total number of element", total);
    }
}
@RoyBellingan
Author

image

@rohanrhu

rohanrhu commented Nov 3, 2022

Thankkk youuuu for benchmarking ❤️

Meowww, the issue is that jsonic_array_length() was the last thing I wrote, so I just used jsonic_get() to count array elements... I think it must still be faster than Booooost's JSON library for most real-life usage. If you are not using everything from the JSON, it must be incomparably faster.

Buttttttttt I will rewrite that stupid jsonic_array_length(), and then it will be close to SIMD JSON; buffering libraries are not even opponentsss.
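
To illustrate the difference between counting by repeated lookups and counting in one pass, here is a minimal C++ sketch. It is not jsonic's implementation: `count_array_elements` is a hypothetical helper, and it assumes the input is otherwise well-formed JSON. It scans the array once, tracks nesting depth and string state, and counts only the commas at the array's own level.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of jsonic.
// Counts the top-level elements of the JSON array whose '[' is at json[start],
// in a single left-to-right pass. Returns -1 if json[start] is not '[' or the
// array is never closed. Assumes the input is otherwise well-formed JSON.
long count_array_elements(std::string_view json, std::size_t start) {
    if (start >= json.size() || json[start] != '[') return -1;
    long commas = 0;          // commas seen at depth 1
    int depth = 0;            // nesting depth of [] and {}
    bool in_string = false;   // inside a "..." literal
    bool seen_value = false;  // a value started since the last depth-1 comma
    for (std::size_t i = start; i < json.size(); ++i) {
        const char c = json[i];
        if (in_string) {
            if (c == '\\') ++i;                  // skip the escaped character
            else if (c == '"') in_string = false;
            continue;
        }
        switch (c) {
        case '"':
            in_string = true;
            if (depth == 1) seen_value = true;
            break;
        case '[': case '{':
            if (depth == 1) seen_value = true;
            ++depth;
            break;
        case ']': case '}':
            --depth;
            if (depth == 0) return commas + (seen_value ? 1 : 0);
            break;
        case ',':
            if (depth == 1) { ++commas; seen_value = false; }
            break;
        case ' ': case '\t': case '\n': case '\r':
            break;
        default:                                 // digits, true/false/null, ':'
            if (depth == 1) seen_value = true;
        }
    }
    return -1;  // ran off the end without closing the array
}
```

Called as `count_array_elements(buffer, offset_of_opening_bracket)`, it never revisits a byte, so the cost grows with the size of the array's text rather than with the number of elements times the lookup cost.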

@rohanrhu

rohanrhu commented Nov 5, 2022

Re-engineered better jsonic_array_length() and jsonic_array_length_from() 🙀
rohanrhu/jsonic@49330f0

I think this must perform counting way better 😊

@rohanrhu

rohanrhu commented Nov 5, 2022

Hi again, I've added more tricks :)

rohanrhu/jsonic@4b3ac34

@rohanrhu

rohanrhu commented Nov 5, 2022

Meowww...

image

@RoyBellingan
Author

Nid moar alignement!

Ciao, I did not use the Makefile you provided; I just #included both the C file and the header to help with LTO (https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html), so it was already compiled at -O3.

So you are telling me that the slow part of JSON parsing is building the DOM, and that for small JSON documents, where you only read a few values once, it is probably better to process on the fly...

So even if you have to count an element near the end of a 2 MB JSON file, it is still better to iterate through the 400+ arrays before it...

Anyway, I will take a look at the other changes you made, redo the test above, and remove the Qt dependency so you can easily run it on your side.
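
One possible way to drop the Qt dependency, sketched under assumptions: `ElapsedTimer` and `read_file` are hypothetical names, not code from the gist. They stand in for `QElapsedTimer` and `QFile` using only the standard library, and keep the same cumulative `nsecsElapsed()` semantics the benchmark relies on.

```cpp
#include <chrono>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for QElapsedTimer, built on std::chrono::steady_clock.
class ElapsedTimer {
public:
    void start() { begin_ = std::chrono::steady_clock::now(); }

    // Nanoseconds since start(); cumulative, like QElapsedTimer::nsecsElapsed().
    std::int64_t nsecsElapsed() const {
        const auto d = std::chrono::steady_clock::now() - begin_;
        return std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
    }

private:
    std::chrono::steady_clock::time_point begin_{};
};

// Hypothetical stand-in for QFile::readAll(): reads a whole file into a
// std::string and returns an empty string if the file cannot be opened.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>{});
}
```

With these two pieces, `auto buffer = read_file("canada.json");` and `ElapsedTimer timer;` could replace the Qt calls in each block without changing what is measured.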

@rohanrhu

rohanrhu commented Nov 5, 2022

Meowwww....

I can just write a "full read and return everything as a whole" function in 5 mins using the iteration functions, and it will give fixed, good performance.

But.... I don't wanna do that; there are a lot of slow libraries that do that already.. why would I simply do that without an iteration ecosystem?

This one can already parse everything with the array iteration functions and the object/KV iteration functions. So it would give the same result as all decent libraries; but why? It is different and advanced because of its iteration ecosystem. 😊

Purrr
