HTTP/2 server benchmark Jan 2015
# h2o backend (serves static files on port 3001)
listen: 3001
http2-max-concurrent-requests-per-connection: 1024
max-connections: 15000
num-threads: 1
hosts:
  localhost:
    paths:
      /:
        file.dir: /path/to/htdocs
# h2o cleartext reverse proxy (port 3000 -> 3001)
listen: 3000
http2-max-concurrent-requests-per-connection: 1024
num-threads: 1
hosts:
  localhost:
    paths:
      /:
        proxy.reverse.url: http://127.0.0.1:3001
        proxy.keepalive: ON
# h2o TLS reverse proxy (port 3000 -> 3001)
listen:
  port: 3000
  ssl:
    certificate-file: /path/to/server.crt
    key-file: /path/to/server.key
http2-max-concurrent-requests-per-connection: 1024
num-threads: 1
hosts:
  localhost:
    paths:
      /:
        proxy.reverse.url: http://127.0.0.1:3001
        proxy.keepalive: ON
# h2o TLS web server (serves static files on port 3000)
listen:
  port: 3000
  ssl:
    certificate-file: /path/to/server.crt
    key-file: /path/to/server.key
http2-max-concurrent-requests-per-connection: 1024
num-threads: 1
hosts:
  localhost:
    paths:
      /:
        file.dir: /path/to/htdocs
# h2o cleartext web server (serves static files on port 3000)
listen: 3000
http2-max-concurrent-requests-per-connection: 1024
num-threads: 1
hosts:
  localhost:
    paths:
      /:
        file.dir: /path/to/htdocs
# nghttp2 server invocations
proxy:
  nghttpx -b127.0.0.1,3001 --frontend-no-tls
tls-proxy:
  nghttpx -b127.0.0.1,3001 /path/to/server.key /path/to/server.crt
web:
  nghttpd 3000 --no-tls -d /path/to/htdocs
  tiny-nghttpd 127.0.0.1 3000 /path/to/htdocs
tls-web:
  nghttpd 3000 /path/to/server.key /path/to/server.crt -d /path/to/htdocs

HTTP/2 server benchmark, Jan 2015 edition

We ran a benchmark program called h2load against several HTTP/2 servers available as OSS. h2load is itself an OSS benchmark/stress-test tool for the HTTP/2 and SPDY protocols, and is part of the nghttp2 project.

3 implementations were tested: h2o, trusterd, and the nghttp2 servers (nghttpd, tiny-nghttpd, and nghttpx). We ran tests against standalone web servers as well as against HTTP/2 reverse proxies forwarding to a backend over an HTTP/1 link.

The machine we used for these tests is a ThinkPad X240 (because it was the fastest machine we had at the time). Its CPU is an Intel(R) Core(TM) i7-4600U running at 2.10GHz with AES-NI. It has 2 cores, each with Hyper-Threading, so 4 logical cores in total. Total memory is 8GB.

We applied the following kernel parameter tuning:

sudo bash -c 'echo 1024 65535 > /proc/sys/net/ipv4/ip_local_port_range'
sudo bash -c 'echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse'
sudo bash -c 'echo 10000 > /proc/sys/net/core/somaxconn'
ulimit -n 65536
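The same tuning can also be expressed with sysctl; this is an equivalent sketch (run as root), not the exact commands we ran:

```shell
# Widen the ephemeral port range so the load generator can open many sockets
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
# Allow reuse of TIME_WAIT sockets for new outgoing connections
sysctl -w net.ipv4.tcp_tw_reuse=1
# Raise the listen backlog ceiling
sysctl -w net.core.somaxconn=10000
# Raise the per-process open file descriptor limit (shell builtin)
ulimit -n 65536
```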

We used Ubuntu 14.10 amd64.

trusterd (from commit ab6b2fc664156a486154fbdf64527ef41a7a35ba) was built with qrintf (from commit e2ce1fed6dc72d238cd221d40f992beb1c87fd9d).

h2o (from commit de58bb3ccfb10a47a80f462db356208159660f8e) was built by simply running make, without libuv and wslay. We are not sure whether qrintf was used.

nghttp2 (from commit f1049a66e2cd59a022be95cfe9f29537e95ab144) was built with clang-3.5 and clang++-3.5 with libc++ (-stdlib=libc++).

The trusterd and h2o builds used gcc 4.9.1.

h2o configurations: cleartext, TLS, proxy, proxy-TLS

trusterd configurations: cleartext, TLS

nghttp2 configurations: see the command lines above

For the reverse proxy tests, h2o was used as the backend server; its configuration is shown above.

Here are the test results. For each benchmark, we ran the test 3 times and picked the median; the numbers are requests per second. In the h2load command lines, -n is the total number of requests, -c the number of clients, -m the maximum number of concurrent streams per client, and -w/-W set the initial stream and connection window sizes to 2^N - 1 (N=30 effectively disables flow control).
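As a small illustration (with made-up req/s figures), picking the median of three runs is just taking the middle value after sorting:

```shell
# Three hypothetical h2load results; sort numerically and take the 2nd line
printf '%s\n' 269523 268100 271044 | sort -n | sed -n '2p'
# prints 269523
```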

Cleartext with flow control

h2load -n2000000 -c500 -m100

server         6 bytes   4K bytes
h2o             269523     163631
nghttpd         257174     140236
tiny-nghttpd    294228     157339
trusterd        259686     111046

Cleartext without flow control

h2load -n2000000 -c500 -m100 -w30 -W30

server         6 bytes   4K bytes
h2o             308311          -
nghttpd         257502     195260
tiny-nghttpd    292678     220162
trusterd        259756     111486

h2load reported some errors in the h2o 4K test case, so we abandoned that particular test and reported the issue to the h2o developers.

TLS with flow control

h2load -n2000000 -c500 -m100

server         6 bytes   4K bytes
h2o             227865      78333
nghttpd         226716      80673
trusterd         62362      44020

tiny-nghttpd was omitted from the TLS test cases because it lacks TLS support.

TLS without flow control

h2load -n2000000 -c500 -m100 -w30 -W30

server         6 bytes   4K bytes
h2o             266338          -
nghttpd         227770     102708
trusterd         65283      46401

Again, h2load reported some errors in the h2o 4K test case, so we abandoned that particular test.

trusterd struggled with TLS. We did not investigate the reason, but it may be related to a TLS record size that is too small, which increases per-record overhead and hurts performance.
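As a rough illustration of why small records cost more: a TLS 1.2 AES-128-GCM record carries about 29 bytes of framing overhead (5-byte record header + 8-byte explicit nonce + 16-byte auth tag), and each record also incurs its own crypto and write overhead. The byte arithmetic for a 4K response:

```shell
# ~29 bytes of framing per TLS 1.2 AES-128-GCM record
overhead=29
payload=4096
# 4 KB sent as four 1 KB records vs. one large record
echo $(( payload + 4 * overhead ))   # 4212 bytes on the wire
echo $(( payload + 1 * overhead ))   # 4125 bytes on the wire
```

The framing cost itself is small; the larger penalty of tiny records is the per-record encryption and syscall work, which scales with the record count.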

Cleartext reverse proxy with flow control

h2load -n1000000 -c100 -m100

server     6 bytes   4K bytes
h2o          74173      46603
nghttpx      62405      40025

Cleartext reverse proxy without flow control

h2load -n1000000 -c100 -m100 -w30 -W30

server     6 bytes   4K bytes
h2o          75171      54294
nghttpx      65785      49311

TLS reverse proxy with flow control

h2load -n1000000 -c100 -m100

server     6 bytes   4K bytes
h2o          72984      37114
nghttpx      60222      35052

The backend HTTP/1 server was h2o.

TLS reverse proxy without flow control

h2load -n1000000 -c100 -m100 -w30 -W30

server     6 bytes   4K bytes
h2o          72760      40704
nghttpx      62404      41246

Again, the backend HTTP/1 server was h2o.

# conf/trusterd.conf.rb (TLS)
SERVER_NAME = "Trusterd"
SERVER_VERSION = "0.0.1"
SERVER_DESCRIPTION = "#{SERVER_NAME}/#{SERVER_VERSION}"
root_dir = "/path/to/htdocs"

s = HTTP2::Server.new({
  :port => 3000,
  :document_root => "#{root_dir}",
  :server_name => SERVER_DESCRIPTION,
  :tls => true,
  :key => "/path/to/server.key",
  :crt => "/path/to/server.crt"
})
s.run

# conf/trusterd.conf.rb (cleartext)
SERVER_NAME = "Trusterd"
SERVER_VERSION = "0.0.1"
SERVER_DESCRIPTION = "#{SERVER_NAME}/#{SERVER_VERSION}"
root_dir = "/path/to/htdocs"

s = HTTP2::Server.new({
  :port => 3000,
  :document_root => "#{root_dir}",
  :server_name => SERVER_DESCRIPTION,
  :tls => false,
})
s.run

kazuho commented Feb 2, 2015

Great benchmark! I tried to rerun some of the tests myself on AWS EC2.

I was able to reproduce the scores of H2O, but I am having difficulty reproducing the scores of nghttpd. Are there any parameters that need to be tweaked for nghttpd?

The configuration I used is at https://gist.github.com/kazuho/8a42cab159dda582cd53#comment-1384900. The raw benchmark numbers I got are also in that gist.

tatsuhiro-t (Owner) commented Feb 2, 2015

Compiling with clang++ -stdlib=libc++ will boost the performance of nghttpd.
In my measurements, a binary compiled with clang++ and libc++ is noticeably faster than one built with libstdc++.

kazuho commented Feb 2, 2015

Thank you for the response. Using clang and libc++ showed a noticeable performance gain. The table below shows the results on cc3.8xlarge running Ubuntu 14.04 LTS.

server                   6 bytes   4K bytes
H2O (gcc)                257,952    139,603
nghttpd (gcc)            131,707     70,934
nghttpd (clang,libc++)   164,522     84,229

The compiler versions used were:

$ gcc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ clang --version
Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
Target: x86_64-pc-linux-gnu
Thread model: posix
$ dpkg -s libc++1 | grep Version
Version: 1.0~svn199600-1

Great benchmarks, thanks! I'm investigating the performance degradation of trusterd with TLS. I also ran the cleartext benchmarks using the same configuration, so let me share the results.

  • Ubuntu 14.04 64bit
  • Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz, 4 cores assigned by VMware
  • 8GB memory
  • gcc
  • Cleartext with flow control

h2load -c 500 -m 100 -n 2000000

server         6 bytes   4K bytes
H2O            211,533    110,244
nghttpd        147,588     72,473
tiny-nghttpd   185,982     83,898
trusterd       198,482     90,145
$ gcc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ uname -a
Linux ubuntu140464 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63657
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63657
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
tatsuhiro-t (Owner) commented Feb 3, 2015

Thanks. Seeing the three sets of results, I'm fairly confident that a C++ program performs very differently depending on the platform and standard library; a pure C program, on the other hand, is fairly stable.
Executables built with g++ are quite slow right now, so don't use it to compile these C++ servers. Just use clang++ (3.5 is the latest stable) and you will say, "Oh, what an easy optimization!" You get 20K+ more rps for free.
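For reference, one typical way to point an autotools build such as nghttp2's at clang and libc++ (the version suffixes and parallelism are illustrative and depend on your distribution's packages):

```shell
# Select clang as the compiler and libc++ as the C++ standard library
export CC=clang-3.5
export CXX=clang++-3.5
export CXXFLAGS="-stdlib=libc++"
./configure
make -j"$(nproc)"
```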
