@kellabyte
Last active December 27, 2015 22:17

Setup

Kestrel currently supports only 1 thread, so I've included Haywire benchmarks in both single-threaded and multi-threaded mode for comparison.

Kestrel 1 thread HTTP pipelining enabled

Running 10s test @ http://192.168.0.101:5000
  8 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    71.50ms   41.78ms 212.21ms   39.50%
    Req/Sec     1.70k   534.07     3.80k    77.39%
  Latency Distribution
     50%  134.62ms
     75%    0.00us
     90%    0.00us
     99%    0.00us
  130865 requests in 10.10s, 9.49MB read
Requests/sec:  12,959.85
Transfer/sec:      0.94MB

57,850,367,529 cpu-cycles                #    2.223 GHz                     [11.27%]
29,835,102,666 instructions              #    0.52  insns per cycle         [10.90%]
 1,968,628,510 cache-references          #   75.664 M/sec                   [11.11%]
    43,559,258 cache-misses              #    2.213 % of all cache refs     [11.01%]
 6,750,312,564 branch-instructions       #  259.448 M/sec                   [ 7.37%]
   254,557,318 branch-misses             #    3.77% of all branches         [ 7.75%]
 6,050,131,263 branch-loads              #  232.537 M/sec                   [ 7.69%]
   257,588,594 branch-load-misses        #    9.900 M/sec                   [ 7.54%]
 8,357,915,195 bus-cycles                #  321.237 M/sec                   [ 7.81%]
58,581,998,494 ref-cycles                # 2251.599 M/sec                   [11.54%]
  27906.022269 cpu-clock (msec)
  26017.948899 task-clock (msec)         #    1.329 CPUs utilized
        58,099 page-faults               #    0.002 M/sec
       225,636 context-switches          #    0.009 M/sec
         9,313 cpu-migrations            #    0.358 K/sec
        58,099 minor-faults              #    0.002 M/sec
             0 major-faults              #    0.000 K/sec
             0 alignment-faults          #    0.000 K/sec
             0 emulation-faults          #    0.000 K/sec
 9,810,599,074 L1-dcache-loads           #  377.070 M/sec                   [11.56%]
   703,387,558 L1-dcache-load-misses     #    7.17% of all L1-dcache hits   [11.32%]
 4,414,404,654 L1-dcache-stores          #  169.668 M/sec                   [ 7.71%]
   398,009,163 L1-dcache-store-misses    #   15.297 M/sec                   [ 7.67%]
    26,690,828 L1-dcache-prefetches      #    1.026 M/sec                   [ 7.44%]
36,495,235,686 L1-icache-loads           # 1402.695 M/sec                   [ 7.38%]
   630,152,694 L1-icache-load-misses     #    1.73% of all L1-icache hits   [ 7.41%]
   508,332,999 LLC-loads                 #   19.538 M/sec                   [ 7.49%]
    29,356,019 LLC-load-misses           #    5.77% of all LL-cache hits    [ 7.45%]
    75,456,562 LLC-stores                #    2.900 M/sec                   [ 7.39%]
     8,798,952 LLC-store-misses          #    0.338 M/sec                   [ 7.42%]
 9,673,200,533 dTLB-loads                #  371.790 M/sec                   [ 7.33%]
    97,687,644 dTLB-load-misses          #    1.01% of all dTLB cache hits  [ 7.55%]
 4,433,978,370 dTLB-stores               #  170.420 M/sec                   [ 7.71%]
    10,521,070 dTLB-store-misses         #    0.404 M/sec                   [ 7.48%]
29,165,019,893 iTLB-loads                # 1120.958 M/sec                   [11.37%]
    55,583,698 iTLB-load-misses          #    0.19% of all iTLB cache hits  [11.27%]

Kestrel 1 thread HTTP pipelining disabled

Running 10s test @ http://192.168.0.101:5000
  8 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    40.49ms    7.88ms 201.25ms   99.08%
    Req/Sec    98.99     19.94   121.00     64.67%
  Latency Distribution
     50%   39.99ms
     75%   40.03ms
     90%   40.08ms
     99%   40.37ms
  7583 requests in 10.10s, 564.83KB read
Requests/sec:    750.83
Transfer/sec:     55.93KB

Haywire 1 thread HTTP pipelining enabled

  8 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.69ms    1.75ms  11.36ms   66.72%
    Req/Sec    52.34k     8.58k  203.64k    75.41%
  Latency Distribution
     50%    2.88ms
     75%    4.26ms
     90%    8.00ms
     99%    0.00us
  4173220 requests in 10.10s, 628.82MB read
Requests/sec: 413,206.67
Transfer/sec:     62.26MB

23,071,887,640 cpu-cycles                #    2.306 GHz                     [11.15%]
29,638,164,253 instructions              #    1.28  insns per cycle         [11.19%]
   520,226,399 cache-references          #   51.999 M/sec                   [11.22%]
     4,238,797 cache-misses              #    0.815 % of all cache refs     [11.21%]
 6,653,712,933 branch-instructions       #  665.065 M/sec                   [ 7.47%]
    64,976,673 branch-misses             #    0.98% of all branches         [ 7.47%]
 6,657,138,038 branch-loads              #  665.407 M/sec                   [ 7.47%]
    65,061,946 branch-load-misses        #    6.503 M/sec                   [ 7.46%]
 3,306,599,008 bus-cycles                #  330.508 M/sec                   [ 7.46%]
23,155,516,626 ref-cycles                # 2314.486 M/sec                   [11.19%]
  10004.679435 cpu-clock (msec)
  10004.606098 task-clock (msec)         #    0.721 CPUs utilized
           352 page-faults               #    0.035 K/sec
           623 context-switches          #    0.062 K/sec
             3 cpu-migrations            #    0.000 K/sec
           352 minor-faults              #    0.035 K/sec
             0 major-faults              #    0.000 K/sec
             0 alignment-faults          #    0.000 K/sec
             0 emulation-faults          #    0.000 K/sec
 8,100,094,960 L1-dcache-loads           #  809.637 M/sec                   [11.18%]
   213,222,575 L1-dcache-load-misses     #    2.63% of all L1-dcache hits   [11.18%]
 4,755,293,717 L1-dcache-stores          #  475.310 M/sec                   [ 7.45%]
   409,717,915 L1-dcache-store-misses    #   40.953 M/sec                   [ 7.45%]
    21,477,472 L1-dcache-prefetches      #    2.147 M/sec                   [ 7.44%]
19,141,109,830 L1-icache-loads           # 1913.230 M/sec                   [ 7.44%]
   150,883,211 L1-icache-load-misses     #    0.79% of all L1-icache hits   [ 7.44%]
   107,276,265 LLC-loads                 #   10.723 M/sec                   [ 7.45%]
     2,938,153 LLC-load-misses           #    2.74% of all LL-cache hits    [ 7.43%]
    95,005,408 LLC-stores                #    9.496 M/sec                   [ 7.37%]
       961,236 LLC-store-misses          #    0.096 M/sec                   [ 7.35%]
 8,044,235,336 dTLB-loads                #  804.053 M/sec                   [ 7.42%]
     5,778,690 dTLB-load-misses          #    0.07% of all dTLB cache hits  [ 7.42%]
 4,719,004,827 dTLB-stores               #  471.683 M/sec                   [ 7.42%]
     2,410,457 dTLB-store-misses         #    0.241 M/sec                   [ 7.41%]
29,687,921,594 iTLB-loads                # 2967.425 M/sec                   [11.11%]
       425,287 iTLB-load-misses          #    0.00% of all iTLB cache hits  [11.11%]
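
The derived columns in the two single-thread perf listings above (insns per cycle, % of all cache refs) are plain ratios of the raw counters. A minimal sketch of that arithmetic follows, using values copied from the listings. The names `COUNTERS` and `derived` are made up for illustration and are not part of perf. Keep in mind that the bracketed percentages in the perf output show how much of the run each counter was actually active for, so the raw counts are scaled estimates.

```python
# Raw counters copied from the Kestrel and Haywire 1-thread pipelined perf listings.
# COUNTERS and derived() are hypothetical names used only for this sketch.
COUNTERS = {
    "Kestrel 1 thread": {
        "cpu-cycles": 57_850_367_529,
        "instructions": 29_835_102_666,
        "cache-references": 1_968_628_510,
        "cache-misses": 43_559_258,
    },
    "Haywire 1 thread": {
        "cpu-cycles": 23_071_887_640,
        "instructions": 29_638_164_253,
        "cache-references": 520_226_399,
        "cache-misses": 4_238_797,
    },
}


def derived(counters):
    """Return (instructions per cycle, cache miss percentage) for one run."""
    ipc = counters["instructions"] / counters["cpu-cycles"]
    miss_pct = 100.0 * counters["cache-misses"] / counters["cache-references"]
    return ipc, miss_pct


for name, counters in COUNTERS.items():
    ipc, miss_pct = derived(counters)
    print(f"{name}: {ipc:.2f} insns per cycle, {miss_pct:.3f}% cache misses")
```

Both servers retire a similar number of instructions here, so the difference in cycles shows up directly in the insns-per-cycle figure.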

Haywire 1 thread HTTP pipelining disabled

Running 10s test @ http://192.168.0.101:8000
  8 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   593.99us  204.83us   3.98ms   75.89%
    Req/Sec     6.69k   805.83     8.38k    65.10%
  Latency Distribution
     50%  504.00us
     75%  575.00us
     90%    0.97ms
     99%    1.06ms
  537918 requests in 10.10s, 81.05MB read
Requests/sec:  53,259.95
Transfer/sec:      8.03MB

Haywire 8 threads HTTP pipelining enabled

Running 10s test @ http://192.168.0.101:8000
  8 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.10ms    0.86ms  12.34ms   53.85%
    Req/Sec   131.59k    39.54k  201.42k    46.08%
  Latency Distribution
     50%    1.74ms
     75%    0.00us
     90%    0.00us
     99%    0.00us
  10515500 requests in 10.10s, 1.55GB read
Requests/sec: 1,041,154.08
Transfer/sec:    156.88MB

93,697,507,434 cpu-cycles                #    2.295 GHz                     [11.27%]
76,313,348,362 instructions              #    0.81  insns per cycle         [11.39%]
 1,551,401,963 cache-references          #   37.997 M/sec                   [11.66%]
    40,124,729 cache-misses              #    2.586 % of all cache refs     [11.45%]
17,194,468,541 branch-instructions       #  421.125 M/sec                   [ 7.33%]
   194,260,504 branch-misses             #    1.13% of all branches         [ 7.33%]
17,216,776,802 branch-loads              #  421.671 M/sec                   [ 7.37%]
   194,296,549 branch-load-misses        #    4.759 M/sec                   [ 7.39%]
13,427,124,193 bus-cycles                #  328.856 M/sec                   [ 7.42%]
94,080,208,871 ref-cycles                # 2304.202 M/sec                   [11.18%]
  40830.468921 cpu-clock (msec)
  40829.850631 task-clock (msec)         #    2.224 CPUs utilized
           716 page-faults               #    0.018 K/sec
        72,790 context-switches          #    0.002 M/sec
         4,908 cpu-migrations            #    0.120 K/sec
           716 minor-faults              #    0.018 K/sec
             0 major-faults              #    0.000 K/sec
             0 alignment-faults          #    0.000 K/sec
             0 emulation-faults          #    0.000 K/sec
24,068,722,219 L1-dcache-loads           #  589.488 M/sec                   [11.17%]
   602,832,986 L1-dcache-load-misses     #    2.50% of all L1-dcache hits   [11.08%]
11,823,025,653 L1-dcache-stores          #  289.568 M/sec                   [ 7.48%]
 1,013,459,845 L1-dcache-store-misses    #   24.822 M/sec                   [ 7.70%]
    51,144,333 L1-dcache-prefetches      #    1.253 M/sec                   [ 7.48%]
79,605,746,640 L1-icache-loads           # 1949.695 M/sec                   [ 7.53%]
   459,230,790 L1-icache-load-misses     #    0.58% of all L1-icache hits   [ 7.67%]
   318,202,721 LLC-loads                 #    7.793 M/sec                   [ 7.60%]
    30,829,377 LLC-load-misses           #    9.69% of all LL-cache hits    [ 7.49%]
   256,592,407 LLC-stores                #    6.284 M/sec                   [ 7.35%]
     7,348,077 LLC-store-misses          #    0.180 M/sec                   [ 7.24%]
24,047,935,746 dTLB-loads                #  588.979 M/sec                   [ 7.34%]
    24,568,991 dTLB-load-misses          #    0.10% of all dTLB cache hits  [ 7.44%]
11,737,829,932 dTLB-stores               #  287.482 M/sec                   [ 7.30%]
    10,916,030 dTLB-store-misses         #    0.267 M/sec                   [ 7.39%]
76,125,182,287 iTLB-loads                # 1864.449 M/sec                   [11.10%]
     1,499,436 iTLB-load-misses          #    0.00% of all iTLB cache hits  [11.18%]

Haywire 8 threads HTTP pipelining disabled

Running 10s test @ http://192.168.0.101:8000
  8 threads and 32 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   286.40us  190.71us   7.67ms   89.60%
    Req/Sec    14.06k     3.56k   22.02k    62.87%
  Latency Distribution
     50%  234.00us
     75%  352.00us
     90%  482.00us
     99%  795.00us
  1130289 requests in 10.10s, 170.31MB read
Requests/sec: 111,912.30
Transfer/sec:     16.86MB
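
To compare the runs side by side, the Requests/sec figures from the wrk summaries above can be put into a small table and divided. This is only a sketch of the arithmetic: the names `RPS`, `ratio`, and `pipelining_gain` are made up for illustration, and the numbers are copied from the output above without re-measuring anything.

```python
# Requests/sec copied from the wrk summaries above.
# Key: (server, server threads, HTTP pipelining enabled).
# RPS, ratio() and pipelining_gain() are hypothetical names for this sketch.
RPS = {
    ("Kestrel", 1, True): 12_959.85,
    ("Kestrel", 1, False): 750.83,
    ("Haywire", 1, True): 413_206.67,
    ("Haywire", 1, False): 53_259.95,
    ("Haywire", 8, True): 1_041_154.08,
    ("Haywire", 8, False): 111_912.30,
}


def ratio(run_a, run_b):
    """How many times more requests/sec run_a served than run_b."""
    return RPS[run_a] / RPS[run_b]


def pipelining_gain(server, threads):
    """Throughput with pipelining divided by throughput without it."""
    return ratio((server, threads, True), (server, threads, False))


# Single-thread Haywire vs Kestrel, with and without pipelining.
for pipelined in (True, False):
    r = ratio(("Haywire", 1, pipelined), ("Kestrel", 1, pipelined))
    mode = "enabled" if pipelined else "disabled"
    print(f"Haywire/Kestrel, 1 thread, pipelining {mode}: {r:.1f}x")

# Effect of pipelining within each server configuration.
for server, threads in (("Kestrel", 1), ("Haywire", 1), ("Haywire", 8)):
    gain = pipelining_gain(server, threads)
    print(f"{server} {threads} thread(s), pipelining gain: {gain:.1f}x")
```

Only the single-thread rows compare the two servers like for like, since Kestrel has no multi-threaded run.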