If I run 20 haywire processes using tcmalloc, haywire reaches 6.3 million requests/second.
killall hello_world; for i in `seq 20`; do LD_PRELOAD="./lib/gperftools/.libs/libtcmalloc.so" ./build/hello_world --balancer reuseport & done
perf top
7.42% hello_world [.] http_request_buffer_pin
6.94% hello_world [.] http_request_buffer_reassign_pin
6.63% hello_world [.] http_parser_execute
6.23% libtcmalloc.so.4.3.0 [.] tc_deletearray_nothrow
4.94% hello_world [.] http_request_buffer_locate
3.60% libtcmalloc.so.4.3.0 [.] tc_malloc
2.77% libtcmalloc.so.4.3.0 [.] tc_calloc
2.33% [kernel] [k] native_queued_spin_lock_slowpath
2.31% [kernel] [k] tcp_sendmsg
1.76% [kernel] [k] _raw_spin_lock_bh
1.41% hello_world [.] http_request_on_message_complete
1.34% hello_world [.] set_header
1.22% [kernel] [k] copy_user_enhanced_fast_string
1.20% hello_world [.] create_response_buffer
Expanding http_parser_execute
6.84% hello_world [.] http_request_buffer_pin
6.59% hello_world [.] http_parser_execute
6.21% hello_world [.] http_request_buffer_reassign_pin
6.08% libtcmalloc_minimal.so.4.3.0 [.] tc_deletearray_nothrow
4.47% hello_world [.] http_request_buffer_locate
3.62% libtcmalloc_minimal.so.4.3.0 [.] tc_malloc
2.26% [kernel] [k] tcp_sendmsg
1.82% [kernel] [k] _raw_spin_lock_bh
1.44% hello_world [.] http_request_on_message_complete
1.21% hello_world [.] create_response_buffer
1.08% hello_world [.] hw_route_compare_method
0.93% [kernel] [k] __srcu_read_lock
0.76% libtcmalloc_minimal.so.4.3.0 [.] tc_realloc
0.73% hello_world [.] free_http_request
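To repeat the multi-process run with a different process count, the launch loop from this run can be wrapped in a small function. This is only a sketch: the library path, binary path and `--balancer reuseport` flag are copied from the command above, while `launch_processes` and the `DRY_RUN` switch are hypothetical names added here so the command can be inspected without starting anything.

```shell
#!/bin/sh
# Paths and flags copied from the command used for the 20-process run.
TCMALLOC="./lib/gperftools/.libs/libtcmalloc.so"
SERVER="./build/hello_world"

# launch_processes N
# Starts N hello_world processes in the background with tcmalloc preloaded.
# With DRY_RUN=1 (an assumption of this sketch, not a haywire feature) it
# only prints the command line it would run, once per process.
launch_processes() {
    count="$1"
    for i in $(seq "$count"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "LD_PRELOAD=$TCMALLOC $SERVER --balancer reuseport"
        else
            LD_PRELOAD="$TCMALLOC" "$SERVER" --balancer reuseport &
        fi
    done
}
```

Typical use would be `killall hello_world; launch_processes 20`, followed by `perf top` as before.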
If I run 1 process with 20 threads instead, the threads are competing on something somewhere and only reach 3.2 million requests/second.
LD_PRELOAD="./lib/gperftools/.libs/libtcmalloc.so" ./build/hello_world --threads 20 --balancer reuseport
perf top
10.68% hello_world [.] http_parser_execute
6.61% libtcmalloc.so.4.3.0 [.] tc_deletearray_nothrow
5.05% hello_world [.] http_request_buffer_pin
4.70% hello_world [.] http_request_buffer_reassign_pin
4.46% libtcmalloc.so.4.3.0 [.] tc_malloc
3.82% libc-2.21.so [.] 0x00000000001452a0
3.26% hello_world [.] http_request_buffer_locate
2.78% libtcmalloc.so.4.3.0 [.] tc_calloc
2.41% hello_world [.] uv_write2
2.30% libc-2.21.so [.] 0x000000000014d86c
2.13% libc-2.21.so [.] 0x000000000014d6b0
1.77% [kernel] [k] native_queued_spin_lock_slowpath
1.58% libc-2.21.so [.] strlen
1.44% hello_world [.] http_request_on_message_complete
Expanding http_parser_execute
6.65% libtcmalloc_minimal.so.4.3.0 [.] tc_deletearray_nothrow
5.28% hello_world [.] http_request_buffer_pin
4.60% hello_world [.] http_request_buffer_reassign_pin
4.47% libtcmalloc_minimal.so.4.3.0 [.] tc_malloc
3.27% libc-2.21.so [.] 0x00000000001452a0
3.26% hello_world [.] http_request_buffer_locate
2.71% libtcmalloc_minimal.so.4.3.0 [.] tc_calloc
2.12% libc-2.21.so [.] 0x000000000014d86c
2.07% libc-2.21.so [.] 0x000000000014d6b0
1.95% hello_world [.] uv_write2
1.53% hello_world [.] http_request_on_message_complete
1.44% [kernel] [k] tcp_sendmsg
1.39% hello_world [.] get_cached_request
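To put the two runs side by side, here is a rough calculation using only the figures quoted above: the two throughput numbers and a couple of the first `perf top` percentages from each run. The function names are illustrative. Note that each `perf top` percentage is a share of that run's own samples, so a shift between runs hints at where time moved, not at absolute cost.

```shell
#!/bin/sh
# All figures below are copied from the two runs above.

# Requests/second handled by each worker (process or thread).
per_worker() {
    awk -v total="$1" -v workers="$2" 'BEGIN { printf "%d\n", total / workers }'
}

# Throughput of one run as a percentage of another run.
percent_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# Change in perf top share for a symbol: threaded run minus multi-process run.
share_delta() {
    awk -v procs="$1" -v threads="$2" 'BEGIN { printf "%.2f\n", threads - procs }'
}

echo "per process: $(per_worker 6300000 20) req/s"
echo "per thread:  $(per_worker 3200000 20) req/s"
echo "threads reach $(percent_of 3200000 6300000)% of the multi-process rate"
echo "http_parser_execute share change:     $(share_delta 6.63 10.68) points"
echo "http_request_buffer_pin share change: $(share_delta 7.42 5.05) points"
```

The threaded profile also shows entries that are absent from the multi-process listing (`uv_write2`, `strlen`, and several unresolved `libc-2.21.so` addresses), which may be a reasonable place to start looking for the contention.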