ab -n 5000 -c 100 http://10.86.36.223:8080/
Nginx service, using Lua to return "hello"
Requests per second: 3325.63 [#/sec] (mean)
Time per request: 30.069 [ms] (mean)
Time per request: 0.301 [ms] (mean, across all concurrent requests)
thread_1000_benchmark.py
sar -P ALL 1
09:22:07 PM CPU %user %nice %system %iowait %steal %idle
09:22:08 PM 0 71.93 0.00 1.75 0.00 17.54 8.77
09:22:09 PM 0 67.27 0.00 1.82 0.00 21.82 9.09
09:22:10 PM 0 65.31 0.00 2.04 0.00 22.45 10.20
09:22:11 PM 0 71.43 0.00 1.79 0.00 12.50 14.29
09:22:15 PM 0 85.29 0.00 1.47 0.00 2.94 10.29
09:22:23 PM 0 85.25 0.00 1.64 0.00 4.92 8.20
took 24.35 seconds
410.75 requests x second
average requests time 0.0024 s
It can be observed that only one core is being utilized, while the others remain idle at 0%. The user and steal percentages are relatively high, while the system usage is minimal. It is mostly CPU consumption, even though network waiting is not significant.
09:31:51 PM CPU %user %nice %system %iowait %steal %idle
09:31:52 PM 0 28.26 0.00 4.35 0.00 4.35 63.04
09:31:53 PM 0 13.75 0.00 2.50 0.00 7.50 76.25
09:31:53 PM 1 7.14 0.00 1.19 0.00 7.14 84.52
09:31:53 PM 2 6.82 0.00 1.14 0.00 4.55 87.50
09:31:55 PM 0 11.54 0.00 0.00 0.00 8.97 79.49
09:31:55 PM 1 7.32 0.00 0.00 0.00 8.54 84.15
09:31:55 PM 2 4.55 0.00 0.00 0.00 6.82 88.64
09:31:55 PM 3 4.35 0.00 1.09 0.00 4.35 90.22
09:31:56 PM 0 11.11 0.00 2.47 0.00 8.64 77.78
09:31:56 PM 1 12.79 0.00 1.16 0.00 5.81 80.23
09:31:56 PM 2 6.59 0.00 2.20 0.00 5.49 85.71
09:31:56 PM 3 5.49 0.00 0.00 0.00 6.59 87.91
09:32:02 PM 0 12.79 0.00 1.16 1.16 19.77 65.12
09:32:02 PM 1 8.79 0.00 2.20 0.00 18.68 70.33
09:32:02 PM 2 3.06 0.00 0.00 0.00 13.27 83.67
09:32:02 PM 3 11.34 0.00 1.03 1.03 16.49 70.10
09:32:09 PM 0 7.41 0.00 1.23 0.00 16.05 75.31
09:32:09 PM 1 9.20 0.00 1.15 0.00 10.34 79.31
09:32:09 PM 2 3.37 0.00 0.00 0.00 5.62 91.01
09:32:09 PM 3 4.44 0.00 1.11 0.00 5.56 88.89
09:32:22 PM 0 8.54 0.00 3.66 0.00 12.20 75.61
09:32:22 PM 1 7.23 0.00 2.41 0.00 7.23 83.13
09:32:22 PM 2 4.21 0.00 3.16 0.00 4.21 88.42
09:32:22 PM 3 8.05 0.00 2.30 0.00 6.90 82.76
took 30.50 seconds
327.91 requests x second
average requests time 0.0030 s
Overall performance has worsened, with other cores also showing CPU usage, but none reaching full capacity. System and iowait percentages are higher than before.
asyncio_1000_benckmark.py
09:49:33 PM CPU %user %nice %system %iowait %steal %idle
09:49:27 PM 0 77.05 0.00 3.28 0.00 11.48 8.20
09:49:28 PM 0 74.60 0.00 3.17 0.00 14.29 7.94
09:49:30 PM 0 81.48 0.00 5.56 0.00 3.70 9.26
09:49:33 PM 0 83.33 0.00 3.03 0.00 7.58 6.06
09:49:34 PM 0 78.95 0.00 3.51 0.00 7.02 10.53
took 14.25 seconds
702.00 requests x second
average requests time 0.0014 s
|python|async-workers| 14.25 s | 702.00 | 0.0014 s| 0.0014 s| (units in seconds here)
This is significantly faster than multithreading and, like before, only a single CPU is being used, with other CPUs idle at 0%. The system usage is higher than with multithreading.
09:55:58 PM CPU %user %nice %system %iowait %steal %idle
09:55:59 PM 2 73.00 0.00 4.00 0.00 1.00 22.00
09:56:01 PM 2 97.94 0.00 2.06 0.00 0.00 0.00
09:56:03 PM 2 97.96 0.00 2.04 0.00 0.00 0.00
09:56:06 PM 3 97.98 0.00 2.02 0.00 0.00 0.00
09:56:07 PM 3 98.00 0.00 2.00 0.00 0.00 0.00
09:56:10 PM 3 97.98 0.00 2.02 0.00 0.00 0.00
took 12.18 seconds
820.80 requests x second
average requests time 0.0012 s
|python|async-workers| 12.18 s | 820.80 | 1.2183 s| 0.0012 s| (units in seconds here)
This result is the fastest of all, with only a single CPU core being utilized at a high rate (100%), and there was even a switch between CPU cores during the process.
Thank you to Hashnode article for providing a comparative benchmark between asynchronous and threaded approaches in Python.