@vkz
Last active May 17, 2023 18:22
Troubleshooting Jetty (mis)performance

Calibrating and troubleshooting your web-server (Jetty) performance

Before git.ht got released I thought it'd be good to get an estimate of how much load it could handle. Requests per second (rps) is the obvious first metric that comes to mind, but then I caught myself thinking that I honestly had no idea what it was supposed to be. My brain wasn't calibrated to even make a guess. I did make a guess and, as always, it turned out way off. What do you expect it to be? 1K? 5K? 50K? 100K? 1 million?

What follows isn't a comprehensive blog post but rather a hoot - a gist of my findings and observations at the time - the kind of impromptu publication git.ht was created for. Don't expect any deep insights or thorough analysis.

To give some context: this was a pretty vanilla Clojure Ring-with-Jetty project. Nothing fancy. Well, almost. It did rely on my homegrown web cough framework cough, which does CGI-style dispatch, since I don't buy into all of this modern-day routing mumbo jumbo (people often forget what the dynamic in their favorite dynamic language stands for). So, this should be of some interest to the Clojure, GNU Guix, Java and perf crowds (I now wonder how tiny the intersection is).

With the benefit of hindsight, and though the notes below don't mention it: if you aren't using JMX and JFR to monitor and profile your Clojure project, you may want to spend some time learning the tools of the boring Java enterprise world. I hate to break it to you, kids, but Java has all the cool toys.
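For the curious, a minimal sketch of kicking off a JFR recording - the `app.jar` name and `<pid>` are placeholders for your own uberjar and server process, and exact flags vary by JDK version:

```shell
# Record 60 seconds of JFR data from JVM startup (JDK 11+;
# older JDK 8 builds may need -XX:+UnlockCommercialFeatures first).
# app.jar stands in for your uberjar.
java -XX:StartFlightRecording=duration=60s,filename=recording.jfr -jar app.jar

# Or attach to an already running JVM; <pid> is your server's process id.
jcmd <pid> JFR.start duration=60s filename=recording.jfr
```

Open the resulting `recording.jfr` in Mission Control and you get allocation, lock and I/O profiles for free.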

Setting the stage: problem with requests per second

I was getting circa 2000-5000 requests per second stress-testing with wrk and ab, e.g. with the following commands:

wrk --latency -t 100 -c 1000 -d30s http://192.168.1.51:6060/
ab -n 30000 -c 400 http://192.168.1.51:6060/

NOTE these two will show vastly different requests-per-second rates. I suspect they issue requests and measure differently. E.g. when wrk shows circa 30K, ab will show around 7K.

Clojure people seem to prefer wrk2 for testing, if only because it works with the -H "Connection: close" header, while wrk breaks.

Links:

Solution

Discussed in Clojure#performance Slack channel.

After profiling, we pinned it down to massive logging. Specifically, my logback config had the root logger set to log everything. Changing one line, <root level='all'> to <root level='ERROR'>, brought requests per second from a mere 5K to over 30K on my 4-CPU Guix laptop. To keep logging for the namespaces we care about, set their respective levels, e.g. to DEBUG:

<root level='ERROR'></root>
<logger level='DEBUG' name='fullmeta'>
  <appender-ref ref='STDOUT'/>
</logger>
<logger level='DEBUG' name='twgt'>
  <appender-ref ref='STDOUT'/>
</logger>

To see just how much stuff is being logged (and thrown away when root has no appender), add an appender and marvel at the amount of crap Jetty alone logs. It is insane that we were getting even 5K.

Notes from the discussion

Perf comparison of wide range of known web apps

Comments from Rupert re general perf considerations:

A couple of suggestions:

Use htop to see if you are maxing out your CPU.
- Maxed-out userspace CPU usage = your code/libraries are the bottleneck
- Maxed-out kernelspace CPU usage = the network adapter / operating system is the bottleneck
- CPU not maxed out = the bottleneck is elsewhere; you should be at 100% CPU usage.

Use bmon to see if you are maxing out your network connection.
If you are on Mac or Windows, setting up the TCP connection can be a bottleneck too.

Also make sure you have set the number of threads for your Jetty server - e.g. 64 threads on 4 cores may be about right.

Ben profiled with clj-async-profiler, but could've used JFR, VisualVM, Mission Control or whatever. He just looked at the flamegraph and noticed logging was flooding everything.

Other problems and considerations

Jetty threads

I think out of the box ring-jetty-adapter sets max threads to 50, so it may be worth increasing.
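Bumping it is a one-liner on the options map. A sketch, assuming a Ring app - the handler body and thread count here are made up, but :max-threads is the ring-jetty-adapter option that defaults to 50:

```clojure
(require '[ring.adapter.jetty :refer [run-jetty]])

;; Hypothetical handler -- replace with your app
(defn handler [_request]
  {:status 200 :headers {"Content-Type" "text/plain"} :body "ok"})

;; :max-threads defaults to 50 in ring-jetty-adapter; following
;; Rupert's rule of thumb, ~64 threads on a 4-core box
(run-jetty handler {:port 6060 :join? false :max-threads 64})
```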

SYN flooding

When I tried matching the command Justine used to show off her redbean2 server - wrk --latency -t 10000 -c 10000 - I got a bunch of timeouts instead of replies, and a peek into /var/log/messages revealed a warning about a possible SYN flooding attack. Looks like the Linux kernel defaults to mitigating such attacks.
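If you want to see the kernel knobs involved, something like this should work on most Linux boxes (the suggested 8192 values are arbitrary examples, not recommendations):

```shell
# Inspect the settings involved in SYN flood mitigation
cat /proc/sys/net/ipv4/tcp_syncookies      # 1 = fall back to SYN cookies under pressure
cat /proc/sys/net/ipv4/tcp_max_syn_backlog # depth of the half-open (SYN_RECV) queue
cat /proc/sys/net/core/somaxconn           # cap on the listen(2) accept backlog

# As root you could raise the backlogs (lost on reboot unless
# persisted, e.g. in sysctl.conf):
# sysctl -w net.ipv4.tcp_max_syn_backlog=8192
# sysctl -w net.core.somaxconn=8192
```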

Max connections

Prompted partly by this link and post re getting to 600K open concurrent connections with http-kit.

Part of it is how many open file descriptors your application is allowed to have. All limits can be seen with ulimit -a for the current user shell, or specifically for open files aka nofile with ulimit -n. I may want to increase this to something silly like 400K.

The nofile limit can be tweaked at the system level (I guess per user) and at the process level. E.g. Shepherd's own make-forkexec-constructor lets you specify #:resource-limits and in fact bumps nofile to 4K from the default per-user 1024, at least judging by the limits inside the running process: cat /proc/PID/limits. At the system level, I think these are meant to be set in base services - pam-limits-service - though I'm not sure.
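To recap the checking side of that in one place - /proc is the ground truth for what a process actually got:

```shell
# Soft and hard open-file limits for the current shell
ulimit -Sn
ulimit -Hn

# Ground truth for any running process ($$ = this shell's PID;
# substitute your server's PID)
grep 'open files' /proc/$$/limits
```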

More generally these are called Linux pam_limits. Links:

  • pam_limits
limits.conf and what limits are available to be set - i.e. what I can supply to e.g. a Shepherd service or pam-limits-service

Also, related: I wonder if I should figure out a way for the server to automatically drop the connection after it sends the Atom XML feed to clients. Misconfigured RSS clients may keep connections alive, and that could eat into my server process's file descriptor quota. Something to consider.
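A crude way to do that in Ring would be middleware that stamps the feed response - a sketch, where the middleware name and the "/feed" path are made up:

```clojure
;; Sketch: force the HTTP connection closed after serving the Atom feed.
;; wrap-close-feed-connection and the "/feed" path are hypothetical.
(defn wrap-close-feed-connection [handler]
  (fn [request]
    (let [response (handler request)]
      (if (= "/feed" (:uri request))
        ;; "Connection: close" tells Jetty to tear the connection down
        ;; once the response is written, freeing the file descriptor
        (assoc-in response [:headers "Connection"] "close")
        response))))
```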

Coda

Did you know I'm running a (mostly) Clojure SWAT team in London at fullmeta.co.uk? We are always on the lookout for interesting contracts. We have two exceptional engineers available as I type these words. Now that git.ht is up, which you should go ahead and check out right now, I'll be going back to CTOing and contracting, which makes me available for hire, too. Get in touch directly or find us on LinkedIn. We aren't cheap, but we're very good. We can also perform competently in modern Java, ES6/TS, Go, C#, etc.

Any comments?

As always this hoot is nothing more than a GitHub gist. Please, use its respective comments section if you have something to say. Till next time.
