On Tue Oct 27, 2015, history.state.gov began buckling under load, intermittently issuing 500 errors. Nginx's error log was sprinkled with the following errors:
2015/10/27 21:48:36 [crit] 2475#0: accept4() failed (24: Too many open files)
2015/10/27 21:48:36 [alert] 2475#0: *7163915 socket() failed (24: Too many open files) while connecting to upstream...
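Before changing anything, it helps to confirm which "open files" limit a running process is actually operating under by reading its `limits` file in `/proc`. A minimal sketch — it uses `/proc/self` as a stand-in PID so it runs as any user; in practice you would substitute an nginx worker's PID:

```shell
# Show the soft/hard "Max open files" limit for a process.
# "/proc/self" is a stand-in; replace "self" with an nginx worker PID.
grep 'Max open files' /proc/self/limits

# Count how many file descriptors the process currently holds open,
# to compare against that limit:
ls /proc/self/fd | wc -l
```

If the second number is near the soft limit, the `accept4() failed (24: Too many open files)` errors above are exactly what you'd expect to see.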
An article at http://www.cyberciti.biz/faq/linux-unix-nginx-too-many-open-files/ provided directions that mostly worked. Below are the steps we followed. The steps that diverged from the article's directions are marked with an *.
- * Instead of using `ulimit` on the nginx account, use `ps aux | grep nginx` to locate nginx's process IDs. Then query each process's file handle limits using `cat /proc/pid/limits`, where `pid` is a process ID retrieved from `ps aux`. (`sudo` may be necessary for the `cat` command, depending on your system.)
- Add `fs.file-max = 70000` to /etc/sysctl.conf.
- Add `nginx soft nofile 10000` and `nginx hard nofile 30000` to /etc/security/limits.conf.
- Add `worker_rlimit_nofile 30000;` to /etc/nginx/nginx.conf.
- * While the directions suggested that `nginx -s reload` was enough to get nginx to recognize the new settings, not all of nginx's processes received the new setting. Upon closer inspection of `/proc/pid/limits` (see the first step above), the first worker process still had the original S1024/H4096 limit on file handles. Even `nginx -s quit` didn't shut nginx down. The solution was to kill nginx with `kill pid`. After restarting nginx, all of the nginx-user-owned processes had the new file limit of S10000/H30000 handles.
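Pulled together, the steps above amount to three config fragments plus a hard restart. A sketch, using the same values as the steps; the `sysctl -p` call (which applies /etc/sysctl.conf without a reboot) and the `$master_pid` variable are assumptions not spelled out in the steps themselves:

```shell
# /etc/sysctl.conf — raise the system-wide file handle ceiling:
#   fs.file-max = 70000
# apply it without a reboot:
sudo sysctl -p

# /etc/security/limits.conf — per-user soft/hard limits for the nginx account:
#   nginx soft nofile 10000
#   nginx hard nofile 30000

# /etc/nginx/nginx.conf — let worker processes raise their own limit:
#   worker_rlimit_nofile 30000;

# In our case a reload was not enough: kill the master outright and restart.
# ($master_pid is whatever `ps aux | grep nginx` reports for the master process.)
sudo kill "$master_pid"
sudo nginx
```

These commands require root and a running nginx, so treat the block as an annotated checklist rather than a script to paste in wholesale.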