@cphillips83
Forked from joewiz/post-mortem.md
Last active March 7, 2020 03:09
Recovery from nginx "Too many open files" error on Amazon AWS Linux

https://www.blackmoreops.com/2014/09/25/find-number-of-unique-ips-active-connections-to-web-server/

On Tue Oct 27, 2015, history.state.gov began buckling under load, intermittently issuing 500 errors. Nginx's error log was sprinkled with the following errors:

    2015/10/27 21:48:36 [crit] 2475#0: accept4() failed (24: Too many open files)
    2015/10/27 21:48:36 [alert] 2475#0: *7163915 socket() failed (24: Too many open files) while connecting to upstream...
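
To gauge how often the limit is being hit, the matching log lines can be counted. This is a minimal sketch, not part of the original recovery: the function name count_emfile_errors is made up for illustration, and the default path /var/log/nginx/error.log is an assumption, since the location of the error log is not stated here.

```shell
# Count the error-log lines that report file descriptor exhaustion (errno 24).
# $1 is the path to the nginx error log; the default path is an assumption.
count_emfile_errors() {
    log_file="${1:-/var/log/nginx/error.log}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'Too many open files' "$log_file"
}
```

Calling count_emfile_errors /path/to/error.log prints a single number. Running it before and after a change gives a rough check on whether the errors have stopped.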

An article at http://www.cyberciti.biz/faq/linux-unix-nginx-too-many-open-files/ provided directions that mostly worked. Below are the steps we followed. The steps that diverged from the article's directions are marked with an *.

  1. * Instead of using su to run ulimit on the nginx account, use ps aux | grep nginx to locate nginx's process IDs. Then query each process's file handle limits using cat /proc/pid/limits (where pid is a process ID retrieved from ps). (Note: sudo may be necessary for the cat command, depending on your system.)
  2. Added fs.file-max = 70000 to /etc/sysctl.conf.
  3. Added nginx soft nofile 10000 and nginx hard nofile 30000 to /etc/security/limits.conf.
  4. Ran sysctl -p.
  5. Added worker_rlimit_nofile 30000; to /etc/nginx/nginx.conf.
  6. * While the directions suggested that nginx -s reload was enough to get nginx to recognize the new settings, not all of nginx's processes received the new setting. Upon closer inspection of /proc/pid/limits (see #1 above), the first worker process still had the original S1024/H4096 limit on file handles. Even nginx -s quit didn't shut nginx down. The solution was to kill nginx with kill pid. After restarting nginx, all of the processes owned by the nginx user had the new file limit of S10000/H30000 handles.
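
The per-process check described in the steps above can be scripted so that a worker still running with the old limit stands out. This is a sketch under stated assumptions: the function names open_file_limits and report_nginx_limits are hypothetical, and pgrep stands in for the ps aux | grep nginx step. It assumes the "Max open files" row layout that Linux uses in /proc/pid/limits.

```shell
# Print the open-file limits from a limits file in the S<soft>/H<hard>
# notation used in this write-up. $1 is a path such as /proc/<pid>/limits.
open_file_limits() {
    # On the "Max open files" row, field 4 is the soft limit and field 5 the hard limit.
    awk '/^Max open files/ { print "S" $4 "/H" $5 }' "$1"
}

# Print "<pid> S<soft>/H<hard>" for every running process named nginx.
# pgrep replaces ps aux | grep nginx here (an assumption, not the original step);
# reading another user's /proc entry may require sudo.
report_nginx_limits() {
    for pid in $(pgrep -x nginx); do
        printf '%s %s\n' "$pid" "$(open_file_limits "/proc/$pid/limits")"
    done
}
```

Running report_nginx_limits after a reload or restart lists one line per nginx process. Any line still showing S1024/H4096 is a process that did not pick up the new settings.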