@joewiz
Last active September 3, 2023 11:57
Recovery from nginx "Too many open files" error on Amazon AWS Linux

On Tue Oct 27, 2015, history.state.gov began buckling under load, intermittently issuing 500 errors. Nginx's error log was sprinkled with the following errors:

2015/10/27 21:48:36 [crit] 2475#0: accept4() failed (24: Too many open files)

2015/10/27 21:48:36 [alert] 2475#0: *7163915 socket() failed (24: Too many open files) while connecting to upstream...

An article at http://www.cyberciti.biz/faq/linux-unix-nginx-too-many-open-files/ provided directions that mostly worked. Below are the steps we followed. The steps that diverged from the article's directions are marked with an *.

  1. * Instead of using su to run ulimit on the nginx account, use ps aux | grep nginx to locate nginx's process IDs, then query each process's file handle limits with cat /proc/pid/limits (where pid is a process ID from ps). Note: sudo may be necessary for the cat command, depending on your system. (See the first sketch after this list.)
  2. Added fs.file-max = 70000 to /etc/sysctl.conf
  3. Added nginx soft nofile 10000 and nginx hard nofile 30000 to /etc/security/limits.conf
  4. Ran sysctl -p
  5. Added worker_rlimit_nofile 30000; to /etc/nginx/nginx.conf.
  6. * While the directions suggested that nginx -s reload was enough to get nginx to recognize the new settings, not all of nginx's processes received them. Upon closer inspection of /proc/pid/limits (see step 1 above), the first worker process still had the original S1024/H4096 (soft/hard) limit on file handles. Even nginx -s quit didn't shut nginx down. The solution was to kill nginx with kill pid and restart it; afterwards, all of the nginx-user-owned processes had the new S10000/H30000 file handle limit. (See the second sketch after this list.)
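
For step 1, a minimal sketch of the inspection commands. pgrep is used here as a shortcut for the ps aux | grep nginx approach; the PIDs and limits it prints will differ on your system.

```sh
# List nginx's master and worker processes.
ps aux | grep '[n]ginx'

# Inspect each process's open-file limits; sudo may be needed to read
# /proc/<pid>/limits, depending on your system.
for pid in $(pgrep nginx); do
  echo "== PID $pid =="
  sudo grep 'Max open files' "/proc/$pid/limits"
done
```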
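And a sketch of steps 2 through 6 as a single sequence, using the same values as above. Appending with tee is just one way to make the edits (editing the files by hand is equivalent), and $master_pid is a placeholder for the master PID found in step 1.

```sh
# Step 2: raise the system-wide file handle ceiling.
echo 'fs.file-max = 70000' | sudo tee -a /etc/sysctl.conf

# Step 3: per-user soft/hard limits for the nginx account.
echo 'nginx soft nofile 10000' | sudo tee -a /etc/security/limits.conf
echo 'nginx hard nofile 30000' | sudo tee -a /etc/security/limits.conf

# Step 4: apply the sysctl change.
sudo sysctl -p

# Step 5: add to the main context of /etc/nginx/nginx.conf:
#   worker_rlimit_nofile 30000;

# Step 6: if `nginx -s reload` / `nginx -s quit` leaves workers on the old
# limit, kill the master process and start nginx again.
sudo kill "$master_pid"   # $master_pid: the master PID found in step 1
sudo nginx                # or: sudo service nginx start

# Re-check /proc/<pid>/limits to confirm the new S10000/H30000 values.
```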
@ryancheung

Thanks!

@mskian

mskian commented Dec 19, 2018

Thanks a lot, it works 💯
Awesome...!

@berenddeboer

If you use systemd, then instead of step 2 you need to create an override and put LimitNOFILE=10000 in the [Service] section.
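
A minimal sketch of that override, assuming the unit is named nginx.service; the drop-in file path is whatever systemctl edit creates for you.

```sh
# Open (or create) a drop-in override for the nginx unit.
sudo systemctl edit nginx.service
# In the editor that opens, add:
#   [Service]
#   LimitNOFILE=10000

# Restart so the workers pick up the new limit, then verify it.
sudo systemctl restart nginx
systemctl show nginx -p LimitNOFILE
```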

@ldrrp

ldrrp commented Jul 23, 2019

This solved my issue. In the process I noticed it only hit this limit because of a default 300s timeout, which was backing up connections; that's why it ran out of file handles. I lowered the timeout to 5s and also fixed a separate issue on the upstream server it was sending to, which was causing the overall backup.

@jmg

jmg commented Sep 9, 2020

Thanks for making this guide! Just one thing:

In point 5:
Added worker_rlimit_nofile 30000; to /etc/nginx/nginx.conf.

Please notice that:
The worker_rlimit_nofile 30000; directive goes in the main context of the config file, outside the server {...} and events {...} blocks. Reference: http://nginx.org/en/docs/ngx_core_module.html#worker_rlimit_nofile

I was struggling with this because I had added it in the server {...} context; that didn't fail, but it also wasn't increasing the max open files.
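
For illustration, a sketch of the placement @jmg describes. The surrounding directives are just a typical skeleton; the point is that worker_rlimit_nofile sits at the top level, not inside events {} or http {}/server {}.

```nginx
user  nginx;
worker_processes  auto;
worker_rlimit_nofile 30000;   # main context: top level of nginx.conf

events {
    worker_connections  1024;
}

http {
    # server { ... } blocks live here; the directive does not go inside them
}
```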

@jianhaiqing

awesome

@claytonrothschild

Worked for us. Thanks a lot. Also, @jmg has an important insight above.

@marensas

For me, service nginx restart succeeded; no need to manually kill processes by PID.
