@tangoabcdelta
Last active January 11, 2021 21:34
nginx 101 - making nginx easy for you; easy-peasy-lemon-squeezy

Permissions

  • nginx starts with root permissions
  • that's because any process that needs to bind to a port below 1024 requires elevated privileges
  • The TCP/IP port numbers below 1024 are special in that normal users are not allowed to run servers on them. This is a security feature: if you connect to a service on one of these ports, you can be fairly sure you have the real thing, and not a fake which some hacker has put up for you.

Child Processes

  • but then the trouble arrives when nginx wants to limit its privileges (because an unrestricted process can do a lot of damage; this is why deno doesn't even grant file system permissions when it starts)
  • this is why nginx spawns several child processes
  • the number of child processes is determined by the integer set against worker_processes, which in my case (default) is set to worker_processes auto; and auto is what you use when you do not know, or do not want to hard-code, the number of CPU cores on the machine
  • Count the number of CPU cores on thy machine: grep processor /proc/cpuinfo | wc -l (or simply nproc)
  • the child processes are created with the username provided against the following field: user www-data;
  • so, after starting nginx (e.g. by doing a sudo systemctl start nginx), you can check the list of spawned child processes by executing ps aux | grep nginx in the terminal

Process ID

  • The following file contains the process id for nginx so that you don't have to do a lot of manual mish-mashing
  • pid /run/nginx.pid;
  • so, just cat /run/nginx.pid; whenever you feel like it
  • To see the master process: ps aux | grep nginx

Directives

nginx has two types of directives

  1. context directives e.g. events, http
  2. simple directives e.g. worker_connections

nginx conf explained

context directives
events context directive
worker_connections
  • tells you the maximum number of simultaneous connections that can be opened by a worker process.
  • you set it by providing an integer value against the field
  • to get the total number of permitted connections, multiply this number by the number of worker processes
  • this number includes all connections, i.e. not only connections with clients but others too, e.g. connections with proxied servers
  • the actual number of simultaneous connections cannot exceed the system-imposed limit on the maximum number of open files
    • the system-imposed limits are set at the OS level and have to be updated separately
    • or by changing worker_rlimit_nofile
  • note - it is common practice to run 1 worker process per core

The worker_connections directive tells our worker processes how many people can simultaneously be served by Nginx. The default value is 768; however, considering that every browser usually opens up at least 2 connections per server, this number can be halved. This is why we need to adjust our worker connections to its full potential. We can check the system's open-file limit by issuing the ulimit command: ulimit -n

  sudo nano /etc/nginx/nginx.conf
worker_processes 1;
worker_connections 1024;
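A back-of-the-envelope capacity check for the two directives above; the numbers here are hypothetical, plug in your own:

```shell
# Total simultaneous connections = worker_processes * worker_connections.
# Both values below are illustrative, not recommendations.
worker_processes=4        # e.g. one worker per CPU core
worker_connections=1024   # from the events block
echo $((worker_processes * worker_connections))   # maximum simultaneous connections
```

Remember this total still cannot exceed the open-file limit reported by `ulimit -n`.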

The http Block

The http block contains directives for handling web traffic. These directives are often referred to as universal because they are passed on to all website configurations NGINX serves.

What's inside the http context directive
sendfile on;
  • this is crucial if you want to send any large files
  • it speeds up large static file transfers by letting the kernel copy file data straight to the socket, skipping the userspace buffer
  • if you set sendfile to on, it is common to also set tcp_nopush to on, i.e. tcp_nopush on;, so that nginx sends response headers and the start of the file in full packets
  • tcp_nodelay on; -> disables Nagle's algorithm, so small writes on keep-alive connections are sent immediately instead of being buffered
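A minimal sketch of this static-file tuning trio as it would appear in the http context; whether each directive actually helps depends on your workload:

```nginx
sendfile   on;   # kernel copies file data straight to the socket
tcp_nopush on;   # with sendfile: send headers + file start in full packets
tcp_nodelay on;  # don't buffer small writes on keep-alive connections
```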
keepalive_timeout 65;
  • self explanatory
  • also, potential region to save some CPU
  • The client_body_timeout and client_header_timeout directives are responsible for the time a server will wait for a client body or client header to be sent after request. If neither a body or header is sent, the server will issue a 408 error or Request time out.
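The timeout directives above can be sketched together like this; the values are illustrative, not recommendations:

```nginx
keepalive_timeout     65;   # how long an idle keep-alive connection stays open
client_body_timeout   12;   # wait this long for the request body, else 408
client_header_timeout 12;   # wait this long for the request headers, else 408
send_timeout          10;   # max gap between two successive writes to the client
```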
types_hash_max_size 2048;
  • sets the maximum size of the hash tables nginx builds for MIME types; you rarely need to touch it

Logging paths

  access_log /var/log/nginx/access.log;
	error_log /var/log/nginx/error.log;
gzip_types
  • sample: gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
  • this lists the mime-types which nginx has been told to compress
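A minimal gzip setup built around gzip_types might look like this; the level and minimum length are illustrative:

```nginx
gzip on;
gzip_comp_level 5;     # 1-9; higher = smaller payloads, more CPU
gzip_min_length 256;   # don't bother compressing tiny responses
gzip_types text/plain text/css application/json application/javascript;
```

Note that text/html is always compressed when gzip is on, so it never needs to appear in gzip_types.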

How to save some cpu

  • if you're already performing the gzip during build time, then compressing again at runtime is detrimental to CPU time
  • setting the gzip_comp_level too high will cause the server to waste cpu cycles
  • more on it in the compression gist

Virtual Host Paths and Sites-enabled paths

	##
	# Virtual Host Configs
	##

	include /etc/nginx/conf.d/*.conf;
	include /etc/nginx/sites-enabled/*;
Command to count the number of CPU cores on thy machine

grep processor /proc/cpuinfo | wc -l

To find the default nginx.conf location
/etc/nginx
/etc/nginx/nginx.conf
File locations in macOS
  • nginx.conf to /usr/local/etc/nginx/
  • default.conf and default-ssl.conf to /usr/local/etc/nginx/sites-available
  • homebrew.mxcl.nginx.plist to /Library/LaunchDaemons/
Command to start/stop nginx (1)
sudo systemctl start nginx
sudo service nginx start / stop / restart
sudo systemctl status nginx
Command to start/stop nginx (2)
$ cd /etc/nginx
$ sudo nginx -h

nginx version: nginx/1.14.0 (Ubuntu)
Usage: nginx [-?hvVtTq] [-s signal] [-c filename] [-p prefix] [-g directives]

Options:
  -?,-h         : this help
  -v            : show version and exit
  -V            : show version and configure options then exit
  -t            : test configuration and exit
  -T            : test configuration, dump it and exit
  -q            : suppress non-error messages during configuration testing
  -s signal     : send signal to a master process: stop, quit, reopen, reload
  -p prefix     : set prefix path (default: /usr/share/nginx/)
  -c filename   : set configuration file (default: /etc/nginx/nginx.conf)
  -g directives : set global directives out of configuration file
stop
  $ sudo nginx -s stop
  
  # if the process is not running, and you still try to stop it
  # then you'll get an error
  nginx: [error] open() "/run/nginx.pid" failed (2: No such file or directory)
start
  # to start
  $ sudo nginx
What does include /etc/nginx/mime.types; mean
What does try_files mean

try_files $uri $uri/ =404; means: when a request arrives for a particular resource, try the file first; if the file cannot be found, look for a directory of that name, i.e. $uri/; if that still doesn't exist, return a 404 response
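The behaviour described above, written out as a location block:

```nginx
location / {
    # 1. try the exact file ($uri), 2. then a directory of that name ($uri/),
    # 3. otherwise return a 404 status instead of falling through
    try_files $uri $uri/ =404;
}
```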

How to configure nginx.conf for rendering a SPA a.k.a. single page application (web)

server {
    server_name example.com;
    ...

    location ~ / {
        root /var/www/example.com/static;
        try_files $uri /index.html;
    }
}

**Explanation:**

  • try_files $uri /index.html; means force all paths to load either themselves (js files and other static files) or go through index.html.
  • root /var/www/example.com/static; overrides the root for this location block
  • and treats /var/www/example.com/static as the root folder
  • You will find this setting commonly in the nginx servers which host a React or an Angular app.

What does server_name in server directive mean?

  • server_name is included in each server block in an nginx.conf file

  • /etc/nginx/sites-enabled/default file is no exception

  • the field basically means that the nginx server will listen for the said server_name

  • this will happen in conjunction with the port settings

  • typically server_name should contain your domain name, or alias for your domain name

  • e.g. example.com

server {
  ...
  server_name example.com;
  • the directive above means that if a request reaches the nginx server,

  • and the Host header in that request turns out to be example.com

  • then nginx will utilize the configuration present in this server block

    server {
        listen       80;
        server_name  example.org  www.example.org;
        ...
    }
    
    server {
        listen       80;
        server_name  *.example.org;
        ...
    }
    
    server {
        listen       80;
        server_name  mail.*;
        ...
    }
    
    server {
        listen       80;
        server_name  ~^(?<user>.+)\.example\.net$;
        ...
    }
    

What if the name matches more than once?

When searching for a virtual server by name, if name matches more than one of the specified variants, e.g. both wildcard name and regular expression match, then the first matching variant will be chosen, in the following order of precedence:

  • exact name
  • longest wildcard name starting with an asterisk
    • e.g. server_name *.example.org;
  • longest wildcard name ending with an asterisk
    • e.g. server_name mail.*;
  • first matching regular expression (in order of appearance in the configuration file)
    • e.g. server_name ~^www\d+\.example\.net$;
    • or server_name ~^www\..+\.example\.org$;
server {
    listen ${PORT};
    server_name ${SERVER_NAME};
    root /var/www/;
    index index.html;

    # This sets up nginx to replace parts of our JS code with environment
    # variables injected into the container. This allows us to run the same
    # code in different environments.
    # The sub filter is used in CI to set up all the environment variables
    # configured in manifold.
    location ~* ^.+\.js$ {
        LOCATION_SUB_FILTER
        sub_filter_once off;
        sub_filter_types *;
    }

    # Health check endpoint. This will be used by kubernetes to determine if the
    # container is ready/alive.
    location = /_healthz {
        return 200 'OK';
    }

    # Force all paths to load either itself (js files) or go through index.html.
    location / {
        try_files $uri /index.html;
    }
}

WTH is ps

  • ps stands for process status
  • in aux: a lists processes of all users, u selects user-oriented output (including user names), and x includes processes without a controlling terminal

RTFM

The man pages are here: https://man7.org/linux/man-pages/man1/ps.1.html

  • ps displays information about a selection of the active processes.
  • i.e. report a snapshot of the current processes.
  • If you want a repetitive update of the selection and the displayed information, use top instead.
  • Note that ps -aux is distinct from ps aux

Examples

  • To see every process on the system using standard syntax: ps -ef
  • Select all processes: -A is identical to -e
  • To see every process on the system using BSD syntax: ps ax or ps aux
  • To see every process running as root (real & effective ID) in user format: ps -U root -u root u

Save some CPU by picking-up a better nginx compression gzip level

  • if you're already performing the gzip during build time, then compressing again at runtime is detrimental to CPU time
  • setting the gzip_comp_level too high will cause the server to waste cpu cycles
  • https://serverfault.com/q/253074
    • the gzip compression level tells you how compressed the data is, on a scale from 1-9
    • where 9 is the most compressed
    • more compression => more work to compress or decompress
    • the trade-off's impact will be felt when the traffic goes up

HTTP headers

  • gzip-compressed HTTP content is indicated by the Content-Encoding: gzip response header

    • if this header is dropped somewhere (or stripped off at some proxy server)
    • then the client might not know that it has to decompress the response
  • on the client's side, Accept-Encoding: is used (more: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html)

Accept-Encoding: compress, gzip
Accept-Encoding:
Accept-Encoding: *
Accept-Encoding: compress;q=0.5, gzip;q=1.0
Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
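To see how the q-values in those headers rank encodings, here is a small sketch that picks the client's most-preferred encoding (a missing q defaults to 1.0); the header value is made up:

```shell
# Pick the highest-q encoding from an Accept-Encoding header (illustrative).
header='compress;q=0.5, gzip;q=1.0'
echo "$header" \
  | tr ',' '\n' \
  | awk -F';q=' '{ q = ($2 == "") ? 1 : $2; if (q > best) { best = q; enc = $1 } }
                 END { gsub(/^ +/, "", enc); print enc }'
```

With the header above this prints gzip, because 1.0 beats 0.5.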

Buffers

  • if the buffer sizes are too low, then Nginx will have to write to a temporary file
  • this causes disk i/o (reads and writes) more frequently than required
  • client_body_buffer_size: This handles the client buffer size, meaning any POST actions sent to Nginx. POST actions are typically form submissions.
  • client_header_buffer_size: Similar to the previous directive, only instead it handles the client header size. For all intents and purposes, 1K is usually a decent size for this directive.
  • client_max_body_size: The maximum allowed size for a client request. If the maximum size is exceeded, then Nginx will spit out a 413 error or Request Entity Too Large.
  • large_client_header_buffers: The maximum number and size of buffers for large client headers.
Sample
client_body_buffer_size 10K;
client_header_buffer_size 1k;
client_max_body_size 8m;
large_client_header_buffers 2 1k;
How to find out which command or process is listening on a port?
using sudo
  • merely listing your own processes does not require sudo
  • but to run any process at a port below 1024 (which are reserved ports), the server process has to be started with elevated privileges
  • most likely, that means as root (sudo)
  • so, to see which process is listening at, say, port :80 (http) or :443 (https), we will need sudo, because those processes belong to root
using lsof (a)
  • for some users netstat does not have a -p (process) option

  • e.g. on macOS, netstat's -p flag selects a protocol instead

  • they can use lsof instead

  • Advantage: it displays both the Command and PID:

    lsof -i:<port number>
    lsof -i:8080
    lsof -i:3000
    sudo lsof -i:80
    sudo lsof -i:22
    

example:

  sudo lsof -i:80
  COMMAND     PID     USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
  nginx     11172     root    6u  IPv4 1680568      0t0  TCP *:http (LISTEN)
  nginx     11172     root    7u  IPv6 1680569      0t0  TCP *:http (LISTEN)
  nginx     12236 www-data    6u  IPv4 1680568      0t0  TCP *:http (LISTEN)
  nginx     12236 www-data    7u  IPv6 1680569      0t0  TCP *:http (LISTEN)
  nginx     12237 www-data    6u  IPv4 1680568      0t0  TCP *:http (LISTEN)
  nginx     12237 www-data    7u  IPv6 1680569      0t0  TCP *:http (LISTEN)
using lsof (b)
  sudo lsof -iTCP:<port number> -n -P

  sudo lsof -iTCP:80 -n -P
  COMMAND     PID     USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
  nginx     11172     root    6u  IPv4 1680568      0t0  TCP *:80 (LISTEN)
  nginx     11172     root    7u  IPv6 1680569      0t0  TCP *:80 (LISTEN)
  nginx     12236 www-data    6u  IPv4 1680568      0t0  TCP *:80 (LISTEN)
  nginx     12236 www-data    7u  IPv6 1680569      0t0  TCP *:80 (LISTEN)
  nginx     12237 www-data    6u  IPv4 1680568      0t0  TCP *:80 (LISTEN)
  nginx     12237 www-data    7u  IPv6 1680569      0t0  TCP *:80 (LISTEN)
  nginx     12238 www-data    6u  IPv4 1680568      0t0  TCP *:80 (LISTEN)
  nginx     12238 www-data    7u  IPv6 1680569      0t0  TCP *:80 (LISTEN)
  nginx     12239 www-data    6u  IPv4 1680568      0t0  TCP *:80 (LISTEN)
  nginx     12239 www-data    7u  IPv6 1680569      0t0  TCP *:80 (LISTEN)
  plugin_ho 23443     root   17u  IPv4 1491055      0t0  TCP 192.168.0.102:52790->45.55.41.223:80 (CLOSE_WAIT)
using lsof (c)
  sudo lsof -iTCP:<port number> -sTCP:LISTEN -n -P
using lsof (d)

Another solution:

lsof -t -i :<port> -s <PROTO>:LISTEN

For example:

# lsof -i :22 -s TCP:LISTEN
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
sshd    1392 root    3u  IPv4  19944      0t0  TCP *:ssh (LISTEN)
sshd    1392 root    4u  IPv6  19946      0t0  TCP *:ssh (LISTEN)
# lsof -t -i :22 -s TCP:LISTEN
1392
using lsof (e)

While lsof's -t is the simplest way to get the PID, lsof also has ways to select other fields using the -F option:

$ lsof -F'?'
lsof:	ID    field description
	 a    access: r = read; w = write; u = read/write
	 c    command name
	 d    device character code
	 D    major/minor device number as 0x<hex>
	 f    file descriptor (always selected)
	 G    file flaGs
	 i    inode number
	 k    link count
	 K    task ID (TID)
	 l    lock: r/R = read; w/W = write; u = read/write
	 L    login name
	 m    marker between repeated output
	 n    comment, name, Internet addresses
	 o    file offset as 0t<dec> or 0x<hex>
	 p    process ID (PID)
	 g    process group ID (PGID)
	 P    protocol name
	 r    raw device number as 0x<hex>
	 R    paRent PID
	 s    file size
	 S    stream module and device names
	 t    file type
	 T    TCP/TPI info
	 u    user ID (UID)
	 0    (zero) use NUL field terminator instead of NL

With output like so (note that PID and file descriptors are always printed):

$ sudo lsof -F cg -i :22 -s TCP:LISTEN 
p901
g901
csshd
f3
f4

So if you wanted the process group ID instead of the PID, you could do:

$ sudo lsof -F g -i :22 -s TCP:LISTEN | awk '/^g/{print substr($0, 2)}'
901
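The field-prefixed format is easy to post-process: every output line is a one-letter field id followed by its value. The sample below is the canned output from above, not a live lsof call:

```shell
# Parse lsof -F output: keep lines starting with the field id we want ('g' =
# process group ID) and strip the one-character prefix.
sample='p901
g901
csshd'
echo "$sample" | awk '/^g/ { print substr($0, 2) }'
```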
How to find out which command or process is listening on a port?
using netstat (a)
netstat -tlnp
  • -t TCP sockets, -l listening only, -n numeric addresses, -p show the PID/program name (Linux)
using netstat (c)

prints a comma-delimited list of processes that are listening on a particular port

netstat -anlv | grep -e <port number> | awk '{print $9}' | sort -nu | tr "\n" "," | sed s/,$//

for mac OS X users

netstat -anlv | grep -e Address -e <port number>

it prints the process id along with a lot of other info because of the verbose -v flag

server {
  listen 80;
  listen [::]:80;

  server_name prod.myexample.com;
  root /var/www/PROD/myexample.com;

  location / {
    try_files $uri $uri/ /index.html =404;
  }
}


server {
  listen 80;
  listen [::]:80;

  server_name dev.myexample.com;
  root /var/www/DEV/myexample.com;

  location / {
    try_files $uri $uri/ /index.html =404;
  }
}
hosts configuration
  • to reach dev server, developers have to edit their /etc/hosts file.

  • of course, with sudo

  • add the ip 12.34.56.789 as the mapping for dev.example.com

  • similarly do it for prod.example.com

    127.0.1.1 dev.myexample.com
    127.0.1.1 prod.myexample.com
    127.0.1.1 myexample.com
    127.0.0.1 localhost
    
  • i've done the mapping with the ip of my localhost because i'm testing locally

  • without mapping these domains to localhost, i would not be able to route to the different apps running on port :80

Rule server_name dev.example.com;
  • the routing to different folders based on the domain name is made possible by the server_name directive
  • the domain mentioned here is matched against an incoming request
  • if they match, then the content is served from the root directory mentioned in the block
Rule root /var/www/PROD/example.com;
  • this will be the root directory or root folder location for the server block we talked about previously
  • we will host our production code in the PROD folder
  • and development code in the DEV folder
  • the root directive informs nginx to serve content from different folders based on the server_name
  • whichever server block the name matches, the root directory mentioned within that block is used for serving content
Rule try_files $uri /index.html;
  • Force all paths to load either itself e.g. all static files, .js or .css etc. or go through index.html.
  • This part is KEY to making NGINX render a SPA site.