### KERNEL TUNING ###
# Increase size of file handles and inode cache
fs.file-max = 2097152
# Do less swapping
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
# Sets the time before the kernel considers migrating a process to another core
kernel.sched_migration_cost_ns = 5000000
# Group tasks by TTY
#kernel.sched_autogroup_enabled = 0
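If you want to experiment with the kernel-tuning values above rather than copy them wholesale, a cautious workflow (a sketch, assuming a Linux host with root access; the drop-in filename `99-tuning.conf` is just an example) is to change one value at a time and verify it:

```shell
# Check the current value before changing anything
sysctl -n vm.swappiness

# Apply one setting at a time so regressions are easy to bisect
sysctl -w vm.swappiness=10

# Persist by writing to a dedicated drop-in file, then reload all sysctl files
echo 'vm.swappiness = 10' > /etc/sysctl.d/99-tuning.conf
sysctl --system
```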
### GENERAL NETWORK SECURITY OPTIONS ###
# Number of SYN+ACK retransmissions for a passive TCP connection
net.ipv4.tcp_synack_retries = 2
# Allowed local port range
net.ipv4.ip_local_port_range = 2000 65535
# Protect against TCP TIME-WAIT assassination (RFC 1337)
net.ipv4.tcp_rfc1337 = 1
# Enable SYN cookies
net.ipv4.tcp_syncookies = 1
# Decrease the default value of tcp_fin_timeout
net.ipv4.tcp_fin_timeout = 15
# Decrease the default keepalive time and probe interval for idle connections
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
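The three keepalive values above combine into one effective timeout: the first probe fires after `tcp_keepalive_time` seconds of idleness, then `tcp_keepalive_probes` probes are sent `tcp_keepalive_intvl` seconds apart before the peer is declared dead. A small sketch of the arithmetic (the function name is ours, not a kernel API):

```python
# Worst-case time to detect an unresponsive peer from the keepalive sysctls
def keepalive_detection_time(time_s: int, probes: int, intvl_s: int) -> int:
    return time_s + probes * intvl_s

# Values from this config: 300 + 5 * 15 = 375 seconds (~6 minutes)
print(keepalive_detection_time(300, 5, 15))

# Kernel defaults for comparison: 7200 + 9 * 75 = 7875 seconds (~2.2 hours)
print(keepalive_detection_time(7200, 9, 75))
```

So this config detects dead peers roughly 20x faster than the defaults, at the cost of a little extra probe traffic on idle connections.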
### TUNING NETWORK PERFORMANCE ###
# Default Socket Receive Buffer
net.core.rmem_default = 31457280
# Maximum Socket Receive Buffer
net.core.rmem_max = 33554432
# Default Socket Send Buffer
net.core.wmem_default = 31457280
# Maximum Socket Send Buffer
net.core.wmem_max = 33554432
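The `rmem_max`/`wmem_max` values above cap what an application may request per socket via `SO_RCVBUF`/`SO_SNDBUF`. A quick way to see that interaction from userspace (a sketch; on Linux the kernel doubles the requested value to account for bookkeeping overhead, and clamps it to the sysctl maximum):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Ask for a 64 KiB receive buffer; the kernel clamps the request to
# net.core.rmem_max and (on Linux) doubles it for internal overhead.
requested = 64 * 1024
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print(f"requested {requested}, kernel granted {granted}")
s.close()
```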
# Increase number of incoming connections
net.core.somaxconn = 65535
# Increase number of incoming connections backlog
net.core.netdev_max_backlog = 65536
# Increase the maximum amount of option memory buffers
net.core.optmem_max = 25165824
# Increase the maximum total buffer-space allocatable
# This is measured in units of pages (4096 bytes)
net.ipv4.tcp_mem = 786432 1048576 26777216
net.ipv4.udp_mem = 65536 131072 262144
# Increase the read-buffer space allocatable
net.ipv4.tcp_rmem = 8192 87380 33554432
net.ipv4.udp_rmem_min = 16384
# Increase the write-buffer space allocatable
net.ipv4.tcp_wmem = 8192 65536 33554432
net.ipv4.udp_wmem_min = 16384
# Increase the TCP TIME-WAIT buckets pool size to prevent simple DoS attacks
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
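Before raising `tcp_max_tw_buckets` (or touching the risky `tcp_tw_recycle`/`tcp_tw_reuse` knobs), it is worth measuring how many TIME-WAIT sockets the host actually accumulates. A sketch, assuming iproute2's `ss` is installed:

```shell
# Summary counters, including the current TIME-WAIT total
ss -s

# Count TIME-WAIT sockets explicitly
ss -tan state time-wait | wc -l

# Current bucket limit for comparison
sysctl -n net.ipv4.tcp_max_tw_buckets
```

If the observed count is nowhere near the limit, the default is already sufficient for your workload.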
@marneu I agree. Every system is different; depending on your Ethernet link, CPU, and memory, other settings could apply to you.
To everyone directly copying this config file: the option net.ipv4.tcp_tw_recycle = 1 will very likely cause major issues and strange network behavior in virtualized environments / load balancers / firewalls, where the observed behavior will be random stalls of up to 15 seconds, and during a packet capture packets will simply "disappear". Refer to https://www.speedguide.net/articles/linux-tweaking-121 to see whether this setting is recommended/needed for you, as there is no clear performance benefit when using it.
Ugh. Try that with a fast Ethernet card and see what happens on the receiving end... I think it will kill the receiver's ring buffer at 1 Gbps already. I would not go over 128K on the sending side of a fast link, certainly not at 10G.

So the machine handles 64K clients and is buffering up to 32 MB for each on its output: 2 TB of write buffers. That's a lot of RAM and cores to support, and a very, very large stack of 50 Gbps cards. What is the point of handling this huge workload in one monstrous machine? Probably doable with a non-trivial PCIe topology, and throw in OmniPath interconnects, but it still seems like a couple of 42U racks of smaller servers could do the same job with much better redundancy at a lower cost, given that even 50GbE cards are uncommon and expensive, and you do not build non-uniform PCIe topologies in your garage. Without an explanation why, this looks... well, like quite an unusual configuration.
tw_recycle can be useful in virtualized environments in combination with a program designed for connection reuse.
comments here are helpless but confusing people.
Why don't you provide the parameters of the system you used this configuration on, so readers can better judge whether it applies to them?
The fact is, tweaking kernel parameters has nuanced effects on your system. Everything has a cost-benefit tradeoff. Copy-pasting someone else's sysctl settings without understanding the implications will often make your performance worse, even if they worked well for the person who posted the config. There isn't a good one-size-fits-all "performance" config.

All that said, there are a few settings, like net.ipv4.tcp_tw_recycle, that deserve an explicit warning. If you are behind a NAT device, or clients use per-connection timestamp randomization (the default in Linux 4.10+ and a few other OSes), you're likely to have problems; that sysctl has since been removed from newer kernels.

Unfortunately, a lot of the kernel parameters are not well documented, so fully understanding the implications can be a non-trivial task. If you don't have time for research and experimentation, using tuned is a decent option.
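For those who prefer curated profiles over hand-tuning individual sysctls, the `tuned` daemon mentioned in this thread can be driven from the command line (a sketch, assuming the tuned package is installed and the service is running):

```shell
# List the available profiles (e.g. throughput-performance, latency-performance)
tuned-adm list

# Ask tuned which profile it recommends for this hardware
tuned-adm recommend

# Switch profiles and confirm the change took effect
tuned-adm profile throughput-performance
tuned-adm active
```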
@brandt, TIL about tuned, thanks!
You mean unhelpful. I think that, taken all together, the comments send a pretty reasonable message: just do not copy this configuration; it does not make sense. Sapienti sat.
Never change kernel parameters on a production system without understanding the consequences and having tested them first.
Better not to blindly copy-paste this config. It's more advisable to override these settings only when you run into a related error.
net.ipv4.tcp_rfc1337 = 1 actually complies with RFC 1337 and does "not" provide the protection; it needs to be set to 0 (the info is in the source code). net.ipv4.tcp_tw_recycle = 1 is dangerous: it will break connectivity for NAT users, and this sysctl has now been removed in new kernels. Also, the default rmem and wmem here are set way too high, a recipe for resource exhaustion.
It might be helpful to know more about the system, i.e. how much memory it has and what interface speed/bandwidth is available, in order to assess these values.