@voluntas
Forked from techgaun/sysctl.conf
Created October 14, 2017 13:07
Sysctl configuration for high performance
### KERNEL TUNING ###
# Increase the maximum number of file handles and the inode cache
fs.file-max = 2097152
# Do less swapping
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
# Sets the time before the kernel considers migrating a process to another core
kernel.sched_migration_cost_ns = 5000000
# Group tasks automatically by TTY session (set to 0 to disable)
#kernel.sched_autogroup_enabled = 0
### GENERAL NETWORK SECURITY OPTIONS ###
# Number of SYN+ACK retransmissions for a passive TCP connection
net.ipv4.tcp_synack_retries = 2
# Allowed local port range
net.ipv4.ip_local_port_range = 2000 65535
# Protect against TCP time-wait assassination (RFC 1337)
net.ipv4.tcp_rfc1337 = 1
# Enable SYN cookies
net.ipv4.tcp_syncookies = 1
# Decrease the default value of tcp_fin_timeout
net.ipv4.tcp_fin_timeout = 15
# Decrease the default keepalive time for idle connections
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
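# With the three values above, a dead idle peer is detected after roughly 300s + 5 * 15s = 375s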
### TUNING NETWORK PERFORMANCE ###
# Default Socket Receive Buffer
net.core.rmem_default = 31457280
# Maximum Socket Receive Buffer
net.core.rmem_max = 33554432
# Default Socket Send Buffer
net.core.wmem_default = 31457280
# Maximum Socket Send Buffer
net.core.wmem_max = 33554432
# Increase the limit on queued incoming connections (listen backlog)
net.core.somaxconn = 65535
# Increase the network device receive backlog (packets queued when they arrive faster than the kernel can process them)
net.core.netdev_max_backlog = 65536
# Increase the maximum amount of option memory buffers
net.core.optmem_max = 25165824
# Increase the maximum total buffer-space allocatable
# This is measured in units of pages (4096 bytes)
net.ipv4.tcp_mem = 786432 1048576 26777216
net.ipv4.udp_mem = 65536 131072 262144
# Increase the read-buffer space allocatable
net.ipv4.tcp_rmem = 8192 87380 33554432
net.ipv4.udp_rmem_min = 16384
# Increase the write-buffer-space allocatable
net.ipv4.tcp_wmem = 8192 65536 33554432
net.ipv4.udp_wmem_min = 16384
# Increase the tcp-time-wait buckets pool size to prevent simple DOS attacks
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
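For anyone experimenting with a subset of these values, here is a minimal sketch of how such settings are typically applied and verified (the drop-in file name 99-tuning.conf is only an example):

  # Install as a drop-in instead of editing /etc/sysctl.conf directly
  sudo cp sysctl.conf /etc/sysctl.d/99-tuning.conf
  # Reload every sysctl configuration file, including the new drop-in
  sudo sysctl --system
  # Spot-check a single value
  sysctl net.core.somaxconn
  # Or change one value temporarily (not persisted across reboots)
  sudo sysctl -w net.ipv4.tcp_fin_timeout=15
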
@DBezemer

To everyone directly copying this config file: the option net.ipv4.tcp_tw_recycle = 1 will very likely cause major issues and strange network behavior in virtualized environments and behind load balancers or firewalls. The observed behavior is random stalls of up to 15 seconds, and in a packet capture the packets will simply "disappear".

Refer to https://www.speedguide.net/articles/linux-tweaking-121 to see whether this setting is recommended or needed for you, as there is no clear performance benefit to using it.
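A quick way to check whether the running kernel even still has this knob (it was removed in Linux 4.12), sketched below:

  # Prints the current value on older kernels; prints nothing on 4.12+ where the knob no longer exists
  sysctl -a 2>/dev/null | grep tcp_tw_recycle
  uname -r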

@kkm000

kkm000 commented Jun 25, 2019

net.core.wmem_default = 31457280

Ugh. Try that with a fast Ethernet card, and see what happens on the receiving end... I think it will kill the receiver's ring buffer at 1 Gbps already. I would not go over 128K on the sending side of a fast link, not at 10G for sure.

net.core.somaxconn = 65535

So, the machine handles 64K clients, and is buffering up to 32 MB for each on its output. That's 2 TB of write buffers. That's a lot of RAM and cores to support, and a very, very large stack of 50Gbps cards. What is the point of handling this huge workload in a monstrous machine? Probably doable with a non-trivial PCIe topology, and throw in OmniPath interconnects, but it still seems like a couple of 42U racks of smaller servers could do the same job with much better redundancy at a lower cost, given that even 50GbE cards are uncommon and expensive, and you do not build non-uniform PCIe topologies in your garage. Without an explanation of why, this looks... well, like quite an unusual configuration.
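For reference, the worst-case arithmetic behind that estimate can be checked in a shell:

  # 65535 connections, each allowed to grow its send buffer to the 32 MiB wmem_max ceiling
  echo $((65535 * 33554432))    # 2198989701120 bytes, roughly 2 TiB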

@seandex

seandex commented Sep 30, 2019

To everyone directly copying this config file: the option net.ipv4.tcp_tw_recycle = 1 will very likely cause major issues and strange network behavior in virtualized environments and behind load balancers or firewalls. The observed behavior is random stalls of up to 15 seconds, and in a packet capture the packets will simply "disappear".

Refer to https://www.speedguide.net/articles/linux-tweaking-121 to see whether this setting is recommended or needed for you, as there is no clear performance benefit to using it.

tw_recycle can be useful in virtualized environments if used in combination with a program that is designed for connection reuse.

@seandex

seandex commented Sep 30, 2019

comments here are helpless but confusing people.

@minhtanle

Why don't you provide the specs of the system you used this configuration on? That would make it more useful.

@brandt

brandt commented Oct 13, 2019

comments here are helpless but confusing people.

The fact is, tweaking kernel parameters has nuanced effects on your system. Everything has a cost-benefit tradeoff.

Copy-pasting someone else's sysctl settings without understanding the implications will often make one's performance worse -- even if it worked well for the person who posted the config. There isn't a good one-size-fits-all "performance" config.

All that said, there are a few settings, like net.ipv4.tcp_tw_recycle, where there is good general guidance: It's almost never a good idea to set net.ipv4.tcp_tw_recycle=1.

If you're behind a NAT device, or clients use per-connection timestamp randomization (the default in Linux 4.10+ and a few other OSes), you're likely to have problems. The net.ipv4.tcp_tw_recycle option was removed in Linux 4.12.

Unfortunately, a lot of the kernel parameters are not well documented. So fully understanding the implications can be a non-trivial task. If you don't have time for research and experimentation, using tuned is your next best bet.
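If you do go the tuned route, here is a minimal sketch of listing and applying a profile; package and profile names vary by distribution and tuned version:

  sudo apt install tuned                           # or: dnf install tuned
  sudo systemctl enable --now tuned                # start the tuning daemon
  tuned-adm list                                   # show the available profiles
  sudo tuned-adm profile throughput-performance    # apply a profile suited to the workload
  tuned-adm active                                 # confirm which profile is in effect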

@kkm000

kkm000 commented Oct 14, 2019

@brandt, TIL about tuned, thanks!

@kkm000

kkm000 commented Oct 14, 2019

@seandex,

comments here are helpless but confusing people.

You mean unhelpful. I think that, taken all together, the comments send a pretty reasonable message: just do not copy this configuration; it does not make sense. Sapienti sat.

@rbrockway

Never change kernel parameters on a production system without understanding the consequences and having tested them first.

@pratheekhegde

Better not to blindly copy-paste this config. It's more advisable to override these settings only when you run into a related error.

@chrcoluk

chrcoluk commented Aug 20, 2020

net.ipv4.tcp_rfc1337 = 1

Setting this to 1 actually complies with RFC 1337 and does "not" provide the protection; it needs to be set to 0. The info is in the source code.

net.ipv4.tcp_tw_recycle = 1 is dangerous and will break connectivity for NAT users; this sysctl has now been removed in newer kernels.

Also the default rmem and wmem are set way too high, a recipe for resource exhaustion.

@JanZerebecki

net.ipv4.tcp_rfc1337 = 1

Setting this to 1 actually complies with RFC 1337 and does "not" provide the protection; it needs to be set to 0. The info is in the source code.

No. By my reading of the source code, tcp_rfc1337 = 1 does protect against time-wait assassinations. See https://serverfault.com/questions/787624/why-isnt-net-ipv4-tcp-rfc1337-enabled-by-default#comment1380898_789212

@denizaydin

Hi, I have a couple of questions regarding the buffer tuning.

  • Why are there different values for the TCP and UDP buffers? net.ipv4.tcp_mem = 786432 1048576 26777216 vs.
    net.ipv4.udp_mem = 65536 131072 262144 (see the conversion sketch after this list)

  • net.ipv4.tcp_wmem = 8192 65536 33554432 seems to be double the value of most recommendations. How did you end up with those values? What MTU value are you considering?

  • Did you consider any offloading capability of the NIC, such as segmentation offloading, etc.?
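On the units point raised in the first question: tcp_mem and udp_mem are expressed in pages, while tcp_rmem/tcp_wmem are in bytes, which is easy to miss when comparing the numbers. A quick conversion sketch:

  getconf PAGESIZE           # page size in bytes, usually 4096
  echo $((786432 * 4096))    # 3221225472 bytes = 3 GiB for the tcp_mem "low" threshold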

@terrancewong

There's a typo here: 26777216 should be 16777216. 16777216 is 1<<24, or exactly 16 MiB.
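Easy to confirm in a shell:

  echo $((1 << 24))    # 16777216, i.e. exactly 16 MiB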

@scroot

scroot commented Feb 6, 2023

Better not to blindly copy-paste this config.

@terrancewong

Absolutely!
If you search the internet for 26777216, you'll find this typo all over, almost always in a sysctl.conf. I wonder where the original source is.
