@sergey-dryabzhinsky
Last active March 3, 2024 13:37
Most popular speedup sysctl options for Proxmox. Put in /etc/sysctl.d/
###
# Proxmox or other server kernel params cheap tune and secure.
# Try it if you have heavy load on server - network or memory / disk.
# No harm assumed but keep your eyes open.
#
# @updated: 2020-02-06 - more params used, adjust some params values, more comments on params
#
### NETWORK ###
# Time out orphaned connections faster (seconds a closed socket may stay in FIN_WAIT_2 waiting for the final FIN)
net.ipv4.tcp_fin_timeout = 10
# Expire conntrack entries that sit in the FIN_WAIT state after 5 seconds,
# which still leaves time to handle any remaining packets in the network.
# Load the nf_conntrack module if needed
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 5
# Keepalive optimizations
# By default, the keepalive routines wait two hours (7200 s) before sending the first keepalive probe,
# and then resend it every 75 seconds. If 9 consecutive probes get no ACK, the connection is marked as broken.
# The default values are: tcp_keepalive_time = 7200, tcp_keepalive_intvl = 75, tcp_keepalive_probes = 9
# Decrease the default tcp_keepalive_* values as follows:
# Send the first keepalive probe after 10 minutes of idle time
net.ipv4.tcp_keepalive_time = 600
# Interval between keepalive probes (reduced from 75 s to 15 s)
net.ipv4.tcp_keepalive_intvl = 15
# Number of unanswered probes before the connection is dropped (reduced from 9 to 5)
net.ipv4.tcp_keepalive_probes = 5
# Maximum length of the accept queue (listen backlog) of a listening socket
net.core.somaxconn = 256000
# Protection from SYN flood attack.
net.ipv4.tcp_syncookies = 1
# Only retry creating TCP connections twice
# Minimize the time it takes for a connection attempt to fail
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_orphan_retries = 2
# Handle SYN floods and large numbers of valid HTTPS connections
net.ipv4.tcp_max_syn_backlog = 40000
# Increase the length of the network device input queue
net.core.netdev_max_backlog = 50000
# Reaches full speed faster than cubic
# and recovers faster when a connection loses packets (requires the tcp_yeah kernel module)
net.ipv4.tcp_congestion_control = yeah
# http://lwn.net/Articles/616241/
net.core.default_qdisc = fq_codel
# Increase the ephemeral port range
net.ipv4.ip_local_port_range = 10000 60000
# Do not reuse TIME_WAIT sockets for new outgoing connections (broken when combined)
net.ipv4.tcp_tw_reuse = 0
# net.ipv4.tcp_tw_recycle was removed upstream in Linux 4.12 (2017),
# so it is absent since PVE 5.1.
# Comment it out if you use PVE 5.1+
# Marked as PVE 3
net.ipv4.tcp_tw_recycle = 0
# IPv6 is not needed for now
# If you use IPv6, comment out this line
net.ipv6.conf.all.disable_ipv6 = 1
# https://www.serveradminblog.com/2011/02/neighbour-table-overflow-sysctl-conf-tunning/
net.ipv4.neigh.default.gc_thresh1 = 1024
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096
# http://www.opennet.ru/opennews/art.shtml?num=44945
net.ipv4.tcp_challenge_ack_limit = 9999
# Don't slow down the network: keep the congestion window after idle
# https://github.com/ton31337/tools/wiki/tcp_slow_start_after_idle---tcp_no_metrics_save-performance
net.ipv4.tcp_slow_start_after_idle = 0
# Prefer low latency over throughput,
# e.g. for traffic with many small packets (a legacy option with no effect on recent kernels)
net.ipv4.tcp_low_latency = 1
#### PVE ####
# Allow a high number of timewait sockets
net.ipv4.tcp_max_tw_buckets = 2000000
# PVE 3
net.ipv4.tcp_max_tw_buckets_ub = 65000
# Increase Linux autotuning TCP buffer limits
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.optmem_max = 65536
# If your servers talk UDP, also up these limits
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
# Maximum queue length of Unix domain datagram sockets
net.unix.max_dgram_qlen = 1024
# http://vds-admin.ru/unix-linux/oshibki-v-dmesg-vida-nfconntrack-table-full-dropping-packet
# Load the nf_conntrack module if needed
net.netfilter.nf_conntrack_max = 1048576
net.nf_conntrack_max = 1048576
### MEMORY ###
# Swap less, but do not disable swap
vm.swappiness = 2
# Allow applications to allocate more virtual memory
# than the real RAM size (or OpenVZ/LXC limits)
vm.overcommit_memory = 1
# https://major.io/2008/12/03/reducing-inode-and-dentry-caches-to-keep-oom-killer-at-bay/
vm.vfs_cache_pressure = 500
# Times are in centiseconds, i.e. 100 = 1 second
# How often the flusher wakes up to write dirty data (3000 = 30 s)
vm.dirty_writeback_centisecs = 3000
# Write out dirty data once it is older than this (18000 = 180 s)
vm.dirty_expire_centisecs = 18000
##
# Adjust vfs cache
# https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
# Decrease the dirty cache limits to flush to disk sooner
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
#### PVE 3 ####
# Only on Proxmox 3.x with OpenVZ
ubc.dirty_ratio = 20
ubc.dirty_background_ratio = 10
# Isolate page cache for VPS.
ubc.pagecache_isolation = 1
### FileSystem ###
##
# Fix: "Failed to allocate directory watch: Too many open files"
# in Proxmox 5 + LXC
# and in VMs with Bitrix,
# i.e. with a lot of files
fs.inotify.max_user_instances = 16777216
fs.inotify.max_queued_events = 32000
fs.inotify.max_user_watches = 64000
### Security ###
# http://www.opennet.ru/opennews/art.shtml?num=47792
kernel.unprivileged_bpf_disabled=1
# http://www.opennet.ru/opennews/art.shtml?num=49135
# https://community.rti.com/kb/how-can-i-improve-my-throughput-performance-linux
net.ipv4.ipfrag_high_thresh=8388608
net.ipv4.ipfrag_low_thresh=196608
net.ipv6.ip6frag_high_thresh=8388608
net.ipv6.ip6frag_low_thresh=196608
# http://www.opennet.ru/opennews/art.shtml?num=50889
net.ipv4.tcp_sack = 0
net.ipv4.tcp_mtu_probing = 0
# Protect against TIME_WAIT assassination (RFC 1337).
net.ipv4.tcp_rfc1337 = 1
### OTHER ###
### PVE 3 - 6 kernels ###
# https://tweaked.io/guide/kernel/
# Don't migrate processes between CPU cores too often
# kernel <= 5.4 (ie Proxmox 6)
kernel.sched_migration_cost_ns = 5000000
# Kernel >= 2.6.38 (ie Proxmox 4+)
kernel.sched_autogroup_enabled = 0
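
The keepalive comments in the NETWORK section describe a simple sum: the idle time before the first probe plus one interval for each unanswered probe. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the formula is the usual approximation, not something taken from the gist itself.

```python
def keepalive_detection_seconds(time_s: int, intvl_s: int, probes: int) -> int:
    """Approximate seconds before an idle, dead peer is declared broken.

    Hypothetical helper: the first probe is sent after `time_s` seconds of
    idle time, then each of the `probes` unanswered probes costs another
    `intvl_s` seconds of waiting.
    """
    return time_s + intvl_s * probes


# Kernel defaults quoted in the file: 7200 / 75 / 9
default_total = keepalive_detection_seconds(7200, 75, 9)
# Values set in the file: 600 / 15 / 5
tuned_total = keepalive_detection_seconds(600, 15, 5)

print(default_total)  # 7875
print(tuned_total)    # 675
```

With the values from this file a dead idle connection is dropped after roughly 11 minutes instead of more than two hours.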
@mantaalex

Does anyone have a full version for PVE7?

@sourcecodemage

Does anyone have a full version for PVE7?

bump

@sergey-dryabzhinsky
Author

Updated line with net.ipv4.tcp_tw_recycle - it's absent since PVE6 (5.4+).

Never saw an error for line net.ipv4.ipfrag_high_thresh on my PVE7.
@MDE186 are you running some specific kernel version?

@sergey-dryabzhinsky
Author

This option was actually removed from the kernel in 4.12.
So PVE 5.1 is already affected.

@jmaks

jmaks commented Sep 16, 2022

I'm on PVE 7.x. There is a backlog of errors after temporarily applying the sysctl options.

...
sysctl: cannot stat /proc/sys/net/ipv4/tcp_tw_recycle: No such file or directory
...
sysctl: cannot stat /proc/sys/net/ipv4/tcp_max_tw_buckets_ub: No such file or directory
...
sysctl: cannot stat /proc/sys/ubc/dirty_ratio: No such file or directory
sysctl: cannot stat /proc/sys/ubc/dirty_background_ratio: No such file or directory
sysctl: cannot stat /proc/sys/ubc/pagecache_isolation: No such file or directory
...
sysctl: cannot stat /proc/sys/kernel/sched_migration_cost_ns: No such file or directory
...

My PVE version:

# pveversion
pve-manager/7.1-10/6ddebafe (running kernel: 5.13.19-6-pve)

The ipfrag parameters that produced the errors posted earlier by @MDE186 need to be written in the right file syntax:

Couldn't write '262144' to 'net/ipv4/ipfrag_high_thresh': Invalid argument
Couldn't write '262144' to 'net/ipv6/ip6frag_high_thresh': Invalid argument

Some parameters need a space after the name and after the equals sign too:

- net.ipv4.ipfrag_low_thresh=196608
+ net.ipv4.ipfrag_low_thresh  = 196608
- net.ipv6.ip6frag_high_thresh=196608
+ net.ipv6.ip6frag_high_thresh = 196608
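
The "cannot stat /proc/sys/..." messages above mean the running kernel simply does not expose those keys. A minimal Python sketch of checking a sysctl file against the kernel before applying it might look like this. All function names are hypothetical, and the key-to-path mapping is a simplification that assumes no key component itself contains a dot (such as a VLAN interface name). As far as sysctl.conf(5) describes the format, whitespace around the key and value is ignored, so the parser treats `key=value` and `key = value` alike.

```python
import os
import posixpath
from typing import Callable, Iterable, List, Optional, Tuple


def parse_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (key, value) for a setting line, or None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith(("#", ";")):
        return None
    key, sep, value = stripped.partition("=")
    if not sep:
        return None
    return key.strip(), value.strip()


def key_to_path(key: str, root: str = "/proc/sys") -> str:
    """Map a dotted sysctl key to its file under /proc/sys (simplified)."""
    return posixpath.join(root, *key.split("."))


def split_by_support(
    lines: Iterable[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> Tuple[List[str], List[str]]:
    """Split the keys of a sysctl file into (supported, missing) lists."""
    supported: List[str] = []
    missing: List[str] = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        key, _value = parsed
        if exists(key_to_path(key)):
            supported.append(key)
        else:
            missing.append(key)
    return supported, missing
```

On a real host you could pass the lines of the file from /etc/sysctl.d/ to `split_by_support` and comment out whatever ends up in the missing list; the `exists` parameter is there only so the check can be tried without touching /proc.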

@jmaks

jmaks commented Sep 16, 2022

I checked again and found that all the failing parameters from the file are already deprecated in recent kernels.

Only this error is still unclear to me. @sergey-dryabzhinsky, can you clarify it?

sysctl: cannot stat /proc/sys/kernel/sched_migration_cost_ns: No such file or directory

@hunterpl

Please add these variables for PVE7+

net.ipv4.ipfrag_high_thresh=8388608
net.ipv4.ipfrag_low_thresh=196608
net.ipv6.ip6frag_high_thresh=8388608
net.ipv6.ip6frag_low_thresh=196608

Working great. Have a good day, guys.

@sergey-dryabzhinsky
Author

Please add these variables for PVE7+

net.ipv4.ipfrag_high_thresh=8388608
net.ipv4.ipfrag_low_thresh=196608
net.ipv6.ip6frag_high_thresh=8388608
net.ipv6.ip6frag_low_thresh=196608

Working great. Have a good day, guys.

Why is high_thresh so high?

@sergey-dryabzhinsky
Author

It looks like the parameter was renamed or removed after the 5.4 kernel.
So it should be commented out for PVE7.

I checked again and found that all the failing parameters from the file are already deprecated in recent kernels.

Only this error is still unclear to me. @sergey-dryabzhinsky, can you clarify it?

sysctl: cannot stat /proc/sys/kernel/sched_migration_cost_ns: No such file or directory

@cdorabiatto

Hello, thanks for the adjustments above.

But I have doubts about these flags. Could you please explain them?

vm.vfs_cache_pressure = 500
vm.dirty_writeback_centisecs = 3000
vm.dirty_expire_centisecs = 18000

Could they cause slowness in the system?

@sergey-dryabzhinsky
Author

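vm.vfs_cache_pressure controls how readily the kernel reclaims memory used for caching dentries and inodes, relative to the page cache. The default is 100; values above it make the kernel prefer to reclaim those caches.
vm.dirty_expire_centisecs controls, in hundredths of a second, how long dirty data stays in memory before it becomes eligible to be written to disk.
vm.dirty_writeback_centisecs controls, in hundredths of a second, how often the background flusher wakes up to find out which data has expired.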

More here: https://www.baeldung.com/linux/file-system-caching
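
Since both dirty_* timers are given in centiseconds, the values from the file are easier to judge once converted. A small Python sketch of the conversion follows; the helper name is hypothetical and the numbers are the ones set in the gist.

```python
def centisecs_to_seconds(value: int) -> float:
    """Convert a *_centisecs sysctl value (1/100 s units) to seconds."""
    return value / 100


# Values set in the MEMORY section of the file
settings = {
    "vm.dirty_writeback_centisecs": 3000,
    "vm.dirty_expire_centisecs": 18000,
}

for name, raw in settings.items():
    print(f"{name} = {raw} -> {centisecs_to_seconds(raw):g} s")
# vm.dirty_writeback_centisecs = 3000 -> 30 s
# vm.dirty_expire_centisecs = 18000 -> 180 s
```

So the flusher wakes up every 30 seconds and writes out data that has been dirty for at least 3 minutes, which means more unwritten data is at risk on a power loss than with shorter timers.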
