@bradfa
Last active January 3, 2017 16:34
NTP Server Debian Jessie Setup Notes

I run one of the few NTP servers in India. You can see my server's monitored health in the NTP pool system, and my public profile in the pool system shows all of my servers (currently only one).

I was inspired to set up an NTP server in India because of an article about NTP on LWN.

These are some notes about what I've done.

Done:

  • Digital Ocean $5/month droplet in the BLR1 region running 64-bit Debian Jessie.
  • Set up my DNS server to point the hostname at the IPv4 address.
  • Logged in via the DO web console (after changing the root password) and moved SSH to a different port.
  • Installed some extras: sudo apt-get install ntp ntp-doc apticron vim screen bmon lighttpd
  • Configured ntp to use a few stratum 1 and stratum 2 servers which are reasonably nearby (see the config sketch after this list).
  • Run bmon in a screen session for "real-time" bandwidth tracking.
  • Set up a lighttpd HTTP redirect from (.*).pool.ntp.org to www.pool.ntp.org. Lighttpd has since been removed.
  • Set up the server in the pool DNS system management console.
  • Configured apticron to email me every day when there are updates to be applied. This doesn't seem to actually work yet; I don't get the emails.
  • Enabled unattended upgrades via dpkg-reconfigure -plow unattended-upgrades, which creates an apt config file that defaults to updating package lists every day (a bit redundant with apticron, maybe?) and installing stable updates.
  • Added the server's IPv6 address to the NTP pool. The server has difficulty keeping up with the resulting amount of traffic, though.
  • Set up and enabled fail2ban due to botnets probing sshd in interesting ways (even though I run sshd on a non-standard port).
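
For reference, a minimal sketch of the server selection part of my /etc/ntp.conf; the upstream hostnames below are placeholders, not the actual servers I picked:

# /etc/ntp.conf fragment -- upstream servers (hostnames are placeholders)
driftfile /var/lib/ntp/ntp.drift

# A few stratum 1 and stratum 2 servers that are reasonably close, network-wise
server stratum1-a.example.org iburst
server stratum1-b.example.org iburst
server stratum2-a.example.net iburst
server stratum2-b.example.net iburst

# Debian's default restrictions: serve time, but allow no queries or modification
restrict -4 default kod notrap nomodify nopeer noquery
restrict -6 default kod notrap nomodify nopeer noquery
restrict 127.0.0.1
restrict ::1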

TODO:

  • Set up some long-term monitoring which makes pretty graphs, like munin or mrtg
  • Show pretty monitoring graphs through a web interface
  • Fix ipv6 CPU consumption due to route lookups and other ipv6 function calls.

Notes:

  • After you add a server to the NTP pool system, it'll take a few hours (about 6 on my first try) until the monitoring considers your server's score good enough for it to be added to the pool.
  • The "Net speed" setting in the NTP pool DNS system doesn't seem to be that effective. Setting my desired bandwidth to 3 Mbps resulted in sustained 4-5 Mbps and peak 7 Mbps traffic, while setting it to 10 Mbps resulted in sustained 6-7 Mbps and peak 9 Mbps traffic.
  • Currently I have my "Net speed" set to 25 Mbps and will monitor how that translates into real-world traffic levels.
  • Network traffic for an NTP server will be roughly symmetrical.
  • 9 Mbps of up/down traffic is about 30% CPU load on my $5/month droplet. My droplet says it has a Xeon E5-2650Lv3 1.8 GHz CPU.
  • Currently RAM usage is under 100 MB.
  • Digital Ocean has no way to monitor network transfer amounts per month, even though they list a transfer allowance for each droplet size ($5/month droplets are listed with 1000 GB transfer per month). They do not spell out how they measure transfer, and they do not track it in a way that is visible to the customer in their control panel. I opened a support ticket to ask if I'd be charged for overages, as I'm on track to transfer a couple of TB per month. Their response was that until they show the info in the control panel they won't charge customers for overages, and that they'd be sure to email customers prior to rolling out such a change. Other than causing tons of confusion (google for this topic) by listing the transfer levels for each droplet size, this seems quite reasonable on their part.
  • I see horrible latency to/from my server. Round trip times to anything outside of India are consistently over 150 ms. From the USA it's always over 200 ms (which kind of makes sense due to physics).
  • During the day, India Standard Time, I currently see an average of about 10 kpps. Assuming that a client sends 1 packet every 64 seconds, this equates to about 640k clients. Traffic goes up at peak times in the evening (IST) and goes down from about midnight until 6 am. The traffic levels are very consistent day to day.

bradfa commented Dec 20, 2016

As per http://lists.ntp.org/pipermail/pool/2016-December/007992.html it is a good idea to disable connection tracking for UDP on an NTP server, since leaving it enabled carries a bit of overhead to track every query.
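
A sketch of raw-table rules that accomplish this (one possible form; adjust ports and interfaces to taste):

# Skip conntrack for NTP traffic (UDP port 123) in both directions
iptables  -t raw -A PREROUTING -p udp --dport 123 -j NOTRACK
iptables  -t raw -A OUTPUT     -p udp --sport 123 -j NOTRACK
ip6tables -t raw -A PREROUTING -p udp --dport 123 -j NOTRACK
ip6tables -t raw -A OUTPUT     -p udp --sport 123 -j NOTRACK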

The iptables and ip6tables rules can be made persistent with Debian's iptables-persistent package. During install, just accept the offer to write out the current rules; it creates the correct files in the correct locations, /etc/iptables/rules.v{4,6} (the documentation is a bit lacking, but dpkg does the right thing).
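
For example (as root; re-saving by hand is one way to pick up later rule changes):

apt-get install iptables-persistent
# If the rules change later, write them out again by hand:
iptables-save  > /etc/iptables/rules.v4
ip6tables-save > /etc/iptables/rules.v6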

bradfa commented Dec 21, 2016

So the connection tracking was not the root of my ipv6 high CPU usage problem. I ran sudo perf record -a -F 1000 sleep 5 to capture 5 seconds of system-wide performance data and found the big hitters in the results summary to be:

Samples: 4K of event 'cycles', Event count (approx.): 9392666711
  45.65%             ntpd  [kernel.kallsyms]  [k] fib6_walk_continue
  24.89%             ntpd  [kernel.kallsyms]  [k] fib6_age
   9.63%             ntpd  [kernel.kallsyms]  [k] ip6_neigh_lookup
   3.45%             ntpd  [kernel.kallsyms]  [k] fib6_clean_node
   3.39%             ntpd  [kernel.kallsyms]  [k] irq_entries_start
   1.85%             ntpd  [kernel.kallsyms]  [k] __local_bh_enable_ip
   0.47%             ntpd  [kernel.kallsyms]  [k] inet_getpeer
   0.47%             ntpd  [ip6_tables]       [k] ip6t_do_table
   0.44%             ntpd  [kernel.kallsyms]  [k] fib6_lookup_1
   0.32%             ntpd  [kernel.kallsyms]  [k] peer_avl_rebalance.isra.4

So clearly doing ipv6 route lookups is what's killing the CPU, and this is only with about 2k packets per second, total across ipv4 and ipv6.

I'm going to try mucking with a few sysctls to reduce the size of the route cache and modify how often garbage collection takes place. For now I'm going to try this setup (it's not permanent yet):

root@ntp:/proc/sys/net/ipv6/route# for file in `ls`; do echo $file; cat $file; done
flush
cat: flush: Permission denied
gc_elasticity
9
gc_interval
30
gc_min_interval
0
gc_min_interval_ms
256
gc_thresh
16
gc_timeout
60
max_size
16
min_adv_mss
1220
mtu_expires
10
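
If these values turn out to be worth keeping, one way to persist them would be a drop-in under /etc/sysctl.d, something like this (the file name is just an example):

# /etc/sysctl.d/local-ipv6-route.conf -- values from the experiment above
net.ipv6.route.gc_elasticity = 9
net.ipv6.route.gc_interval = 30
net.ipv6.route.gc_thresh = 16
net.ipv6.route.gc_timeout = 60
net.ipv6.route.max_size = 16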

bradfa commented Dec 21, 2016

A 5 second perf recording when the server is in the ipv4 pool but not in the ipv6 pool, handling about 12k pps, looks very reasonable:

Samples: 3K of event 'cycles', Event count (approx.): 3049303461
  11.86%             ntpd  [kernel.kallsyms]  [k] irq_entries_start
   1.83%          swapper  [kernel.kallsyms]  [k] native_write_msr_safe
   1.77%             ntpd  [kernel.kallsyms]  [k] do_select
   1.50%          swapper  [kernel.kallsyms]  [k] irq_entries_start
   1.34%          swapper  [kernel.kallsyms]  [k] pvclock_clocksource_read
   1.22%             ntpd  [kernel.kallsyms]  [k] __fget_light
   1.17%             ntpd  ntpd               [.] 0x000000000001fa63
   1.12%          swapper  [kernel.kallsyms]  [k] ipt_do_table
   1.10%             ntpd  [kernel.kallsyms]  [k] ipt_do_table
   1.10%          swapper  [kernel.kallsyms]  [k] __switch_to
   1.02%          swapper  [kernel.kallsyms]  [k] enqueue_task_fair
   0.97%          swapper  [kernel.kallsyms]  [k] __udp4_lib_lookup
   0.95%          swapper  [kernel.kallsyms]  [k] virtnet_poll
   0.90%             ntpd  [kernel.kallsyms]  [k] iowrite16
   0.90%             ntpd  [kernel.kallsyms]  [k] sock_poll
   0.88%             ntpd  [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
   0.87%             ntpd  ntpd               [.] 0x0000000000050aea

bradfa commented Dec 21, 2016

My sysctl adjustments don't seem to be a good thing. They might be a bit too extreme: with them in place I'm regularly seeing the monitoring station fail to get a reading from my ntp server, which docks my pool score.

I've now also updated to Linux 4.8, as there may be changes in newer kernels which make the ipv6 route lookups faster; the old kernel was 3.16 from Debian Jessie. I found this presentation with an overview of how much slower ipv6 route lookups are than ipv4, which isn't quite my problem, but is somewhat similar: http://www.netdevconf.org/1.1/proceedings/slides/kubecek-ipv6-route-lookup-performance-scaling.pdf
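
On Jessie the newer kernel typically comes from jessie-backports; roughly something like this as root (the mirror URL is just an example):

echo "deb http://httpredir.debian.org/debian jessie-backports main" > /etc/apt/sources.list.d/backports.list
apt-get update
apt-get -t jessie-backports install linux-image-amd64
reboot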

bradfa commented Dec 21, 2016

Can any improvement be made by disabling CONFIG_IPV6_MULTIPLE_TABLES in the kernel configuration?
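
Before going down the custom-kernel path, the current setting can be checked against the running kernel's config:

grep CONFIG_IPV6_MULTIPLE_TABLES /boot/config-$(uname -r)
# Expected to show CONFIG_IPV6_MULTIPLE_TABLES=y on Debian's stock kernels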

bradfa commented Dec 22, 2016

So it seems that Linux 4.8 (or somewhere before it) fixed issues in Linux 3.16 which were causing the ipv6 route cache to churn like crazy. Upgrading to Linux 4.8 has solved my high CPU usage issue, and right now, with 30k pps inbound/outbound, perf shows:

Samples: 4K of event 'cycles', Event count (approx.): 4991862668
Overhead  Command          Shared Object      Symbol
  25.37%  ntpd             [kernel.kallsyms]  [k] irq_entries_start
   1.13%  swapper          [kernel.kallsyms]  [k] irq_entries_start
   1.06%  ntpd             [kernel.kallsyms]  [k] __raw_callee_save___pv_queued_spin_unlock
   0.94%  swapper          [kernel.kallsyms]  [k] native_write_msr
   0.90%  ntpd             [kernel.kallsyms]  [k] do_select
   0.88%  ntpd             [kernel.kallsyms]  [k] entry_SYSCALL_64
   0.85%  swapper          [kernel.kallsyms]  [k] pvclock_clocksource_read
   0.79%  swapper          [kernel.kallsyms]  [k] fib_table_lookup
   0.78%  ntpd             [kernel.kallsyms]  [k] iowrite16
   0.75%  ntpd             [kernel.kallsyms]  [k] __fget_light
   0.75%  ntpd             [kernel.kallsyms]  [k] pvclock_clocksource_read
   0.74%  ntpd             [kernel.kallsyms]  [k] datagram_poll
   0.74%  ntpd             [kernel.kallsyms]  [k] sock_poll
   0.72%  ntpd             [kernel.kallsyms]  [k] ip6t_do_table
   0.68%  ntpd             [kernel.kallsyms]  [k] __check_object_size
   0.63%  ntpd             [kernel.kallsyms]  [k] fib_table_lookup
   0.62%  ntpd             [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
   0.60%  ntpd             [kernel.kallsyms]  [k] ipt_do_table
   0.60%  ntpd             [kernel.kallsyms]  [k] entry_SYSCALL_64_after_swapgs
   0.56%  ntpd             [kernel.kallsyms]  [k] udpv6_recvmsg
   0.53%  swapper          [kernel.kallsyms]  [k] page_to_skb.isra.29
   0.51%  swapper          [kernel.kallsyms]  [k] virtqueue_get_buf
   0.47%  swapper          [kernel.kallsyms]  [k] __netif_receive_skb_core
   0.45%  swapper          [kernel.kallsyms]  [k] __raw_callee_save___pv_queued_spin_unlock
   0.45%  swapper          [kernel.kallsyms]  [k] ip6t_do_table
   0.44%  ntpd             [kernel.kallsyms]  [k] virtqueue_add_outbuf
   0.43%  ntpd             [kernel.kallsyms]  [k] free_old_xmit_skbs.isra.43
   0.43%  ntpd             [kernel.kallsyms]  [k] ip6_pol_route
   0.42%  swapper          [kernel.kallsyms]  [k] ipt_do_table
   0.42%  swapper          [kernel.kallsyms]  [k] virtnet_receive
   0.40%  ntpd             ntpd               [.] 0x0000000000050aea
   0.40%  ntpd             [kernel.kallsyms]  [k] __virt_addr_valid
   0.40%  ntpd             [kernel.kallsyms]  [k] virtqueue_get_buf
   0.39%  swapper          [kernel.kallsyms]  [k] dev_gro_receive
   0.39%  ntpd             [kernel.kallsyms]  [k] dev_gro_receive
   0.38%  ntpd             [kernel.kallsyms]  [k] system_call_fast_compare_end
   0.38%  ntpd             [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   0.36%  ntpd             [kernel.kallsyms]  [k] sock_def_write_space
   0.36%  ntpd             ntpd               [.] 0x000000000001fa63

Which is awesome! :)

bradfa commented Dec 30, 2016

It seems that only the DNS entries starting with '2.*' return AAAA (IPv6) results in my testing from home, no matter what region I query.
I hypothesize that the peak traffic times where I see short-term reductions in inbound traffic are because other IPv6 servers in India come into the pool for a short bit before their scores drop and they get kicked out. To test this I may spin up another ntp server that can take the load and see if that helps to stabilize the India pool at all.
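
This is easy to spot-check from any machine with dig installed; only the 2.* names should come back with AAAA records:

dig +short AAAA 2.in.pool.ntp.org
dig +short AAAA 1.in.pool.ntp.org   # expected: no answer
dig +short AAAA 2.pool.ntp.org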

bradfa commented Jan 3, 2017

I should be able to run perf and tcpdump in parallel for a timed duration as root, like this (example is 10 seconds):

perf record -a -F 1000 tcpdump -i eth0 -w /home/andrew/tcpdump.pcap udp and port 123 & sleep 10; kill %1
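
An alternative sketch that avoids the manual kill (untested here): let perf time itself with a sleep workload and put tcpdump under timeout(1):

perf record -a -F 1000 -o /root/perf.data sleep 10 &
timeout 10 tcpdump -i eth0 -w /home/andrew/tcpdump.pcap udp and port 123 &
wait   # both stop after about 10 seconds; tcpdump flushes the pcap on SIGTERM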

bradfa commented Jan 3, 2017

The other two ipv6-capable servers I see in the India zone are:
http://www.pool.ntp.org/scores/2401:dc00:100:136::34
http://www.pool.ntp.org/scores/2404:e800:3:300:218:186:3:36

Neither can sustain the load of peak hours.
