Upgrade: openSUSE Leap 42.1 -> 42.3
- Provision a new instance using the latest image (https://downloads.snapt.net/) and a demo license if necessary
- Verify OS version and architecture of new instance (`/etc/os-release`, `uname -a`)
- Apply baseline server configuration and hardening:
- Change the administrator password (Snapt UI)
- Change shell account passwords (`passwd && sudo passwd`)
- Deploy ~/.ssh/authorized_keys for public key authentication (SSH)
- Deploy /etc/ssh/sshd_config to disable password authentication and bind to a specific `ListenAddress`
- Deploy /etc/lighttpd/lighttpd.conf to disable external access to insecure HTTP on port 8080 (see the sketch below covering both files)
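A minimal sketch of both hardening changes, assuming a placeholder management address of 10.0.0.10; the exact directives and layout should be adapted to the appliance's existing files:

```
# /etc/ssh/sshd_config (relevant directives only; 10.0.0.10 is a placeholder management IP)
ListenAddress 10.0.0.10
PasswordAuthentication no
ChallengeResponseAuthentication no

# /etc/lighttpd/lighttpd.conf (assumption: bind the plain-HTTP UI listener on 8080 to
# loopback so it is no longer reachable externally; HTTPS remains on its own socket)
server.port = 8080
server.bind = "127.0.0.1"
```

Keep an existing SSH session open while restarting `sshd` (e.g. `sudo systemctl restart sshd`) until key-based login is confirmed, and restart `lighttpd` (or let the Snapt UI manage it) for the 8080 change to take effect.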
- Complete pending items in Status Check in Snapt UI
- Framework Update
- System Patches
- All non-interactive (may require several attempts to install all patches; inspect with `ps aux | grep zypper`)
- All interactive (`sudo zypper patch` and reboot)
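A hedged example of the patching passes, assuming zypper is run from the appliance shell; repeated non-interactive runs are expected because some patches only become installable after earlier ones are applied:

```
# Non-interactive passes; repeat until zypper reports nothing left to do
sudo zypper --non-interactive patch
sudo zypper --non-interactive patch

# Confirm no zypper process is still running in the background
ps aux | grep [z]ypper

# Interactive pass for anything that needs manual confirmation, then reboot
sudo zypper patch
sudo reboot
```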
- Time Sync
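A quick shell check for time sync, assuming the classic ntpd shipped with Leap 42.x (adjust if the appliance uses a different client):

```
sudo systemctl status ntpd   # NTP daemon running?
ntpq -p                      # peers reachable, with one selected (marked *)?
date                         # sanity-check the resulting clock
```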
- Install plugins to match master server, including Redundancy V2
- Add new instance as a new slave node to the master server:
- Redundancy Servers
- Local Replication
- On the master server, perform a Force Sync to export configuration to the new slave node (inspect ARP with `sudo tail -f /var/log/messages`)
- On the new slave node, ensure all services are running (a quick check follows below):
- Balancer (HAProxy)
- Accelerator (NGINX)
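A quick shell sanity check for these services; the systemd unit names are assumptions, since Snapt may manage the daemons through its own scripts:

```
# Confirm the balancer and accelerator processes are up
ps -ef | egrep '[h]aproxy|[n]ginx'

# If they are exposed as systemd units (names are assumptions), check their state too
sudo systemctl status haproxy nginx
```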
It is important to note that not all server settings are copied over during a Force Sync. The following areas MUST be reviewed to ensure the new master node has the appropriate configuration:
- Ensure all slave nodes are fully up to date (some plugins and system patches may still be available to install)
- Ensure all services on master node are properly configured, tested, and reloaded
- Clean up extra files on slave node file systems (e.g. /etc/nginx/...)
- Perform Force Sync from master node to ensure configuration has replicated
- Reload all services on all slave nodes to ensure configuration has been applied
- Compare and manually resolve any configuration discrepancies between master and slave nodes (a diff sketch follows this list), including (but not limited to):
- Setup -> Configuration -> Snapt Configuration
- Remember to turn HTTP Access off once HTTPS has been successfully configured to prevent admin login over an insecure URL
- Setup -> Configuration -> Email Configuration
- Important: Verify ALL tunings for handling maximum concurrent connections have been applied (see ticket #3219) in the following order, and reboot if necessary (example values below)
- /etc/sysctl.conf (reload with `sysctl -p`)
- /etc/security/limits.conf
- /proc/sys/net/ipv4/ip_local_port_range
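A hedged sketch of what these tunings typically look like; the numbers below are placeholders only, and the actual values must come from ticket #3219:

```
# /etc/sysctl.conf -- placeholder values; apply without a reboot via: sudo sysctl -p
net.ipv4.ip_local_port_range = 1024 65535
net.core.somaxconn = 65535

# /etc/security/limits.conf -- raise open-file limits for the proxy processes
*    soft    nofile    65535
*    hard    nofile    65535

# Spot-check the effective port range at runtime
#   cat /proc/sys/net/ipv4/ip_local_port_range
```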
- Ensure monitor scripts have been copied
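One way to surface discrepancies that are easy to miss in the UI is to diff the relevant config trees between nodes; the paths and user below are assumptions, so adjust them to wherever the appliance keeps its HAProxy/NGINX configuration and monitor scripts:

```
# Dry-run rsync from the master highlights files that differ or are missing on a slave
rsync -avnc /etc/haproxy/ admin@<slave>:/etc/haproxy/
rsync -avnc /etc/nginx/   admin@<slave>:/etc/nginx/

# Or diff a single file over SSH
ssh admin@<slave> cat /etc/sysctl.conf | diff /etc/sysctl.conf -
```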
- Take a backup of the master/active node (Utilities -> Snapt Backup)
- Stop Redundancy on any additional slave nodes to facilitate explicit failover to the upgraded node, then Reload Redundancy
- Note: Redundancy will automatically be restarted on all nodes once Start/Reload has been performed on destination master node
- Perform initial failover to upgraded master node:
- Ping a single VIP continuously (i.e. `ping <VIP> -t`) and run the following simultaneously to monitor VRRP activity and verify the traffic switchover: `tail -f /var/log/messages` and `/usr/sbin/tcpdump ip proto \\icmp`
- Verify service health
- Ensure Redundancy has been stopped on current master and failover to new master is successful
- Change SLAVE01 operation mode to master
- Change MASTER operation mode to slave, mapped to new master
- Be sure to visit Local Replication to obtain the Slave Server Key and copy it to the new master
- Optionally map additional slave nodes to new master
- Reload Redundancy on the new master node. Then Start Redundancy on the former master (now slave) node
- Verify VIPs are in Standby mode
- Important: Several anomalies were encountered when attempting to join new slave nodes, including the `lighttpd` service being unable to bind during startup and configurations not replicating properly. Be sure to review items under Redundancy -> Local Replication, re-enabling configuration items which should be copied in future synchronizations initiated from the new master node, before performing a Force Sync.
- As of 5/3/18, new licenses must be downloaded to each node individually in the Snapt UI (Dashboard -> License) in order to apply any extensions, etc.
The following assumes use of an internal DNS server to split traffic to the new load balancer during a production pilot while the old load balancer was being decommissioned.
Overview:
- Add original VIP(s) to Snapt for services ready to be activated in production, Force Sync, and Reload Redundancy
- Delete service in old load balancer to release IP
- Add a second bind directive to listen/frontend groups (HAProxy) referencing the original service IP; retain existing bind addresses to allow time for the DNS update (see the sketch after this list)
- Update internal DNS for services to use original service IP
- Remove original bind address no longer in use from both HAProxy/NGINX and VIPs (Redundancy)
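A minimal sketch of the dual-bind step, assuming a plain HAProxy frontend; the names and addresses are placeholders, and on a Snapt appliance the equivalent change is made through the Balancer UI rather than by editing the file directly:

```
# Frontend accepting both the temporary pilot address and the original
# service IP while internal DNS is updated (addresses are placeholders)
frontend web_frontend
    bind 10.0.1.50:80    # existing pilot/temporary address
    bind 10.0.1.10:80    # original service IP released by the old load balancer
    default_backend web_servers

backend web_servers
    server web01 10.0.2.11:80 check
    server web02 10.0.2.12:80 check
```

Once internal DNS resolves to the original service IP and the old records have expired, the temporary bind line and its VIP under Redundancy can be removed, matching the last step above.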