This script handles failover of the primary WAN interface on a MikroTik router by pinging an 'always available' internet address such as 8.8.4.4.
The script requires the device to have two WAN interfaces, each with a default gateway defined (see below).
The primary WAN interface's default route requires a comment, which the script uses to find the route and change its distance on failover. In this example the primary WAN comment is "primary_route". This route must also have a distance of 1.
The secondary WAN interface's default route must have a distance of 2. Its comment is not important; in this example it is "secondary_route".
The routes:
/ip route
add comment=primary_route disabled=no distance=1 dst-address=0.0.0.0/0 gateway=124.1.1.2 routing-table=main scope=30 suppress-hw-offload=no target-scope=10
add comment=secondary_route disabled=no distance=2 dst-address=0.0.0.0/0 gateway=10.31.0.2 routing-table=main scope=30 suppress-hw-offload=no target-scope=10
The script must always ping the internet endpoint via the primary interface gateway, so two more static routes are required. The blackhole route is essential when the primary WAN gateway itself is the cause of the failover: it prevents the device from sending ping requests to 8.8.4.4 via the secondary WAN gateway, which would make the pings succeed again and cause the WAN gateways to flap up and down.
/ip route
add disabled=no distance=1 dst-address=8.8.4.4/32 gateway=124.1.1.2 routing-table=main scope=30 suppress-hw-offload=no target-scope=10
add blackhole disabled=no distance=2 dst-address=8.8.4.4/32 gateway="" routing-table=main scope=30 suppress-hw-offload=no target-scope=10
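As a quick sanity check, the following commands can be run from the RouterOS CLI to confirm that both host routes for the ping target exist and that the target answers while the primary WAN is healthy. This is only a sketch using the example address from above; adjust it if a different ping target is used.

```routeros
# List the two host routes for the ping target (gateway route and blackhole).
/ip route print where dst-address=8.8.4.4/32

# Send a few test pings; these should succeed while the primary WAN is up.
/ping 8.8.4.4 count=3
```

If the route via 124.1.1.2 becomes inactive, the blackhole route should take over and these pings should fail instead of leaving through the secondary gateway.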
There are multiple variables that can be set at the beginning of the script that affect the operation of the failover:
Option | Default | Description |
---|---|---|
debug | false | Adds additional :log output. |
pingTarget | "8.8.4.4" | IP endpoint to ping, should be an 'always on' and reliable address. |
routeComment | "primary_route" | Must match the primary WAN default gateway comment. This is used to set the route distance on state change. |
pingCount | 15 | Number of ping requests sent in each scheduled check cycle. |
successPercent | 85 | Integer percentage of successful ping responses required per check cycle, i.e. 85 = 85% required to mark the cycle as successful. |
maxAllowedPing | 200 | Maximum time in ms allowed for each ping response. A response that exceeds this time is not included in that cycle's success count and so counts against successPercent. |
failThreshold | 2 | Provides debounce for the DOWN state. ie. 2 x successive ping cycles must fail before Primary WAN is marked DOWN. Set to 1 to disable debounce. |
successThreshold | 3 | Provides debounce for the UP state. ie. 3 x successive ping cycles must succeed before Primary WAN is marked UP. Set to 1 to disable debounce. |
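To illustrate how these options could fit together, here is a minimal sketch of the declarations and of the debounce behaviour described for failThreshold and successThreshold. It is not the script's actual code: the global counters `wanFailCount` and `wanOkCount` and the `cycleOk` flag are hypothetical names introduced only for this example.

```routeros
# Options from the table, declared with their default values.
:local debug false
:local pingTarget "8.8.4.4"
:local routeComment "primary_route"
:local pingCount 15
:local successPercent 85
:local maxAllowedPing 200
:local failThreshold 2
:local successThreshold 3

# Hypothetical counters that persist between scheduler runs.
:global wanFailCount
:global wanOkCount
:if ([:typeof $wanFailCount] = "nothing") do={ :set wanFailCount 0 }
:if ([:typeof $wanOkCount] = "nothing") do={ :set wanOkCount 0 }

# Placeholder for the outcome of the current ping cycle (assumption).
:local cycleOk true

# Count successive results; any opposite result resets the other counter.
:if ($cycleOk) do={
    :set wanOkCount ($wanOkCount + 1)
    :set wanFailCount 0
} else={
    :set wanFailCount ($wanFailCount + 1)
    :set wanOkCount 0
}

# Only act once enough successive cycles agree.
:if ($wanFailCount >= $failThreshold) do={ :log warning "Primary WAN marked DOWN" }
:if ($wanOkCount >= $successThreshold) do={ :log info "Primary WAN marked UP" }
```

This simplified version logs on every cycle once a threshold is reached; a complete script would also track the current state so that the change is applied only once per transition.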
❗ The commands that set the route distance are commented out in the script to avoid any issues during testing. To activate the script, the two commented lines below must be uncommented.
#set $r distance=3
and
#set $r distance=1
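The `$r` in these two lines suggests a loop over the routes that match the comment. As a rough sketch of what such a loop could look like (the surrounding `:foreach` is an assumption, not copied from the script), the primary route is looked up by `routeComment` and its distance is changed:

```routeros
:local routeComment "primary_route"

# Failover: move the primary default route behind the secondary (distance 2).
:foreach r in=[/ip route find where comment=$routeComment] do={
    /ip route set $r distance=3
}

# Recovery: make the primary default route the preferred one again.
:foreach r in=[/ip route find where comment=$routeComment] do={
    /ip route set $r distance=1
}
```

In practice only one of the two loops would run in a given check cycle, depending on whether the primary WAN has just been marked DOWN or UP.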
RouterOS does not do floating point math, so we have to improvise. The script calculates the number of sent pings that must succeed. For example, with the defaults 15 pings are sent and 85% must pass, i.e. 12.75. Rounded up, this is 13 ping responses required.
From the RouterOS CLI:
:put ((15*85)/100)
12
But 85% of 15 rounded up is 13, so:
:put (((15*85)+99)/100)
13
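Adding 99 (one less than the divisor of 100) to the product before dividing pushes any result with a remainder past the next multiple of 100, so the integer division rounds up instead of down. Exact multiples of 100 are left unchanged. It's a little hacky, but until we get floating point math in RouterOS it will do.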
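The same calculation can be written with the option variables. This is a small sketch in which `requiredPings` is a name chosen for the example, not necessarily the one used in the script. It relies on the identity that, for positive integers, rounding a/b up equals the integer division (a + b - 1) / b.

```routeros
:local pingCount 15
:local successPercent 85

# Integer ceiling of (pingCount * successPercent) / 100.
:local requiredPings ((($pingCount * $successPercent) + 99) / 100)

# Prints 13 for the default values.
:put $requiredPings
```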