@ilium007
Last active April 26, 2025 15:16
RouterOS dual WAN failover script

Mikrotik RouterOS Dual WAN Failover Script

This script handles failover of the primary WAN interface on a Mikrotik router. It judges the health of the primary WAN by pinging an 'always available' internet address such as 8.8.4.4.

The script requires the device to have two WAN interfaces, each with a default gateway defined (see below).

The primary WAN interface's default route requires a comment, which the script uses to find the route and change its distance on failover. In this example the primary WAN comment is "primary_route". This route must also have a distance of 1.

The secondary WAN interface's default route must have a distance of 2. Its comment is not important; in this example it is "secondary_route". On failover the script raises the primary route's distance to 3, so the secondary route at distance 2 becomes the preferred default route.

The routes:

/ip route
add comment=primary_route disabled=no distance=1 dst-address=0.0.0.0/0 gateway=124.1.1.2 routing-table=main scope=30 suppress-hw-offload=no target-scope=10
add comment=secondary_route disabled=no distance=2 dst-address=0.0.0.0/0 gateway=10.31.0.2 routing-table=main scope=30 suppress-hw-offload=no target-scope=10

The script must always ping the internet endpoint via the primary interface gateway, so two more static routes are required. The blackhole route is essential when the primary WAN gateway itself is the cause of the failover: without it, the device would send ping requests to 8.8.4.4 via the secondary WAN gateway, the pings would succeed, and the WAN gateways would flap up and down.

/ip route
add disabled=no distance=1 dst-address=8.8.4.4/32 gateway=124.1.1.2 routing-table=main scope=30 suppress-hw-offload=no target-scope=10
add blackhole disabled=no distance=2 dst-address=8.8.4.4/32 gateway="" routing-table=main scope=30 suppress-hw-offload=no target-scope=10

Several variables at the beginning of the script control how the failover operates:

| Option | Default | Description |
| --- | --- | --- |
| debug | false | Adds additional :log output. |
| pingTarget | "8.8.4.4" | IP endpoint to ping; should be an 'always on', reliable address. |
| routeComment | "primary_route" | Must match the comment on the primary WAN default route. Used to find the route whose distance is set on a state change. |
| pingCount | 15 | Number of ping requests sent in each scheduled check cycle. |
| successPercent | 85 | Integer percentage of ping responses that must succeed per check cycle, i.e. 85 = 85% required to mark the cycle as successful. |
| maxAllowedPing | 200 | Maximum time in ms allowed for each ping response. A slower response is not included in that cycle's success count and so counts against successPercent. |
| failThreshold | 2 | Debounce for the DOWN state, i.e. 2 successive ping cycles must fail before the primary WAN is marked DOWN. Set to 1 to disable debounce. |
| successThreshold | 3 | Debounce for the UP state, i.e. 3 successive ping cycles must succeed before the primary WAN is marked UP. Set to 1 to disable debounce. |
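The interaction between failThreshold and successThreshold is easier to see with a small model. The sketch below is written in Python only so it can be run off-router; it is not part of the RouterOS script, and the names `step` and `run` are hypothetical. It assumes each check cycle has already been reduced to a single pass/fail result and mirrors the counter resets used by the script.

```python
def step(state, fail_count, success_count, cycle_ok,
         fail_threshold=2, success_threshold=3):
    """Advance the UP/DOWN state by one check cycle.

    Returns the new (state, fail_count, success_count).
    """
    if state == "UP":
        if cycle_ok:
            return "UP", 0, 0
        fail_count += 1
        if fail_count >= fail_threshold:
            # The RouterOS script would set distance=3 at this point.
            return "DOWN", 0, 0
        return "UP", fail_count, 0

    # state == "DOWN"
    if not cycle_ok:
        return "DOWN", 0, 0
    success_count += 1
    if success_count >= success_threshold:
        # The RouterOS script would set distance=1 at this point.
        return "UP", 0, 0
    return "DOWN", 0, success_count


def run(cycles, state="UP"):
    """Feed per-cycle results (True = cycle passed) and record each state."""
    fails = successes = 0
    history = []
    for ok in cycles:
        state, fails, successes = step(state, fails, successes, ok)
        history.append(state)
    return history


if __name__ == "__main__":
    # One isolated failure is ignored; two in a row trip the failover.
    # Recovery then needs three passing cycles in a row.
    print(run([False, True, False, False, True, True, False, True, True, True]))
```

With the default thresholds a single bad cycle never changes state, and a failed cycle during recovery restarts the count of required successes.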

❗ The script ships with the commands that set the route distance commented out, to avoid any issues during testing. To activate the script, the two commented lines below must be uncommented.

#set $r distance=3

and

#set $r distance=1

Scaled Integer Math

RouterOS does not do floating point math, so we have to improvise. The script calculates how many of the sent pings must succeed. For example, with the defaults 15 pings are sent and 85% must pass, i.e. 12.75, which rounds up to 13 required ping responses.

From the RouterOS CLI:

:put ((15*85)/100)
12

But 85% of 15 rounded up is 13, so:

:put (((15*85)+99)/100)
13

Adding 99 (one less than the divisor) to the product before dividing by 100 pushes any inexact result past the next multiple of 100, so the integer division rounds up. Exact results are unchanged. It's a little hacky, but until we get floating point math in RouterOS it will do.
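The round-up trick can be checked off the router. The sketch below uses Python purely for verification (the script itself is RouterOS). The helper names `min_success` and `true_ceiling` are hypothetical; `min_success` mirrors the script's minSuccess calculation and is compared against a reference ceiling for non-negative inputs.

```python
def min_success(ping_count: int, success_percent: int) -> int:
    """Ceiling of ping_count * success_percent / 100 using integer math only."""
    # Adding (divisor - 1) before the truncating division rounds any
    # inexact result up and leaves exact multiples of 100 unchanged.
    return (ping_count * success_percent + 99) // 100


def true_ceiling(ping_count: int, success_percent: int) -> int:
    """Reference ceiling, also float-free: negate, floor-divide, negate."""
    return -(-(ping_count * success_percent) // 100)


if __name__ == "__main__":
    print((15 * 85) // 100)     # plain integer division rounds down
    print(min_success(15, 85))  # the +99 variant rounds up
    # The two methods agree for every ping count and percentage tried here.
    print(all(min_success(n, p) == true_ceiling(n, p)
              for n in range(1, 101) for p in range(0, 101)))
```

The same pattern works for any positive divisor d: add d - 1 before dividing.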

# Enable or disable debug logging
:local debug false
# Ping target, and the comment of the primary WAN default route (MUST match that route's comment exactly)
:local pingTarget "8.8.4.4"
:local routeComment "primary_route"
# Number of pings per check cycle
:local pingCount 15
# Required success percentage (integer req. here, 85 = 85%)
:local successPercent 85
# Max allowed ping time in milliseconds
:local maxAllowedPing 200
# Debounce thresholds
:local failThreshold 2
:local successThreshold 3
#####################################################
# DO NOT MODIFY BELOW HERE
#####################################################
# Global vars to preserve WAN state between checks
:global wanState
:global wanFailCount
:global wanSuccessCount
# Initialize global state if missing
:if ([:typeof $wanState] = "nothing") do={ :set wanState "UP" }
:if ([:typeof $wanFailCount] = "nothing") do={ :set wanFailCount 0 }
:if ([:typeof $wanSuccessCount] = "nothing") do={ :set wanSuccessCount 0 }
# Initialize counters
:local success 0
:local totalPings 0
# Calculate minimum successes needed (RouterOS has no floats so use integer / scaled math here)
# + 99 before dividing by 100 ensures "round up" behavior (since RouterOS only does integer division)
:local minSuccess (($pingCount * $successPercent + 99) / 100)
# Send $pingCount pings
:foreach r in=[/ping $pingTarget count=$pingCount interval=0.3s size=32 as-value] do={
    :set totalPings ($totalPings + 1)
    :if ([:typeof ($r->"time")] != "nothing") do={
        :local rawTime ($r->"time")
        :local seconds [:tonum [:pick $rawTime 6 8]]
        :local microseconds [:tonum [:pick $rawTime 9 15]]
        :local latency (($seconds * 1000) + ($microseconds / 1000))
        :if ($latency > 0 && $latency <= $maxAllowedPing) do={
            :set success ($success + 1)
            :if ($debug) do={
                :log info ("Failover Check: Ping seq=$($r->"seq") latency=$latency ms - ALLOW")
            }
        } else={
            :if ($debug) do={
                :log warning ("Failover Check: Ping seq=$($r->"seq") latency=$latency ms EXCEEDS threshold ($maxAllowedPing ms) - DISCARD")
            }
        }
    } else={
        :if ($debug) do={
            :log warning ("Failover Check: No reply for ping seq=$($r->"seq")")
        }
    }
}
# Log summary of ping check
:local successPercentResult (($success * 100) / $pingCount)
:log info ("Failover Check: PRIMARY WAN $wanState $success/$pingCount successful pings ($successPercentResult%) ")
# Determine what should happen based on number of successful pings
/ip route
:foreach r in=[find comment=$routeComment] do={
    :if ($wanState = "UP") do={
        :if ($success < $minSuccess) do={
            :set wanFailCount ($wanFailCount + 1)
            :set wanSuccessCount 0
            :if ($wanFailCount >= $failThreshold) do={
                #set $r distance=3
                :log warning ("Failover Check: PRIMARY WAN DOWN after $wanFailCount consecutive failures. Set distance=3")
                :set wanState "DOWN"
                :set wanFailCount 0
                :set wanSuccessCount 0
            } else={
                :log warning ("Failover Check: DOWN debounce in progress - $wanFailCount/$failThreshold failures")
            }
        } else={
            :set wanFailCount 0
            :set wanSuccessCount 0
        }
    } else={
        :if ($wanState = "DOWN") do={
            :if ($success >= $minSuccess) do={
                :set wanSuccessCount ($wanSuccessCount + 1)
                :set wanFailCount 0
                :if ($wanSuccessCount >= $successThreshold) do={
                    #set $r distance=1
                    :log info ("Failover Check: PRIMARY WAN UP after $wanSuccessCount consecutive successes. Set distance=1")
                    :set wanState "UP"
                    :set wanFailCount 0
                    :set wanSuccessCount 0
                } else={
                    :log info ("Failover Check: UP debounce in progress - $wanSuccessCount/$successThreshold successes")
                }
            } else={
                :set wanFailCount 0
                :set wanSuccessCount 0
            }
        }
    }
}
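
The ping loop converts each reply time to whole milliseconds by slicing the time value at fixed character offsets. The sketch below models that step in Python so the arithmetic can be checked off-router. It assumes the reply time renders as a string of the form HH:MM:SS.ffffff, which is what the script's offsets (6 to 8 and 9 to 15) imply but which is not stated in the source; `latency_ms` and `counts_as_success` are hypothetical names.

```python
def latency_ms(raw_time: str) -> int:
    """Convert an 'HH:MM:SS.ffffff' string to whole milliseconds.

    Mirrors the script's offsets: characters 6-7 are the seconds and
    characters 9-14 are the microseconds. Hours and minutes are
    ignored, as they are in the script.
    """
    seconds = int(raw_time[6:8])
    microseconds = int(raw_time[9:15])
    # Integer division truncates, matching RouterOS integer math.
    return seconds * 1000 + microseconds // 1000


def counts_as_success(raw_time: str, max_allowed_ping: int = 200) -> bool:
    """Apply the script's acceptance test to a single reply."""
    latency = latency_ms(raw_time)
    return 0 < latency <= max_allowed_ping


if __name__ == "__main__":
    print(latency_ms("00:00:00.012345"))         # 12 ms reply
    print(counts_as_success("00:00:00.012345"))  # within the 200 ms limit
    print(counts_as_success("00:00:00.250000"))  # slower than the limit
    print(counts_as_success("00:00:00.000800"))  # truncates to 0 ms
```

Under this assumption a reply faster than 1 ms truncates to 0 and fails the `$latency > 0` test, so it is not counted as a success. That is unlikely to matter for an internet target such as 8.8.4.4, but it is worth knowing if pingTarget is ever pointed at a very close host.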