@jprenken
Created February 7, 2022 12:17
OPNsense: Start/stop WireGuard based on CARP state change (place in /usr/local/etc/rc.syshook.d/carp/)
#!/usr/local/bin/php
<?php
/*
* Copyright (C) 2004 Scott Ullrich <sullrich@gmail.com>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
* INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
* AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
* OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
require_once("config.inc");
require_once("interfaces.inc");
require_once("util.inc");
$subsystem = !empty($argv[1]) ? $argv[1] : '';
$type = !empty($argv[2]) ? $argv[2] : '';

if ($type != 'MASTER' && $type != 'BACKUP') {
    log_error("Carp '$type' event unknown from source '{$subsystem}'");
    exit(1);
}

if (!strstr($subsystem, '@')) {
    log_error("Carp '$type' event triggered from wrong source '{$subsystem}'");
    exit(1);
}

if ($type === "MASTER") {
    log_error("Enabling WireGuard due to CARP event '$type'");
    # Checking `isset` avoids a race condition during startup when the
    # WireGuard config stanza seems like it's not yet loaded. Without it, this
    # can create an extra, empty, invalid stanza that breaks WireGuard.
    if (isset($config['OPNsense']['wireguard']['general']['enabled'])) {
        $config['OPNsense']['wireguard']['general']['enabled'] = '1';
    }
    configd_run('wireguard start');
    write_config("Enable WireGuard due to CARP event '$type'", false);
} else {
    log_error("Disabling WireGuard due to CARP event '$type'");
    configd_run('wireguard stop');
    if (isset($config['OPNsense']['wireguard']['general']['enabled'])) {
        $config['OPNsense']['wireguard']['general']['enabled'] = '0';
    }
    write_config("Disable WireGuard due to CARP event '$type'", false);
}
@taxilian

taxilian commented Oct 6, 2022

Any possible solutions there?

The only solution I've come up with is the one I linked a few comments up: https://gist.github.com/jprenken/18ca7bf14ddae547ae0fdf6f56d72573?permalink_comment_id=4309559#gistcomment-4309559

I use a cron job which runs every minute. It seems to solve the problem for me.

@Hobby-Student

Just out of curiosity:
If this script enables WireGuard after a MASTER/BACKUP change, but one still needs to click "Apply" in the GUI to get WireGuard running, while the other script, run by cron every minute, works "out of the box"... has someone tried to add this simple line configd_run('wireguard start'); twice? perhaps the cron job is "applying" the config on the second run.

@alfrisch

has someone tried to add this simple line configd_run('wireguard start'); twice? perhaps the cron job is "applying" the config on the second run.

I tried your suggestion and it works when I add a sleep(1); between the two calls of configd_run('wireguard start');! I guess that OPNsense needs some time to update things on the backend before it can actually activate the tunnels.

@jprenken do you have any insight into why the sleep is needed here?
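
For readers following along, the workaround described above (start, wait, start again) can be sketched roughly as follows. This is only an illustration, not the gist author's code: `start_twice` is a hypothetical helper, and the callable passed to it stands in for `configd_run('wireguard start')` so the snippet also runs outside OPNsense.

```php
<?php
/**
 * Run a start action, wait, then run it again.
 *
 * $start is a stand-in for something like
 *   function () { configd_run('wireguard start'); }
 * inside the CARP hook. The helper name is hypothetical.
 */
function start_twice(callable $start, int $delaySeconds = 1): void
{
    $start();              // first attempt; may not take effect yet
    sleep($delaySeconds);  // give the backend time to settle
    $start();              // second attempt
}

// Stand-alone demonstration with a stub instead of configd_run().
$calls = [];
start_twice(function () use (&$calls) {
    $calls[] = 'wireguard start';
}, 0);
```

Making the delay a parameter keeps the one-second value from this comment adjustable, since the right value appears to depend on the machine.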

@jprenken
Author

Unfortunately, no; I don't understand much about OPNsense internals. I would love for this to get properly upstreamed into the project, but don't have the time to get up to speed and propose it.

@raspitoaster

raspitoaster commented Dec 30, 2022

I tried both approaches, the CARP hook and the CRON job, but neither worked.
The handover of the WG IP and the start/stop of the WG service seemed to work but the backup firewall did not take over the peers.
When switching back, the master firewall started the WG service and immediately connected with the peers.
After some packet and log analysis I found out that I had a combination of two problems:

  1. The stupid MAC/name/IP assignment of the Fritz!Box, which refused to communicate with the virtual IP of the WAN interface. The solution was to add a new client to the Fritz!Box network configuration with a fresh (not yet used) IP address and then configure the port forwarding.
  2. The sleep(1) from the comment of @Hobby-Student really does seem to be necessary. With the 1-second delay and the second configd run, the peers switch perfectly from master to backup and back.

The CRON job from @taxilian may also work, but the CARP hook is my favorite.

@nzkiwi68

nzkiwi68 commented Jan 17, 2023

Good morning jprenken. I have made a fork of your code and made changes which, in my testing, have made the HA CARP fail-over for WireGuard more reliable.

I invite you to inspect my fork and if you agree add these changes to your script.
https://gist.github.com/nzkiwi68/5b54aece233ff72ada395b5a1bdad92c

I would have done a formal pull request, but it seems although GitHub allows a Gist to be forked, I cannot make a pull request against a Gist, only a conventional GitHub repository.

@tars21

tars21 commented Jan 19, 2023

Good morning! Your variant, nzkiwi68, unfortunately did not work for us. I have not analyzed it further, but I included the "sleep" part in this gist here. We have many interfaces, and one second was evidently not enough here either:
/usr/local/etc/rc.bootup: Unable to configure nonexistent interface opt18 (wg0)
At 3 seconds, though, the switch seems to run reliably so far!
Thanks for that.
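
Since the delay that works seems to vary between machines (one second was not enough here, three were), one possible alternative to a fixed sleep is to poll for a readiness condition with a timeout. The sketch below is not taken from the gist or from OPNsense: `wait_until` is a hypothetical helper, and the readiness check is a stub so the snippet runs on its own.

```php
<?php
/**
 * Poll $isReady until it returns true or $timeoutSeconds elapse.
 * Returns true if the condition was met, false on timeout.
 * Hypothetical helper, not part of OPNsense.
 */
function wait_until(callable $isReady, int $timeoutSeconds = 10, int $intervalMicros = 500000): bool
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        if ($isReady()) {
            return true;
        }
        usleep($intervalMicros);  // pause before the next check
    } while (microtime(true) < $deadline);

    return false;
}

// Stub check that reports "ready" on the third poll.
$polls = 0;
$ready = wait_until(function () use (&$polls) {
    return ++$polls >= 3;
}, 5, 1000);
```

Inside the hook, the check might test whether the interface from the log line (wg0) exists yet, but what the right condition is on a given system is an assumption that would need testing.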

@nzkiwi68

I have really come to the conclusion that the answer lies not in the CARP syshook and running the start multiple times with a sleep statement, but in debugging and fixing the wireguard start command that actually gets run.

See my new forum post and franco's reply:
https://forum.opnsense.org/index.php?topic=31962.0

Franco talks about:

In a nutshell it just calls

# /usr/local/etc/rc.d/wireguard start

and does whatever the RC system deems appropriate. No clue what's wrong in your case, but I do know WireGuard doesn't make itself any easier to debug experimental or not.

@Hobby-Student

Hobby-Student commented Jan 20, 2023

Glad to see that the sleep and calling configd_run twice seem to help. It looks like there is a race condition between enabling WireGuard, getting things ready (on the OPNsense side), and starting WireGuard. I'm not familiar with the OPNsense internals or with what $config['OPNsense']['wireguard']['general']['enabled'] = '1'; actually triggers. If it just writes "1" to config.xml, nothing is loaded yet. If configd_run('wireguard start'); then runs, it picks up the "wireguard enabled" change from config.xml and informs the running system. While the system is processing that change, it can't start WireGuard, because this first attempt still relies on "wireguard disabled". The sleep gives the system (depending on the hardware used) enough time to finish all pending calls, and the second configd_run('wireguard start'); finally starts WireGuard, because WireGuard is now enabled in config.xml.

I don't know when I can take a deeper look, but I would think of:

-> $config['OPNsense']['wireguard']['general']['enabled'] = '1';
-> "reload settings" of opnsense and wait for completion
-> configd_run('wireguard start');
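
The ordering proposed above could look roughly like the sketch below. It only illustrates the sequence: `enable_then_start` is a hypothetical helper, and the three steps are passed in as callables because the thread does not say which OPNsense call performs the "reload settings" step. The stubs make the snippet runnable on its own.

```php
<?php
/**
 * Run the three steps strictly in order.
 * Each argument is a callable; inside the hook they might wrap
 * write_config(...) and configd_run('wireguard start'). The reload
 * step is deliberately left abstract (an assumption, see above).
 */
function enable_then_start(callable $persistEnabled, callable $reloadAndWait, callable $start): void
{
    $persistEnabled();  // 1. set the 'enabled' flag and persist it
    $reloadAndWait();   // 2. reload settings and block until finished
    $start();           // 3. only now start WireGuard
}

// Stand-alone demonstration with stubs that record the call order.
$order = [];
enable_then_start(
    function () use (&$order) { $order[] = 'persist'; },
    function () use (&$order) { $order[] = 'reload'; },
    function () use (&$order) { $order[] = 'start'; }
);
```

Note that the gist's MASTER branch calls configd_run('wireguard start') before write_config(), so this sketch differs from it in persisting the flag first.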

@Bubbgump209

If anyone is interested, I put in this PR opnsense/plugins#3299. I think it checks all the boxes folks are talking about here. Worst case it is rejected, but feel free to test.

@nzkiwi68

nzkiwi68 commented Feb 8, 2023 via email

@Bubbgump209

Derp, good catch! Fixed.

@kub3let

kub3let commented Sep 7, 2023

It seems the script broke with opnsense/plugins@86c9e5c

The configd call now requires the WireGuard instance to be passed as a parameter.

So if you upgrade to 23.7.3 it breaks.

The following works:

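// $backend was not defined in the original snippet; OPNsense's Backend class is assumed here
$backend = new \OPNsense\Core\Backend();
$servers = (new \OPNsense\Wireguard\Server())->servers->server->iterateItems();

// start every enabled WireGuard instance, passing its key to configd
foreach ($servers as $key => $node) {
    if (!empty((string)$node->enabled)) {
        $backend->configdRun("wireguard start {$key}");
    }
}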

// repeat for stop
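
To avoid duplicating the loop for stop, the command strings could be built by one small helper, as in the sketch below. This is an illustration only: `wireguard_commands` is a hypothetical function, the instance keys are made up, and a plain array replaces the model iterator so the snippet runs outside OPNsense.

```php
<?php
/**
 * Build the configd command string for every enabled instance.
 *
 * $instances maps an instance key to its "enabled" flag ('1' or '0');
 * $action is 'start' or 'stop'. Hypothetical helper, not part of OPNsense.
 */
function wireguard_commands(array $instances, string $action): array
{
    if (!in_array($action, ['start', 'stop'], true)) {
        throw new InvalidArgumentException("Unknown action '{$action}'");
    }

    $commands = [];
    foreach ($instances as $key => $enabled) {
        if (!empty($enabled)) {  // '0' and '' count as disabled
            $commands[] = "wireguard {$action} {$key}";
        }
    }
    return $commands;
}

// Made-up instance keys for demonstration only.
$instances = ['uuid-a' => '1', 'uuid-b' => '0'];
$startCmds = wireguard_commands($instances, 'start');
$stopCmds = wireguard_commands($instances, 'stop');
```

In the hook, each returned string would then be passed to the configd call used in the snippet above.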

@nzkiwi68

This script is now unnecessary: proper CARP support for WireGuard has been built into OPNsense since 23.7.8 (released 09 Nov 2023) and has been further improved in the latest OPNsense firmware.

The WireGuard follow CARP implementation by the OPNsense dev team is excellent and it works really well!
