@willjasen
Last active June 25, 2024 12:46
Create a Proxmox cluster that communicates over Tailscale

‼️ DANGER ‼️

In the interest of complete transparency: if you follow this guide, there's a minuscule but non-zero chance that you may violate the Bekenstein bound, at which point the resulting black hole may swallow the Earth whole. You have been warned!


⚠️ WARNING ⚠️

  • This guide is for development, testing, and research purposes only. It comes with no guarantee or warranty that these steps will work within your environment. Should you attempt this in a production environment, any negative outcomes are not the fault of this guide or its author.
  • This guide was tested on Proxmox 8 / Debian 12.

📝 Prologue 📝

  • This example uses "host1" and "host2" as example names for the hosts
  • This example uses "example-test.ts.net" as a Tailscale MagicDNS domain
  • The Tailscale IP for host1 is 100.64.1.1
  • The Tailscale IP for host2 is 100.64.2.2

📋 Steps 📋

  1. Set up two Proxmox hosts

  2. Install Tailscale on the hosts (curl -fsSL https://tailscale.com/install.sh | sh), then bring each host onto your tailnet with tailscale up

  3. Update /etc/hosts on all hosts with the proper host entries:

    • 100.64.1.1 host1.example-test.ts.net host1
    • 100.64.2.2 host2.example-test.ts.net host2
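
    The hosts-file step above can be sketched as a small script. HOSTS_FILE here defaults to a demo file so the sketch is safe to run anywhere; on the real nodes, set it to /etc/hosts:

    ```shell
    #!/bin/sh
    # Append the Tailscale name/IP entries for every cluster member.
    # HOSTS_FILE defaults to a demo file; on the real nodes, use /etc/hosts.
    HOSTS_FILE="${HOSTS_FILE:-./hosts.demo}"
    cat >> "$HOSTS_FILE" <<'EOF'
    100.64.1.1 host1.example-test.ts.net host1
    100.64.2.2 host2.example-test.ts.net host2
    EOF
    echo "hosts entries now in $HOSTS_FILE:"
    grep 'example-test.ts.net' "$HOSTS_FILE"
    ```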
  4. Since DNS queries will be served via Tailscale, ensure that the global nameserver configured in your tailnet's DNS settings can resolve host1 as 100.64.1.1 and host2 as 100.64.2.2

  5. If your Tailscale ACL restricts this traffic, allow TCP 22 (SSH), TCP 8006 (Proxmox web UI), and UDP 5405-5412 (corosync); for example:

    {"action": "accept", "proto": "tcp", "src": ["host1", "host2"], "dst": ["host1:22"]},   // SSH
    {"action": "accept", "proto": "tcp", "src": ["host1", "host2"], "dst": ["host2:22"]},   // SSH
    {"action": "accept", "proto": "tcp", "src": ["host1", "host2"], "dst": ["host1:8006"]}, // Proxmox web
    {"action": "accept", "proto": "tcp", "src": ["host1", "host2"], "dst": ["host2:8006"]}, // Proxmox web
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5405"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5406"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5407"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5408"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5409"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5410"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5411"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host1:5412"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5405"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5406"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5407"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5408"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5409"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5410"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5411"]}, // corosync
    {"action": "accept", "proto": "udp", "src": ["host1", "host2"], "dst": ["host2:5412"]}, // corosync
    
  6. Create the cluster on host1 (so that host2 has a cluster to join), binding link0 to host1's Tailscale IP; for example, run from host1: pvecm create <cluster-name> --link0 100.64.1.1

  7. For clustering to initially succeed, every cluster member must have only a link0 within corosync, associated with Tailscale (if any other links exist within corosync, they must be temporarily removed for this initial member addition to succeed). To have host2 join host1's cluster, run from host2: pvecm add host1 --link0 100.64.2.2
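
    To confirm the link0 requirement above, you can inspect which addresses corosync is configured with; a minimal sketch, assuming the standard Proxmox config path /etc/pve/corosync.conf (CONF defaults to a demo file here so it can run anywhere):

    ```shell
    #!/bin/sh
    # List every node's ring0_addr (link0) from corosync.conf and flag any
    # address outside Tailscale's range (a rough check: Tailscale assigns
    # from the CGNAT block 100.64.0.0/10).
    CONF="${CONF:-./corosync.conf.demo}"
    grep -E 'ring0_addr' "$CONF" | while read -r _ addr; do
      case "$addr" in
        100.*) echo "looks like a Tailscale address: $addr" ;;
        *)     echo "NOT a Tailscale address: $addr" ;;
      esac
    done
    ```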

  8. SSH from host1 to host2 and vice versa so that each host accepts the other's SSH host key; until this is done, tasks like migrations and replications may fail:

    • ssh host1
    • ssh host2
  9. That should do it! Test, test, test!

To add a third member to the cluster (and so on), repeat these steps for each new member.


🔧 Troubleshooting 🔧

Should clustering not be successful, you'll need to do two things:

  1. Remove the failed member from host1 by running: pvecm delnode host2
  2. Reset clustering on host2 by running:

    systemctl stop pve-cluster corosync
    pmxcfs -l
    rm -rf /etc/corosync/*
    rm /etc/pve/corosync.conf
    killall pmxcfs
    systemctl start pve-cluster
    pvecm updatecerts

Then try again.

@protonaut

protonaut commented Jun 22, 2024

Hi, when I followed your instructions I got stuck joining host2 into the cluster, because I had to create the cluster with a link0 containing a non-Tailscale IP.
Is an additional Linux bridge needed, or how did you create the cluster with a link0 containing an IP address from Tailscale?
