
@scyto
Last active December 26, 2024 14:36
my proxmox cluster

ProxMox Cluster - Soup-to-Nutz

aka what I did to get from nothing to done.

note: these are designed primarily as a re-install guide for myself (writing things down helps me memorize the knowledge), so don't take any of this on blind faith - some areas are well tested and the docs are robust; other items, less so. YMMV

Purpose of Proxmox cluster project

Required Outcomes of cluster project

[image]

The first 3 NUCs are the new Proxmox cluster; the second set of 3 NUCs are the old Hyper-V nodes.

Update as of 9/30/2023: This cluster is no longer a PoC and is my production cluster for all my VMs and docker containers (in a VM-based swarm).

All my initial objectives have been achieved and then some. All VMs migrated from Hyper-V and working - despite some stupidity on my part - though I learnt a lot!

I will update if and when I make major changes, redesign, or add new capabilities, but to be clear I now consider this gist set complete for my needs and have no more edits planned.

If you spot a critical typo let me know and I can change it, but as these are notes for me (not a tutorial) I make no promises :-)

Outcomes

  1. Hardware and Base Proxmox Install

  2. Thunderbolt Mesh Networking Setup

  3. Enable OSPF Routing On Mesh network - deprecated - old gist here

  4. Enable Dual Stack (IPv4 and IPv6) Openfabric Routing on Mesh Network

  5. Setup Cluster

  6. Setup Ceph and High Availability

  7. Create CephFS and storage for ISOs and CT Templates

  8. Setup HA Windows Server VM + TPM

  9. How to migrate Gen2 Windows VM from Hyper-V to Proxmox

    1. Notes on migrating my real world domain controller #2
    2. Notes on migrating my real world domain controller #1 (FSMO holder, AAD Sync and CA server)
    3. Notes on migrating my windows (server 2019) admin center VM
  10. Migrate HomeAssistant VM from Hyper-V

  11. Migrate my debian VM based docker swarm from Hyper-V to proxmox

  12. Extra Credit (optional):

    1. Enable vGPU Passthrough (+ Windows guest, CT guest configs)
    2. Install Let's Encrypt Cert (CloudFlare as DNS Provider)
    3. Azure Active Directory Auth
    4. Install Proxmox Backup Server (PBS) on synology with CIFS backend
    5. Send email alerts via O365 using Postfix HA Container
  13. Random Notes & Troubleshooting

TODO

  • add TLS to the mail relay? with LE certs? maybe?
  • maybe send syslog to my syslog server (securely)
  • figure out ceph public/cluster running on different networks - unclear it's needed for this size of install
  • get all nodes listening to my network UPS and shut down before power runs out
  • For the docker VMs implement both CephFS via virtiofs and a CephFS docker volume, and test which I like best in a swarm - using this ceph volume guide and this mounting guide by Drallas - using one of these three ceph volume plugins: Brindster/docker-plugin-cephfs, flaviostutz/cepher, n0r1sk/docker-volume-cephfs. Each has different strengths and weaknesses (I will likely choose either the n0r1sk or the Brindster one).

Purpose of cluster

I have been using Hyper-V for my docker swarm cluster VM hosts (see my other gists). The original intention was to try to get Thunderbolt networking for a Hyper-V cluster going, plus clustered storage for the VMs. This turned out to be super hard when using NUCs as cluster nodes due to too few disks. I looked at solar winds as an alternative, but it was both complex and not pervasive.

I had been watching proxmox for years and thought now was a good time to jump in and see what it is all about. (I had never booted or looked at the proxmox UI before doing this - so this documentation is soup to nuts and intended for me to repro if needed.)

Goals of Cluster

  1. VMs running on clustered storage {completed}
  2. Use of Thunderbolt for ~26Gbps cluster VM operations (replication, failover, etc.)
    • Thunderbolt mesh with OSPF routing {completed}
    • Ceph over thunderbolt mesh {completed}
    • VM running with live migration {completed}
    • VM running with HA failover on node failure {completed}
    • Separate VM/CT migration network over thunderbolt mesh {not started}
  3. Use low powered off the shelf Intel NUCs {completed}
  4. Migrate VMs from Hyper-V:
    • Windows Server Domain Controller / DNS / DHCP / CA / AAD Sync VMs {not started}
    • Debian Docker host VMs (for my running 3-node swarm) {not started}
    • HomeAssistant VM {not started}
  5. Sized to last me 5+ years (lol, yeah, right)

Hardware Selected

  1. 3x 13th Gen Intel NUCs (NUC13ANHi7):
    • Core i7-1360P Processor (12 cores, 16 threads, up to 5.0 GHz)
    • Intel Iris Xe Graphics
    • 64 GB DDR4 3200 CL22 RAM
    • Samsung 870 EVO SSD 1TB Boot Drive
    • Samsung 980 Pro NVME 2 TB Data Drive
    • 1x onboard 2.5GbE LAN port
    • 2x onboard Thunderbolt 4 ports
    • 1x 2.5GbE using the Intel NUCIOALUWS expansion module
  2. 3 x OWC TB4 Cables

Key Software Components Used

  1. Proxmox v8.x
  2. Ceph (included with Proxmox)
  3. LLDP (included with Proxmox)
  4. FRRouting (FRR) - OSPF/OpenFabric - (included with Proxmox)
  5. nano ;-)
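Since FRR does the routing over the Thunderbolt mesh (outcome 4 above replaces OSPF with OpenFabric), here is a rough sketch of what the fabricd side of an FRR config looks like. The interface names `en05`/`en06` and the NET address are placeholder values, not the ones from this gist - check the linked OpenFabric gist for the real settings:

```
# /etc/frr/frr.conf fragment (sketch - fabricd must be enabled in /etc/frr/daemons)
interface en05
 ip router openfabric 1
 ipv6 router openfabric 1
interface en06
 ip router openfabric 1
 ipv6 router openfabric 1
!
router openfabric 1
 net 49.0000.0000.0001.00
```

Each node needs a unique NET (the last non-zero octets differ per node); FRR then advertises the nodes' loopback addresses over whichever Thunderbolt links happen to be up, which is what makes the mesh survive a cable or node failure.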

Key Resources Leveraged

Proxmox/Ceph Guide from packet pushers

Proxmox Forum - several community members were invaluable in providing me a breadcrumb trail.

systemd.link manual pages

udevadm manual

udev manual
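The systemd.link/udevadm/udev references above are what the gist leans on to give the Thunderbolt interfaces stable names. A minimal sketch, assuming a hypothetical PCI path and the name `en05` (find your own values with `udevadm info /sys/class/net/<iface>`):

```
# /etc/systemd/network/00-thunderbolt0.link (example values, not from the gist)
[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net

[Link]
MACAddressPolicy=none
Name=en05
```

systemd-udevd applies .link files when the device appears, so the rename survives reboots and cable re-plugs.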

@chrissi5120

what do you guys think about intel z890 with native thunderbolt 4?
This setup comes in with a bigger footprint but is more versatile..

Sadly, i cant find anything about tb4 networking on these boards, but being "native tb4" they should do the trick?

@scyto what is your opinion if i may ask?

@scyto (author) commented Nov 21, 2024

@chrissi5120 Short version: you want to get a board that uses a "Software Connection Manager" - these are the only ones capable of cross-domain channel bonding to get the ~26Gbps throughput (40Gbps reported link speed).

long version
I have had a support email thread with ASUS for the last 6 months trying to get sense out of them about which motherboards do and don't have that. For example, I have proof that all Z790 motherboards used a hardware connection manager and a discrete Thunderbolt chip. This was stupid, as the 13th and 14th gen processors have the needed controllers on the chipset/CPU to do a software connection manager. I paid $1000 for my Maximus Extreme; imagine how annoyed I am that it isn't fully TB4/USB4 compliant.

I have asked for a list of their new Z890 motherboards; the reply wasn't clear on whether they use a SW connection manager or not. I give the ones that use Thunderbolt 4/5 add-in cards the lowest chance of having a SW connection manager.

If you know someone running one of these boards with the latest Windows 11, it's easy to tell - in Device Manager they will see USB4 router devices, and in the Settings app there will be a new page about USB4 domains.

I won't be buying a Z890-based system until I have evidence the board runs a software connection manager.

What's wild is that ASUS refuse to believe me (despite my sending them copious evidence) that their motherboards cannot do 40Gbps link speed for peer-to-peer networking over Thunderbolt. I am pretty damn annoyed at them. I really need someone at level1tech or nexus to go get sense out of them.

All devices that use mobile chipsets and CPUs (NUCs, MS01, ZimaCube Pro, etc.) seem to have full TB4 support.

Hope that helps you dodge a bullet. Or buy a Z890, test it, report back, and send it back for a refund if it doesn't support SW CM ;-)

@chrissi5120

> (quotes @scyto's reply above)

Ah man... thank you so much for your quick feedback. I was just about to be a smartass and buy last-gen Intel hardware on eBay..
I will definitely not buy Z890 based on your explanation, and will go for NUC-like hardware, probably exactly your setup..

The thing is, i wanted to save some bucks in the beginning and expand later with more and better additional hardware / a better upgrade-path.

I think your reasoning has a specifically smart angle for someone from Germany: Currently, I pay 32 euro cents per kWh and based on that fact alone, a NUC-Setup will save me a lot of money in the long run.

Thank you very much again. Stuff like this is just pure gold for the community.

@chrissi5120

just one more "low-cost" idea..

AM4/AM5 mainboard with G-series CPU (integrated GPU)
up to 128GB memory
PCIe network card like the Dell EMC Broadcom BCM57414, which is said to just sip power at around 5W
DAC cabling

this might just come out much cheaper, but of course has a giant footprint compared to your solution.

the guide you provided could be forked and reused 99% for this kind of setup i think?

Power consumption would be "much" higher but still manageable.
