Proxmox MS-01 Cluster w/ Ceph Ring Network

Shopping List

I was looking for mini PCs with SFP+ and found a lot of fairly expensive small servers that were tempting. Then I got lucky and saw a new product coming out from Minisforum, the MS-01, which had everything I needed at a much lower price point.


I went with the 20-core Intel i9-13900H, but I think any of the three CPU options would have been fine for my needs.

This thread has a lot of great recommendations on what is compatible.

I got six of these drives from eBay to run my Ceph OSDs on.


I went with 96GB of RAM, which isn't officially supported but works. There are many other options if you want to save some money. I ended up buying these on Newegg because Amazon had a 15-day wait.

I went with this drive for boot because it was fast and cheap.


https://www.amazon.com/dp/B0CK39YR9V?psc=1&ref=ppx_yo2ov_dt_b_product_details

Cable Matters USB 4 Cable 3.3 ft

https://www.amazon.com/dp/B094STPLX3?ref=ppx_yo2ov_dt_b_product_details&th=1

These are for the Ceph ring network.

SFP+ DAC

Currently going with one per device; not sure if separating networks or link aggregation would be a good idea.

https://store.ui.com/us/en/collections/unifi-accessory-tech-cable-sfp/products/10gbps-direct-attach-cable?variant=uacc-dac-sfp10-3m&category=ce60c14e-dd22-47a9-b051-37e1a48b8d4f

Update to no-subscription repos

Navigate to Datacenter -> your server, then select Updates -> Repositories.


Add No Subscription repository:


Disable pve-enterprise.list


Add no-subscription for ceph-quincy and disable the enterprise version


Your repositories should now show the no-subscription entries enabled and the enterprise entries disabled.

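If you'd rather do this from the shell, the end state is just a couple of apt source entries. A minimal sketch, assuming PVE 8 on Debian bookworm with Ceph Quincy (the file names are the usual defaults):

# /etc/apt/sources.list.d/pve-no-subscription.list
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription

# /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription

# /etc/apt/sources.list.d/pve-enterprise.list - commented out = disabled
# deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise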

Updating microcode

Now that the repositories are up to date we can pull packages and update the microcode, which is critical for this device to function (per: insert YouTube video here).

First we need to add something to one of the package sources:

In /etc/apt/sources.list, append non-free-firmware to the bookworm line so it reads:

deb http://ftp.us.debian.org/debian bookworm main contrib non-free-firmware


Save the file and take a snapshot of the microcode version with the following command:

grep 'stepping\|model\|microcode' /proc/cpuinfo

Now update the microcode by running:

apt clean
apt update
apt install intel-microcode

Reboot to apply the new microcode.


Once it comes back up run the grep again to see the new version of microcode:

grep 'stepping\|model\|microcode' /proc/cpuinfo

Confirm the version changed; you probably only need to check the microcode hex value, but I diffed the entire output.

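If you want something better than eyeballing it, save the pre-reboot output to a file and diff it afterward:

grep 'stepping\|model\|microcode' /proc/cpuinfo > /root/microcode-before.txt
# ...reboot, then:
diff /root/microcode-before.txt <(grep 'stepping\|model\|microcode' /proc/cpuinfo)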

Ring Network

This thread has some good benchmarks to compare against.

Following the Thunderbolt gist this build is based on, I edited the config file it calls out, then ran udevadm monitor and replugged the cable to confirm the hotplug events:

root@pve01:~# udevadm monitor
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent

KERNEL[732628.801886] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
KERNEL[732628.801908] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
KERNEL[732628.801923] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
UDEV  [732628.804386] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
UDEV  [732628.804544] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
UDEV  [732628.804706] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
KERNEL[732628.805618] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)
UDEV  [732628.805780] remove   /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)
KERNEL[732633.651350] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
UDEV  [732633.654275] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
KERNEL[732633.662074] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
KERNEL[732633.662088] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
UDEV  [732633.662430] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
UDEV  [732633.662937] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
KERNEL[732637.971167] change   /1-1 (thunderbolt)
UDEV  [732637.973575] change   /1-1 (thunderbolt)
KERNEL[732638.991250] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)
UDEV  [732638.991928] add      /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)

Then I copy-pasted these systemd link files:

nano /etc/systemd/network/00-thunderbolt0.link

[Match]
Path=pci-0000:00:0d.3
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en05

And for the second port:

nano /etc/systemd/network/00-thunderbolt1.link

[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en06
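.link files are applied by udev when the interface is added, so the en05/en06 names won't show up until the devices reappear. A reboot is the sure way; my assumption is that reloading udev and replugging the Thunderbolt cable also does it:

udevadm control --reload
# unplug/replug the cable, then check:
ip -br link | grep en0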

FIX NETWORK AFTER BOOT

This part is still broken, but as a workaround I added a few suggestions I found in the good gist to my interfaces file...

auto lo
iface lo inet loopback

# Begin thunderbolt edits

auto lo:0
iface lo:0 inet static
        address 10.0.0.81/32

auto lo:6
iface lo:6 inet static
        address fc00::81/128

# End thunderbolt edits

iface enp2s0f0np0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.0.34/24
        gateway 192.168.0.1
        bridge-ports enp2s0f0np0
        bridge-stp off
        bridge-fd 0

iface enp87s0 inet manual

iface enp90s0 inet manual

iface enp2s0f1np1 inet manual

iface wlp91s0 inet manual

# Begin thunderbolt edits

auto en05
allow-hotplug en05
iface en05 inet manual
        mtu 65520

iface en05 inet6 manual
        mtu 65520

auto en06
allow-hotplug en06
iface en06 inet manual
        mtu 65520

iface en06 inet6 manual
        mtu 65520

# End thunderbolt edits

source /etc/network/interfaces.d/*

# TB last line
post-up /usr/bin/systemctl reset-failed frr.service
post-up /usr/bin/systemctl restart frr.service
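After a reboot, a quick sanity check that the renames and the 65520 MTU took:

ip link show en05
ip link show en06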

And then I added a super hacky cronjob at boot which fixed everything:

@reboot sleep 60 && /usr/bin/systemctl restart frr.service
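If you want that without opening the editor, a one-liner to append it to root's crontab:

(crontab -l 2>/dev/null; echo '@reboot sleep 60 && /usr/bin/systemctl restart frr.service') | crontab -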

Windows 2022 - follow the other gist

HAOS

HAOS dark magic - https://community.home-assistant.io/t/installing-home-assistant-os-using-proxmox-8/201835

K8

We are going to have three workers, one pinned to each MS-01 on the host's M.2 storage, plus one HA control plane (master) node which will not run pods but will keep the control plane highly available, so if one of the pinned workers goes down the cluster can still migrate pods.
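kubeadm taints control-plane nodes by default, which is what keeps pods off the master. Once the cluster is up (much further below) you can confirm the taint - the node name kubevip is from my setup:

kubectl describe node kubevip | grep -i taints
# expected: node-role.kubernetes.io/control-plane:NoSchedule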

Start with Debian. This guide is good: https://i12bretro.github.io/tutorials/0191.html

Increasing storage is easy but decreasing can lead to catastrophic failure. Start small with storage!

Each MS-01 gets a beefy worker VM.

And the MS-01 with only one worker VM (since it also runs Windows and HAOS) gets the control plane. Make sure to use the Ceph VM Disks volume created in the good gist.

Add the control plane node under Datacenter -> HA -> Add and select the group created with the gist.

Configure VM

First add yourself to sudo so you can do anything:

su root 
nano /etc/sudoers

Do this: give your user full sudo with a line like thaynes ALL=(ALL:ALL) ALL (an assumption - use whatever entry fits your setup).

I read that you need to install the QEMU guest agent, but it seems to already be there. To make sure, you can run:

sudo apt update
sudo apt install qemu-guest-agent -y
sudo systemctl enable qemu-guest-agent --now

The last command doesn't seem to do anything but I'm not worried yet...

Enable the QEMU guest agent option in the VM's Options in Proxmox.

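Once the option is on and the VM has been power-cycled, you can verify the agent end to end from the Proxmox host shell - 101 here is a placeholder for your VM ID:

qm agent 101 ping
# a silent exit with code 0 means the agent answered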

Eject the install ISO while you're at it.

Install NOMACHINE so we can copy paste!

Since I couldn't copy-paste yet, I just opened Firefox in the VM and downloaded https://downloads.nomachine.com/download/?id=1

Then run the command on the page:

sudo dpkg -i nomachine_8.11.3_4_amd64.deb

At this point you might as well jump ahead and disable swap (covered below), since we need to reboot anyway and can switch to NoMachine after.

This is also a good time to fix the IP (or configure it as static in the VM) and add a DNS record so we don't need the IP.

Install docker here: https://docs.docker.com/engine/install/debian/

The docs say to remove conflicting packages, but we should be clean anyway:

for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done

Then add Docker's GPG key and repository:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Now everything is ready to install:

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

And test!

sudo docker run hello-world


Read about K8 and get ready: https://kubernetes.io/docs/setup/

Get kubectl going: https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/

I recommend following the official document but the commands I ran are documented here for comparison...

Make sure to use amd64 packages for the Debian VMs. I ran these in ~/, but you may want to run them in ~/Downloads instead.

Get installer:

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"

Validate:

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check

You should see kubectl: OK.

Install it since it checks out:

sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Validate

kubectl version --client


It's a bit confusing here because the doc says to run kubectl cluster-info, but we haven't even begun to create a cluster. Scroll further, though, and we'll find some useful things to run.

The autocomplete script should dump out if you run type _init_completion. If so, run:

echo 'source <(kubectl completion bash)' >>~/.bashrc
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl > /dev/null
sudo chmod a+r /etc/bash_completion.d/kubectl

Then I installed the kubectl-convert plugin just for kicks.

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl-convert"

Validate

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl-convert.sha256"
echo "$(cat kubectl-convert.sha256)  kubectl-convert" | sha256sum --check


Install

sudo install -o root -g root -m 0755 kubectl-convert /usr/local/bin/kubectl-convert

Validate it installed:

kubectl convert --help


Clean up all the stuff curl pulled:

rm kubectl-convert kubectl-convert.sha256
rm kubectl kubectl.sha256

Install weird docker shit: https://mirantis.github.io/cri-dockerd/

Modify the wget below for the latest release from here: https://github.com/Mirantis/cri-dockerd/releases

wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.13/cri-dockerd-0.3.13.amd64.tgz
tar -xvf cri-dockerd-0.3.13.amd64.tgz
sudo mv cri-dockerd/cri-dockerd /usr/local/bin/
#Clean Up
rm -R cri-dockerd
rm cri-dockerd-0.3.13.amd64.tgz

Check it:

cri-dockerd --version


Now get it running:

wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.service
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.socket
sudo mv cri-docker.socket cri-docker.service /etc/systemd/system/
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
sudo systemctl daemon-reload
sudo systemctl enable cri-docker.service
sudo systemctl enable --now cri-docker.socket

Check

systemctl status cri-docker.socket


Get ready for Cluster!

Using this doc: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm

Don't worry about the required ports yet. Disable swap first:

Run lsblk and see swap.

Run free -h and see space for swap.

sudo nano /etc/fstab  

Comment out the line with swap in it:


reboot

Run lsblk and see no more swap.

Run free -h and see 0B for swap.
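For what it's worth, you can also turn swap off immediately without the reboot - the fstab edit is just what makes it stick:

sudo swapoff -a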

Install all this shit: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Get confused by cgroup drivers, hope all is well

Now onto this shit that is like the first shit but more confusing: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Install kubelet and kubeadm:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
sudo systemctl enable --now kubelet

More about cgroup drivers, just skip until we see if we need it...

Before we can init the cluster we need to install a pod network add-on.

Reference (general): https://k8s-docs.netlify.app/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Another: https://theitbros.com/set-up-kubernetes-on-proxmox/

It gets a bit confusing here. I think flannel will be needed, but that seems to come after the big one:

sudo kubeadm init

Something bad:

thaynes@kubevip:~$ sudo kubeadm init
Found multiple CRI endpoints on the host. Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/cri-dockerd.sock
To see the stack trace of this error execute with --v=5 or higher

Try 2

sudo kubeadm init --cri-socket /var/run/cri-dockerd.sock

VICTORY

But I MISSED --pod-network-cidr=10.244.0.0/16, which flannel expects.

FIX:

See nothing from kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}'? Do this:

Edit /etc/kubernetes/manifests/kube-controller-manager.yaml and, under the command: list, add --allocate-node-cidrs=true and --cluster-cidr=10.244.0.0/16 (sketched below).
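For reference, the relevant chunk of the manifest ends up looking roughly like this (the flags kubeadm already generated are trimmed here and stay as-is):

spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --cluster-cidr=10.244.0.0/16
    # ...existing flags unchanged...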

Then restart the kubelet so the change takes:

sudo systemctl restart kubelet

Verify all good:

sudo systemctl status kubelet

But back to the action:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join REDACTED

Guess it's flannel time!

https://github.com/flannel-io/flannel

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

That worked; now add the nodes. Just use the join command kubeadm init spits out, but append:

--cri-socket /var/run/cri-dockerd.sock
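So the full join command ends up shaped like this - the endpoint, token, and hash are placeholders for the values your kubeadm init printed:

kubeadm join <control-plane-ip>:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash> \
        --cri-socket /var/run/cri-dockerd.sock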

After some flannel troubleshooting we got healthy pods!

First off is the official dashboard:

https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
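One assumption the commands below make: helm is already installed, which nothing above did. If it's missing, the install script from the helm docs is the quick path:

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh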

# Add kubernetes-dashboard repository
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
# Deploy a Helm Release named "kubernetes-dashboard" using the kubernetes-dashboard chart
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard

It dumps out the goods:

thaynes@kubevip:~$ helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard
Release "kubernetes-dashboard" has been upgraded. Happy Helming!
NAME: kubernetes-dashboard
LAST DEPLOYED: Wed May  8 22:25:51 2024
NAMESPACE: kubernetes-dashboard
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
*************************************************************************************************
*** PLEASE BE PATIENT: Kubernetes Dashboard may need a few minutes to get up and become ready ***
*************************************************************************************************

Congratulations! You have just installed Kubernetes Dashboard in your cluster.

To access Dashboard run:
  kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443

NOTE: In case port-forward command does not work, make sure that kong service name is correct.
      Check the services in Kubernetes Dashboard namespace using:
        kubectl -n kubernetes-dashboard get svc

Dashboard will be available at:
  https://localhost:8443

First, forward the port:

kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443

Then run:

kubectl -n kubernetes-dashboard get svc

It wants a token:


Grab one for the default account in the dashboard namespace:

kubectl -n kubernetes-dashboard create token default

And we're in!


But nothing good - make a user!

https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md

You need a file for each code block in that doc, applied with kubectl apply -f. The blocks are reproduced below for reference.
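These are the two core blocks from the linked doc - save each to its own file (or both in one file separated by ---) and apply:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

(The doc also has an optional Secret block for a long-lived token - that's the one that didn't work here.)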

The secret didn't work but this did:

kubectl -n kubernetes-dashboard create token admin-user

Will figure out the secret later. Got CPU and memory working by adding some args that are hard to find in the docs:

helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard --set=service.externalPort=8080,resources.limits.cpu=200m,metricsScraper.enabled=true

CEPH TIME

https://github.com/ceph/ceph-csi

Steps look alright: https://github.com/ceph/ceph-csi/blob/devel/docs/deploy-cephfs.md

Prereqs will take some time to figure out, though...

Your Kubernetes cluster must allow privileged pods (i.e. --allow-privileged flag must be set to true for both the API server and the kubelet). Moreover, as stated in the mount propagation docs, the Docker daemon of the cluster nodes must allow shared mounts.

OK THIS IS IT! https://devopstales.github.io/kubernetes/k8s-cephfs-storage-with-csi-driver/

  1. It says allow-privileged is now the default (after 1.1 release) so we're gonna go create the CephFS

We did something like that for ISOs here https://gist.github.com/scyto/941b24efd1ac0bf9b3cd30c3fb1e5341

Add a new CephFS from a node


Name it and add as storage:


You should see it show up under each node.

Now let's see if we can use it.

But first, some cool tools!

https://github.com/ahmetb/kubectx

sudo apt install kubectx

Then install the Ceph client:

sudo apt-get install ceph

1 - give access to the public Ceph network to the VM. Try jumbo frames if you use them.
2 - create ceph.conf and an admin key file on the VM.
3 - try to see ceph -s
4 - mount RBD or CephFS inside the VM

OK, step 1:

I made a bond with 'em all and added it to the bridge. Rock and roll.

And it's fucked.

Quick reload:

ifreload -a
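For steps 2 and 3 of the list above, a minimal sketch - pve01 and the keyring path are assumptions based on a stock Proxmox Ceph install:

# on the VM
sudo mkdir -p /etc/ceph
# pull the config and admin keyring from any cluster node
sudo scp root@pve01:/etc/ceph/ceph.conf /etc/ceph/ceph.conf
sudo scp root@pve01:/etc/pve/priv/ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring
# step 3: can we see the cluster?
sudo ceph -s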

Once the four steps above are done, move on to this:

https://devopstales.github.io/kubernetes/k8s-cephfs-storage-with-csi-driver/

On flipping Ceph to separate private/public networks: https://www.reddit.com/r/Proxmox/comments/p7s8ne/change_ceph_network/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment