@detiber
Last active February 12, 2024 00:31
CAPT/Tink/image-builder

Tinkerbell w/ image-builder on Vagrant (WIP):

Prerequisites

  • Currently runs only on Linux with Vagrant/libvirt
  • tilt
  • kind
  • check out image-builder to...
  • check out sandbox to...
  • check out cluster-api to ...
  • check out cluster-api-provider-tink to ...
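Before starting, it can save time to confirm the required tooling is on `PATH`. A small sketch; the exact tool list is an assumption drawn from the steps below:

```shell
# Sketch: report any tools from the given list that are not on PATH.
# The tool list passed in is an assumption based on the steps in this doc.
check_tools() {
  local missing=0
  for tool in "$@"; do
    if ! command -v "$tool" > /dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_tools vagrant tilt kind kubectl docker
```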

Set up Tinkerbell

using image-builder (kubernetes-sigs/image-builder#547):

cd <image-builder>/images/capi
make build-raw-all
# SANDBOX_DIR is the path of the sandbox checkout
cp output/ubuntu-1804-kube-v1.18.15.gz $SANDBOX_DIR/deploy/state/webroot/

using sandbox (https://github.com/tinkerbell/sandbox):

export TINKERBELL_CONFIGURE_NAT=false
export TINKERBELL_NUM_WORKERS=3
vagrant up provisioner --no-destroy-on-error
vagrant ssh provisioner

# workaround for https://github.com/tinkerbell/sandbox/issues/62
sudo curl -L "https://github.com/docker/compose/releases/download/1.26.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
exit

vagrant provision provisioner
vagrant ssh provisioner

Once SSH'd into the provisioner:

cd /vagrant && source .env && cd deploy
docker-compose up -d

# Hack to work around limitations of current Tinkerbell event system (postgres triggers)
PGPASSWORD=tinkerbell docker exec deploy_db_1 psql -U tinkerbell -c 'drop trigger events_channel ON events;'

# TODO: add 169.254.169.254 link-local address to provisioner machine
# TODO: figure out how we can incorporate this into sandbox
# TODO: will this cause issues in EM deployments?
# edit the netplan config for eth1 (/etc/netplan/eth1.yaml)
# add 169.254.169.254/16 to the addresses list
# then run: netplan apply
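The netplan TODO above can be sketched as a script. The file path under `/etc/netplan` and the 192.168.1.1/24 primary address are assumptions about the sandbox defaults, not verified values:

```shell
# Write a netplan config for eth1 that includes the link-local metadata
# address. The path and the primary address are assumptions for illustration.
write_netplan() {
  cat > "$1" <<'YAML'
network:
  version: 2
  ethernets:
    eth1:
      addresses:
        - 192.168.1.1/24
        - 169.254.169.254/16
YAML
}

# On the provisioner (as root):
#   write_netplan /etc/netplan/eth1.yaml && netplan apply
```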

# set up hook as a replacement for OSIE (https://github.com/tinkerbell/hook#the-manual-way)
pushd /vagrant/deploy/state/webroot/misc/osie
mv current current-bak
mkdir current
wget http://s.gianarb.it/tinkie/tinkie-master.tar.gz
tar xzv -C ./current -f tinkie-master.tar.gz
popd

# TODO: follow up on not needing to pull/tag/push images to internal registry for actions
# TODO: requires changes to tink-worker to avoid internal registry use
docker pull quay.io/tinkerbell-actions/image2disk:v1.0.0
docker tag quay.io/tinkerbell-actions/image2disk:v1.0.0 192.168.1.1/image2disk:v1.0.0
docker push 192.168.1.1/image2disk:v1.0.0
docker pull quay.io/tinkerbell-actions/writefile:v1.0.0
docker tag quay.io/tinkerbell-actions/writefile:v1.0.0 192.168.1.1/writefile:v1.0.0
docker push 192.168.1.1/writefile:v1.0.0
docker pull quay.io/tinkerbell-actions/kexec:v1.0.0
docker tag quay.io/tinkerbell-actions/kexec:v1.0.0 192.168.1.1/kexec:v1.0.0
docker push 192.168.1.1/kexec:v1.0.0
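The repeated pull/tag/push sequence above can be collapsed into a small helper. A sketch; the registry address and image versions are taken from the commands above:

```shell
# Mirror one Tinkerbell action image into the internal registry.
# Registry address and versions match the commands above.
REGISTRY="192.168.1.1"
mirror_action() {
  local name="$1" version="$2"
  docker pull "quay.io/tinkerbell-actions/${name}:${version}"
  docker tag "quay.io/tinkerbell-actions/${name}:${version}" "${REGISTRY}/${name}:${version}"
  docker push "${REGISTRY}/${name}:${version}"
}

# Usage:
#   for a in image2disk writefile kexec; do mirror_action "$a" v1.0.0; done
```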

# TODO: investigate hegel metadata not returning proper values for 2009-04-04/meta-data/{public,local}-ipv{4,6}, currently trying to return values from hw.metadata.instance.network.addresses[] instead of hw.network.interfaces[]
# TODO: should hegel (or tink) automatically populate fields from root sources, for example metadata.instance.id from id
#       public/local ip addresses from network.addresses, etc?
# TODO: automatic hardware detection to avoid needing to manually populate metadata.instance.storage.disks[].device
cat > hardware-data-worker-1.json <<EOF
{
  "id": "ce2e62ed-826f-4485-a39f-a82bb74338e2",
  "metadata": {
    "facility": {
      "facility_code": "onprem"
    },
    "instance": {
      "id": "ce2e62ed-826f-4485-a39f-a82bb74338e2",
      "hostname": "test-instance1",
      "storage": {
        "disks": [{"device": "/dev/vda"}]
      }
    },
    "state": ""
  },
  "network": {
    "interfaces": [
      {
        "dhcp": {
          "arch": "x86_64",
          "ip": {
            "address": "192.168.1.5",
            "gateway": "192.168.1.1",
            "netmask": "255.255.255.248"
          },
          "mac": "08:00:27:00:00:01",
          "uefi": false
        },
        "netboot": {
          "allow_pxe": true,
          "allow_workflow": true
        }
      }
    ]
  }
}
EOF
docker exec -i deploy_tink-cli_1 tink hardware push < ./hardware-data-worker-1.json

cat > hardware-data-worker-2.json <<EOF
{
  "id": "ce2e62ed-826f-4485-a39f-a82bb74338e3",
  "metadata": {
    "facility": {
      "facility_code": "onprem"
    },
    "instance": {
      "id": "ce2e62ed-826f-4485-a39f-a82bb74338e3",
      "hostname": "test-instance2",
      "storage": {
        "disks": [{"device": "/dev/vda"}]
      }
    },
    "state": ""
  },
  "network": {
    "interfaces": [
      {
        "dhcp": {
          "arch": "x86_64",
          "ip": {
            "address": "192.168.1.4",
            "gateway": "192.168.1.1",
            "netmask": "255.255.255.248"
          },
          "mac": "08:00:27:00:00:02",
          "uefi": false
        },
        "netboot": {
          "allow_pxe": true,
          "allow_workflow": true
        }
      }
    ]
  }
}
EOF
docker exec -i deploy_tink-cli_1 tink hardware push < ./hardware-data-worker-2.json

cat > hardware-data-worker-3.json <<EOF
{
  "id": "ce2e62ed-826f-4485-a39f-a82bb74338e4",
  "metadata": {
    "facility": {
      "facility_code": "onprem"
    },
    "instance": {
      "id": "ce2e62ed-826f-4485-a39f-a82bb74338e4",
      "hostname": "test-instance3",
      "storage": {
        "disks": [{"device": "/dev/vda"}]
      }
    },
    "state": ""
  },
  "network": {
    "interfaces": [
      {
        "dhcp": {
          "arch": "x86_64",
          "ip": {
            "address": "192.168.1.3",
            "gateway": "192.168.1.1",
            "netmask": "255.255.255.248"
          },
          "mac": "08:00:27:00:00:03",
          "uefi": false
        },
        "netboot": {
          "allow_pxe": true,
          "allow_workflow": true
        }
      }
    ]
  }
}
EOF
docker exec -i deploy_tink-cli_1 tink hardware push < ./hardware-data-worker-3.json
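The three hardware definitions differ only in id, hostname, IP, and MAC, so they can also be generated in a loop. A sketch reproducing the same values as the heredocs above:

```shell
# Generate the three near-identical hardware definitions from the values
# above: id suffix e2..e4, IPs .5 down to .3, MACs :01..:03.
for i in 1 2 3; do
  id="ce2e62ed-826f-4485-a39f-a82bb74338e$((i + 1))"
  ip="192.168.1.$((6 - i))"
  mac="$(printf '08:00:27:00:00:%02d' "$i")"
  cat > "hardware-data-worker-${i}.json" <<EOF
{
  "id": "${id}",
  "metadata": {
    "facility": {"facility_code": "onprem"},
    "instance": {
      "id": "${id}",
      "hostname": "test-instance${i}",
      "storage": {"disks": [{"device": "/dev/vda"}]}
    },
    "state": ""
  },
  "network": {
    "interfaces": [{
      "dhcp": {
        "arch": "x86_64",
        "ip": {"address": "${ip}", "gateway": "192.168.1.1", "netmask": "255.255.255.248"},
        "mac": "${mac}",
        "uefi": false
      },
      "netboot": {"allow_pxe": true, "allow_workflow": true}
    }]
  }
}
EOF
done

# Then push each one:
#   for i in 1 2 3; do
#     docker exec -i deploy_tink-cli_1 tink hardware push < "hardware-data-worker-${i}.json"
#   done
```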

Set up Cluster API and the Tinkerbell Provider

Get the tilt environment configured and running:

git clone https://github.com/detiber/cluster-api-provider-tink -b withImages
git clone https://github.com/kubernetes-sigs/cluster-api -b release-0.3
cd cluster-api
cat > tilt-settings.json <<EOF
{
    "default_registry": "gcr.io/detiber",
    "provider_repos": ["../../tinkerbell/cluster-api-provider-tink"],
    "enable_providers": ["tinkerbell", "kubeadm-bootstrap", "kubeadm-control-plane"],
    "kustomize_substitutions": {
        "EXP_CLUSTER_RESOURCE_SET": "true",
        "TINKERBELL_GRPC_AUTHORITY": "192.168.1.1:42113",
        "TINKERBELL_CERT_URL": "http://192.168.1.1:42114/cert",
        "TINKERBELL_IP": "192.168.1.1"
    }
}
EOF

kind create cluster
tilt up
cat > testhardware.yml <<EOF
---
kind: Hardware
apiVersion: tinkerbell.org/v1alpha1
metadata:
  name: test-hw1
spec:
  id: ce2e62ed-826f-4485-a39f-a82bb74338e2
---
kind: Hardware
apiVersion: tinkerbell.org/v1alpha1
metadata:
  name: test-hw2
spec:
  id: ce2e62ed-826f-4485-a39f-a82bb74338e3
---
kind: Hardware
apiVersion: tinkerbell.org/v1alpha1
metadata:
  name: test-hw3
spec:
  id: ce2e62ed-826f-4485-a39f-a82bb74338e4
EOF

kubectl create -f testhardware.yml



cat > testcluster.yml <<EOF
---
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: capi-quickstart-control-plane
  namespace: default
spec:
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: TinkerbellMachineTemplate
    name: capi-quickstart-control-plane
  kubeadmConfigSpec:
    users:
      - name: tink
        shell: /bin/bash
        groups: wheel
        lockPassword: false
        passwd: "$6$rounds=4096$eWIEHHK8BjHezE8F$cf2nkZF8Oy2ST4g6VMJTRU2c8yGbDMo6bZ1A6LEgZtlt0ec705o/0LJwkp8vKeczOrwdhc9wVd/igX0peTQhM1"
        sudo: "ALL=(ALL) NOPASSWD:ALL"
        sshAuthorizedKeys:
          - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC9Mh25GjJGqCQd+zAu49XoCaZvQCuNHZcFVPwREE1LgFlN6awfEScACMyTJLLGvkx8Jfe75uVgElhs4hMwgkHwOQ9fmF/rL7yqaNK1dnSfMFeLd2EO8flA6J9kXnNX+zXTzlz7tkqionZ+n103XHuEn0JHtYRr7LBwqf2P4itC4mQVOJzJZhdKPoIO9pVxiixBTYcgQ+oMiruBBHPNG2OyiB4kOA2y2qOrxuLuG2p5abU3A+hz/TnRU4EfxuETO/L3TH8U/KdO0rxapxAjf+QQeSJn9cjDR9pwe5I6aDVSspEscsXqjo1ctOcTgNimuHxYHSpgJLxiYCfEr1yVqQggcur2xu9/bCyhVZCKfqgew8KwsSD3bo5zrMPxXQ/Qm5Ee6jKDc9qv/i2+S2vDZShC5jxO6V4rpliwgZOaK3rRZDNzG1JuybYsnwNzpnobqaUZfYThrVjTxbc5JjDq9Ge2KE8U86j6Nq5nyNTxkLEtB9DGyHSMP6EVjcESSVPZrl0= detiber@loggerhead.local.detiberus.net
    clusterConfiguration: {}
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          provider-id: PROVIDER_ID
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          provider-id: PROVIDER_ID
  replicas: 1
  version: v1.18.15
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: TinkerbellMachineTemplate
metadata:
  name: capi-quickstart-control-plane
  namespace: default
spec:
  template:
    spec: {}
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: capi-quickstart
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 172.26.0.0/16
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: capi-quickstart-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: TinkerbellCluster
    name: capi-quickstart
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: TinkerbellCluster
metadata:
  name: capi-quickstart
  namespace: default
spec: {}
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capi-quickstart
    pool: worker-a
  name: capi-quickstart-worker-a
  namespace: default
spec:
  clusterName: capi-quickstart
  replicas: 0
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: capi-quickstart
      pool: worker-a
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: capi-quickstart
        pool: worker-a
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: capi-quickstart-worker-a
      clusterName: capi-quickstart
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: TinkerbellMachineTemplate
        name: capi-quickstart-worker-a
      version: v1.18.15
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: TinkerbellMachineTemplate
metadata:
  name: capi-quickstart-worker-a
  namespace: default
spec:
  template:
    spec: {}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfigTemplate
metadata:
  name: capi-quickstart-worker-a
  namespace: default
spec:
  template:
    spec:
      users:
        - name: tink
          groups: wheel
          lockPassword: false
          passwd: "$6$rounds=4096$eWIEHHK8BjHezE8F$cf2nkZF8Oy2ST4g6VMJTRU2c8yGbDMo6bZ1A6LEgZtlt0ec705o/0LJwkp8vKeczOrwdhc9wVd/igX0peTQhM1"
          sudo: "ALL=(ALL) NOPASSWD:ALL"
          sshAuthorizedKeys:
            - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC9Mh25GjJGqCQd+zAu49XoCaZvQCuNHZcFVPwREE1LgFlN6awfEScACMyTJLLGvkx8Jfe75uVgElhs4hMwgkHwOQ9fmF/rL7yqaNK1dnSfMFeLd2EO8flA6J9kXnNX+zXTzlz7tkqionZ+n103XHuEn0JHtYRr7LBwqf2P4itC4mQVOJzJZhdKPoIO9pVxiixBTYcgQ+oMiruBBHPNG2OyiB4kOA2y2qOrxuLuG2p5abU3A+hz/TnRU4EfxuETO/L3TH8U/KdO0rxapxAjf+QQeSJn9cjDR9pwe5I6aDVSspEscsXqjo1ctOcTgNimuHxYHSpgJLxiYCfEr1yVqQggcur2xu9/bCyhVZCKfqgew8KwsSD3bo5zrMPxXQ/Qm5Ee6jKDc9qv/i2+S2vDZShC5jxO6V4rpliwgZOaK3rRZDNzG1JuybYsnwNzpnobqaUZfYThrVjTxbc5JjDq9Ge2KE8U86j6Nq5nyNTxkLEtB9DGyHSMP6EVjcESSVPZrl0= detiber@loggerhead.local.detiberus.net
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            provider-id: PROVIDER_ID
EOF

kubectl create -f testcluster.yml
vagrant up worker worker1 worker2

# TODO: use in-repo template
# TODO: deploy CNI
#  - update cluster template to not use 192.168.0.0/16 for pod cidr
#  - use cilium by default to avoid having to modify calico default cidr?
# TODO: figure out why ubuntu-2004 doesn't work for kexec, but 18.04 does
# TODO: figure out why the cloud-init hostname is not working
# TODO: CAPT: update bootstrapping documentation
#  - block device handling
#  - building and making image-builder based image(s) available
# TODO: add a custom tinkerbell datasource for cloud-init, or add tinkerbell support to the ec2 datasource
# TODO: load balancing

#clusterctl config cluster capi-quickstart --from ./templates/cluster-template.yaml --kubernetes-version=v1.18.15 --control-plane-machine-count=1 --worker-machine-count=1 > test-cluster.yaml