@alexeldeib
Last active March 28, 2019 06:16
AKS Engine ContainerOS Debug

ContainerOS AKS Engine

Goals

  • Working example API model
  • Enable masters to form etcd cluster
  • Enable masters to form k8s cluster
  • Enable agents to join k8s cluster

Notes

Generated a new cluster using the following command and API model:

.\_dist\aks-engine-c6e585d-windows-amd64\aks-engine.exe deploy -m .\kubernetes-coreos.json --auth-method cli -l westus2 -g coreos-k8s -s SUB --debug

API Model

{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorRelease": "1.13",
      "orchestratorType": "Kubernetes",
      "kubernetesConfig": {
        "networkPlugin": "azure"
      }
    },
    "masterProfile": {
      "count": 3,
      "dnsPrefix": "coreos-k8s",
      "vmSize": "Standard_D4s_v3",
      "distro": "coreos"
    },
    "agentPoolProfiles": [
      {
        "name": "agentpool1",
        "count": 3,
        "vmSize": "Standard_E4_v3",
        "availabilityProfile": "VirtualMachineScaleSets",
        "storageProfile": "ManagedDisks",
        "distro": "coreos"
      }
    ],
    "linuxProfile": {
      "adminUsername": "ace",
      "ssh": {
        "publicKeys": [
          {
            "keyData": ""
          }
        ]
      }
    },
    "servicePrincipalProfile": {
      "clientId": "",
      "secret": ""
    }
  }
}

Output

DEBU[0000] Resolving tenantID for subscriptionID: REMOVED
DEBU[0007] Already registered for "Microsoft.Compute"
DEBU[0007] Already registered for "Microsoft.Storage"
DEBU[0007] Already registered for "Microsoft.Network"
DEBU[0015] pki: PKI asset creation took 4.5979s
DEBU[0015] output: wrote _output/coreos-k8s/apimodel.json
DEBU[0015] output: wrote _output/coreos-k8s/azuredeploy.json
DEBU[0015] output: wrote _output/coreos-k8s/azuredeploy.parameters.json
DEBU[0015] output: wrote _output/coreos-k8s/kubeconfig/kubeconfig.westus2.json
DEBU[0015] output: wrote _output/coreos-k8s/ca.key
DEBU[0015] output: wrote _output/coreos-k8s/ca.crt
DEBU[0015] output: wrote _output/coreos-k8s/apiserver.key
DEBU[0015] output: wrote _output/coreos-k8s/apiserver.crt
DEBU[0015] output: wrote _output/coreos-k8s/client.key
DEBU[0015] output: wrote _output/coreos-k8s/client.crt
DEBU[0015] output: wrote _output/coreos-k8s/kubectlClient.key
DEBU[0015] output: wrote _output/coreos-k8s/kubectlClient.crt
DEBU[0015] output: wrote _output/coreos-k8s/etcdserver.key
DEBU[0015] output: wrote _output/coreos-k8s/etcdserver.crt
DEBU[0015] output: wrote _output/coreos-k8s/etcdclient.key
DEBU[0015] output: wrote _output/coreos-k8s/etcdclient.crt
DEBU[0015] output: wrote _output/coreos-k8s/etcdpeer0.key
DEBU[0015] output: wrote _output/coreos-k8s/etcdpeer0.crt
DEBU[0015] output: wrote _output/coreos-k8s/etcdpeer1.key
DEBU[0015] output: wrote _output/coreos-k8s/etcdpeer1.crt
DEBU[0015] output: wrote _output/coreos-k8s/etcdpeer2.key
DEBU[0016] output: wrote _output/coreos-k8s/etcdpeer2.crt
INFO[0016] Starting ARM Deployment (coreos-k8s-2102500969). This will take some time...
INFO[0245] Finished ARM Deployment (coreos-k8s-2102500969). Error: Code="DeploymentFailed" Message="At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details." Details=[{"code":"Conflict","message":"{\r\n  \"status\": \"Failed\",\r\n  \"error\": {\r\n    \"code\": \"ResourceDeploymentFailure\",\r\n    \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n    \"details\": [\r\n      {\r\n        \"code\": \"VMExtensionProvisioningError\",\r\n        \"message\": \"VM has reported a failure when processing extension 'cse-master-1'. Error message: \\\"Enable failed: failed to execute command: command terminated with exit status=50\\n[stdout]\\n\\n[stderr]\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run 
command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or 
directory\\ntimeout: failed to run command 'nc': No such file or directory\\n\\\".\"\r\n      }\r\n    ]\r\n  }\r\n}"},{"code":"Conflict","message":"{\r\n  \"status\": \"Failed\",\r\n  \"error\": {\r\n    \"code\": \"ResourceDeploymentFailure\",\r\n    \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n    \"details\": [\r\n      {\r\n        \"code\": \"VMExtensionProvisioningError\",\r\n        \"message\": \"VM has reported a failure when processing extension 'cse-master-0'. Error message: \\\"Enable failed: failed to execute command: command terminated with exit status=50\\n[stdout]\\n\\n[stderr]\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No 
such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\n\\\".\"\r\n      }\r\n    ]\r\n  }\r\n}"},{"code":"Conflict","message":"{\r\n  
\"status\": \"Failed\",\r\n  \"error\": {\r\n    \"code\": \"ResourceDeploymentFailure\",\r\n    \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n    \"details\": [\r\n      {\r\n        \"code\": \"VMExtensionProvisioningError\",\r\n        \"message\": \"VM has reported a failure when processing extension 'cse-master-2'. Error message: \\\"Enable failed: failed to execute command: command terminated with exit status=50\\n[stdout]\\n\\n[stderr]\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or 
directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\n\\\".\"\r\n      }\r\n    ]\r\n  }\r\n}"},{"code":"Conflict","message":"{\r\n  \"status\": \"Failed\",\r\n  \"error\": {\r\n    \"code\": \"ResourceDeploymentFailure\",\r\n    \"message\": \"The resource operation completed with 
terminal provisioning state 'Failed'.\",\r\n    \"details\": [\r\n      {\r\n        \"code\": \"VMExtensionProvisioningError\",\r\n        \"message\": \"VM has reported a failure when processing extension 'vmssCSE'. Error message: \\\"Enable failed: failed to execute command: command terminated with exit status=50\\n[stdout]\\n\\n[stderr]\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run 
command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No such file or directory\\ntimeout: failed to run command 'nc': No su

Initial findings

  • nc is probably not installed on the machine: every extension failure above reports "timeout: failed to run command 'nc': No such file or directory". It may need to be installed in the base image.
  • What's calling nc?
  • Check system health
    • systemctl list-units | grep failed
    • One failed unit: oem-cloudinit (customdata). Its journal output:
-- Logs begin at Wed 2019-03-27 03:48:26 UTC, end at Wed 2019-03-27 04:09:00 UTC. --
Mar 27 03:48:43 localhost systemd[1]: Starting Cloudinit from Azure metadata...
Mar 27 03:48:44 localhost coreos-cloudinit[834]: 2019/03/27 03:48:44 Checking availability of "waagent"
Mar 27 03:48:44 localhost coreos-cloudinit[834]: 2019/03/27 03:48:44 Checking availability of "waagent"
Mar 27 03:48:45 localhost coreos-cloudinit[834]: 2019/03/27 03:48:45 Checking availability of "waagent"
Mar 27 03:48:45 localhost coreos-cloudinit[834]: 2019/03/27 03:48:45 Checking availability of "waagent"
Mar 27 03:48:46 localhost coreos-cloudinit[834]: 2019/03/27 03:48:46 Checking availability of "waagent"
Mar 27 03:48:47 localhost coreos-cloudinit[834]: 2019/03/27 03:48:47 Checking availability of "waagent"
Mar 27 03:48:51 localhost coreos-cloudinit[834]: 2019/03/27 03:48:51 Checking availability of "waagent"
Mar 27 03:48:57 localhost coreos-cloudinit[834]: 2019/03/27 03:48:57 Checking availability of "waagent"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Checking availability of "waagent"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Fetching user-data from datasource of type "waagent"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Attempting to read from "/var/lib/waagent/CustomData"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 line 50: error: file cannot be written to a read-only filesystem
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Fetching meta-data from datasource of type "waagent"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Attempting to read from "/var/lib/waagent/SharedConfig.xml"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Parsing user-data as cloud-config
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Merging cloud-config from meta-data and user-data
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Writing file to "/opt/azure/containers/provision_source.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file to "/opt/azure/containers/provision_source.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file /opt/azure/containers/provision_source.sh to filesystem
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Writing file to "/opt/azure/containers/provision.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file to "/opt/azure/containers/provision.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file /opt/azure/containers/provision.sh to filesystem
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Writing file to "/opt/azure/containers/provision_installs.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file to "/opt/azure/containers/provision_installs.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file /opt/azure/containers/provision_installs.sh to filesystem
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Writing file to "/opt/azure/containers/provision_configs.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file to "/opt/azure/containers/provision_configs.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file /opt/azure/containers/provision_configs.sh to filesystem
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Writing file to "/etc/ssh/sshd_config"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file to "/etc/ssh/sshd_config"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file /etc/ssh/sshd_config to filesystem
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Writing file to "/etc/systemd/system.conf"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file to "/etc/systemd/system.conf"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Wrote file /etc/systemd/system.conf to filesystem
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Writing file to "/usr/local/bin/health-monitor.sh"
Mar 27 03:49:10 k8s-master-86009847-0 coreos-cloudinit[834]: 2019/03/27 03:49:10 Failed to apply cloud-config: open /usr/local/bin/cloudinit-temp114659945: read-only file system
Mar 27 03:49:10 k8s-master-86009847-0 systemd[1]: oem-cloudinit.service: Main process exited, code=exited, status=1/FAILURE
Mar 27 03:49:10 k8s-master-86009847-0 systemd[1]: oem-cloudinit.service: Failed with result 'exit-code'.
Mar 27 03:49:10 k8s-master-86009847-0 systemd[1]: Failed to start Cloudinit from Azure metadata.
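
The health check above can be scripted so the failed unit names come out on their own. This is a minimal sketch, assuming systemctl is run with --plain --no-legend so that each line starts with the unit name; failed_units is a hypothetical helper name and is not part of aks-engine.

```shell
# Hypothetical helper: read `systemctl list-units --plain --no-legend` output
# on stdin and print the names of units whose ACTIVE column is "failed".
failed_units() {
  awk '$3 == "failed" { print $1 }'
}

# Example (on a live node):
#   systemctl list-units --all --plain --no-legend | failed_units
#   journalctl -u oem-cloudinit.service --no-pager
```

Filtering on the ACTIVE column rather than grepping for the word "failed" avoids matching units that merely mention it in their description.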

oem-cloudinit.service

ncat vs. nc
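
The notes do not record what was changed here. One possible approach, assuming the image ships ncat but not nc, is to resolve whichever binary exists before calling it. The sketch below is illustrative only; pick_netcat is a hypothetical helper name, not something from the aks-engine scripts.

```shell
# Hypothetical helper: print the first candidate command found on PATH.
# Returns 1 (and prints nothing) if none of the candidates exist.
pick_netcat() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example: NC_BIN=$(pick_netcat nc ncat) || echo "no netcat variant found" >&2
```

A caller would then run "$NC_BIN" under timeout with the same arguments the original script passed to nc; whether ncat accepts those arguments unchanged would need to be checked.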

Second attempt

I recreated the cluster with the same API model and command after making the ncat and /usr/local/bin/health-monitor.sh changes. This time cloud-init wrote everything out successfully, but cluster-provision.log contained many failures.

etcd location

  • Etcd was being extracted (tar -xzvf) to /usr/bin/. Moved it to /opt/bin/ on cos.
  • Updated the path in /etc/systemd/system/etcd.service to reflect the change (kubernetesmastercustomdata.yml).
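
The unit-file update can be expressed as a small text rewrite. This is a sketch under assumptions: the gist does not show the contents of etcd.service, so the ExecStart line in the example is made up, and retarget_etcd_unit is a hypothetical helper name.

```shell
# Hypothetical helper: rewrite the etcd binary path in unit-file text read
# from stdin. Because it matches the prefix /usr/bin/etcd, references to
# /usr/bin/etcdctl are rewritten to /opt/bin/etcdctl as well.
retarget_etcd_unit() {
  sed 's#/usr/bin/etcd#/opt/bin/etcd#g'
}

# Example with an assumed unit line:
#   printf 'ExecStart=/usr/bin/etcd --name master-0\n' | retarget_etcd_unit
```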

hostname

  • hostname -I doesn't work on cos; use hostname -i instead.

curl timeouts

  • On cos, the provisioning step still tries to set up apt and curls some files to configure it, and those requests time out. Disabled this step.

Don't install things with apt

  • Disabled installing the container runtime (which uses apt) when running on cos.
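
Both this change and the apt setup change above come down to skipping apt-based steps on cos. The notes do not show how the distro was detected; the sketch below assumes the node reports ID=coreos in /etc/os-release, and os_release_id and skip_apt_steps are hypothetical helper names.

```shell
# Print the ID field from an os-release style file (default /etc/os-release).
# The file is sourced in a subshell so its variables do not leak out.
os_release_id() {
  ( . "${1:-/etc/os-release}" && printf '%s\n' "$ID" )
}

# Return 0 when apt-based provisioning steps should be skipped.
# Assumption: the distro identifies itself with ID=coreos.
skip_apt_steps() {
  [ "$(os_release_id "$1")" = "coreos" ]
}

# Example: skip_apt_steps || echo "apt-based install steps would run here"
```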

Switch PRIVATE_IP from IPv6 to IPv4

  • hostname -i returns an IPv6 address, which etcdctl does not recognize for member updates.
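
One way to get an IPv4 value for PRIVATE_IP is to filter the hostname output. This is a sketch, not the change that was actually made; first_ipv4 is a hypothetical helper name, and it assumes the addresses arrive space-separated on stdin.

```shell
# Hypothetical helper: read space-separated addresses on stdin and print
# the first one that looks like a dotted-quad IPv4 address.
first_ipv4() {
  tr ' ' '\n' | grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' | head -n 1
}

# Example (on a live node):
#   PRIVATE_IP=$(hostname -i | first_ipv4)
```

The pattern only checks the shape of the address, not that each octet is in range, which is enough to separate IPv4 from IPv6 entries.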

Don't wait on the docker shared mounts file

  • Apparently an edge case with the cos docker build; didn't look too deep: moby/moby#31615
  • This also affects the rpc-statd service, which will not enable or start under coreos.

Add Requires=rpc-statd.service to kubelet.service

  • This fixes the issue with rpc-statd not enabling/starting successfully.
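
The same dependency can also be expressed as a systemd drop-in rather than an edit to kubelet.service itself. This is an alternative sketch, not what the notes describe doing; write_kubelet_dropin and the file name 10-rpc-statd.conf are hypothetical.

```shell
# Hypothetical helper: write a drop-in under the given systemd directory
# (default /etc/systemd/system) that adds Requires=rpc-statd.service
# to kubelet.service without modifying the unit file.
write_kubelet_dropin() {
  dropin_dir="${1:-/etc/systemd/system}/kubelet.service.d"
  mkdir -p "$dropin_dir" || return 1
  printf '[Unit]\nRequires=rpc-statd.service\n' > "$dropin_dir/10-rpc-statd.conf"
}

# Afterwards, on a live node:
#   systemctl daemon-reload && systemctl restart kubelet
```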