This procedure works around a stuck upgrade in which pods are failing because the whereabouts reconciler fails.
The symptom is a whereabouts reconciler pod in a CrashLoopBackOff state with an error message like:
[error] could not create the pod networks controller: Could not find node with node name ''.: resource name may not be empty
This is likely caused by a missing environment variable, as fixed in: openshift/cluster-network-operator#1829
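To confirm the symptom before starting, the reconciler pods can be listed with a small helper. This is a minimal sketch: the function name is hypothetical, and it assumes the reconciler pods run in the openshift-multus namespace with names starting with whereabouts-reconciler. Adjust both if your cluster differs.

```shell
# Hypothetical helper: list whereabouts reconciler pods and their status.
# Assumption: the pods live in the openshift-multus namespace and their
# names contain "whereabouts-reconciler".
show_reconciler_pods() {
  oc get pods -n openshift-multus -o wide | grep whereabouts-reconciler
}

# Usage (against a live cluster):
#   show_reconciler_pods
```

A pod shown in CrashLoopBackOff here, together with the error above in its logs, suggests this procedure applies.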
The gist of this procedure is to:
- Disable the whereabouts reconciler temporarily by removing the net-attach-defs that reference whereabouts from the cluster-network-operator (CNO) Networks object.
- Recreate those net-attach-defs by hand (so they're still usable).
In brief, the whereabouts reconciler is "opt-in": you opt in to using it by having net-attach-defs in the Networks object that reference whereabouts.
If those are removed, the reconciler will not launch, and therefore, the upgrade should succeed.
We follow up on this by creating the net-attach-defs by hand (as opposed to through the CNO Networks object), which keeps us in an "opt-out" state while the net-attach-defs remain usable.
NOTE: THIS NEEDS AN UPDATE TO RE-ENABLE THE RECONCILER, which is still a to-do
First edit the networks object with:
oc edit networks.operator.openshift.io cluster
I recommend that you save a complete copy of this object before changing it. At a minimum, save the additionalNetworks section.
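One way to keep that copy is to dump the object to a file before editing. This is a minimal sketch: the function name and the default file name networks-cluster-backup.yaml are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: write the full cluster Networks object to a file.
# $1 = backup file name (the default below is an assumption).
backup_networks_object() {
  backup_file="${1:-networks-cluster-backup.yaml}"
  oc get networks.operator.openshift.io cluster -o yaml > "$backup_file"
}

# Usage (against a live cluster):
#   backup_networks_object networks-cluster-backup.yaml
```

The saved file contains the whole object, including the additionalNetworks section you will need later.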
Look for the additionalNetworks section, like so:
spec:
  additionalNetworks:
  - name: ipvlan-dynamic
    namespace: example-item-ns
    rawCNIConfig: '{ "cniVersion": "0.3.1", "name": "ipvlan-dynamic", "type": "ipvlan",
      "mode": "l2", "master": "bond1", "ipam": { "type": "whereabouts", "range": "[removed]::/64",
      "range_start": "[removed]:4438:39ff:feff:1101", "range_end": "[removed]:4438:39ff:feff:2125"
      } }'
    type: Raw
  - name: ipvlan-static
    namespace: example-item-ns
    rawCNIConfig: '{ "cniVersion": "0.3.1", "name": "ipvlan-static", "type": "ipvlan",
      "mode": "l2", "master": "bond1", "ipam": { "type": "static" } }'
    type: Raw
  - name: ipvlan-dynamic
    namespace: openshift-monitoring
    rawCNIConfig: '{ "cniVersion": "0.3.1", "name": "ipvlan-dynamic", "type": "ipvlan",
      "mode": "l2", "master": "bond1", "ipam": { "type": "whereabouts", "range": "[removed]::/64",
      "range_start": "[removed]:4438:39ff:feff:3101", "range_end": "[removed]:4438:39ff:feff:3125"
      } }'
    type: Raw
  - name: daemon-network
    namespace: example-item-ns
    rawCNIConfig: '{"cniVersion": "0.3.1","name": "daemon-network","type": "ipvlan","mode":
      "l2","master": "bond1.3201","mtu": 1500,"ipam": {"type": "static"}}'
    type: Raw
  - name: daemon-network
    namespace: example-scale
    rawCNIConfig: '{ "cniVersion": "0.3.1", "name": "daemon-network", "type": "ipvlan",
      "mode": "l2", "master": "bond1.3201", "ipam": { "type": "static" } }'
    type: Raw
IMPORTANT: SAVE THESE ITEMS BEFORE YOU REMOVE THEM
Remove any items from the additionalNetworks section that contain "type": "whereabouts". In this case that is two items, both named ipvlan-dynamic, in two namespaces (items #1 and #3).
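If the list is long, the entries that reference whereabouts can be picked out mechanically rather than by eye. The sketch below is an assumption-laden helper (the function name is hypothetical): it prints namespace/name for each matching entry, and it assumes each rawCNIConfig is a single line of JSON once parsed, as in the example above.

```shell
# Hypothetical helper: print namespace/name of every additionalNetworks
# entry whose rawCNIConfig uses the whereabouts IPAM type.
# Assumption: each rawCNIConfig value contains no literal newlines.
list_whereabouts_networks() {
  oc get networks.operator.openshift.io cluster \
    -o jsonpath='{range .spec.additionalNetworks[*]}{.namespace}{"/"}{.name}{"\t"}{.rawCNIConfig}{"\n"}{end}' \
    | grep '"type": *"whereabouts"' \
    | cut -f1
}

# Usage (against a live cluster):
#   list_whereabouts_networks
```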
This would leave you with a section that reads:
spec:
  additionalNetworks:
  - name: ipvlan-static
    namespace: example-item-ns
    rawCNIConfig: '{ "cniVersion": "0.3.1", "name": "ipvlan-static", "type": "ipvlan",
      "mode": "l2", "master": "bond1", "ipam": { "type": "static" } }'
    type: Raw
  - name: daemon-network
    namespace: example-item-ns
    rawCNIConfig: '{"cniVersion": "0.3.1","name": "daemon-network","type": "ipvlan","mode":
      "l2","master": "bond1.3201","mtu": 1500,"ipam": {"type": "static"}}'
    type: Raw
  - name: daemon-network
    namespace: example-scale
    rawCNIConfig: '{ "cniVersion": "0.3.1", "name": "daemon-network", "type": "ipvlan",
      "mode": "l2", "master": "bond1.3201", "ipam": { "type": "static" } }'
    type: Raw
Save the file and exit the editor to apply the change.
The CNO should now process this change and remove the reconciler pods.
You do not need to wait for the pods to be removed to continue.
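If you do want to check later whether the reconciler pods are gone, a small predicate is enough. This is a sketch only: the function name is hypothetical, and it assumes the reconciler pods run in the openshift-multus namespace with names containing whereabouts-reconciler.

```shell
# Hypothetical helper: succeeds (exit 0) when no reconciler pod remains.
# Assumption: namespace openshift-multus, pod names containing
# "whereabouts-reconciler".
reconciler_pods_gone() {
  ! oc get pods -n openshift-multus --no-headers 2>/dev/null \
    | grep -q whereabouts-reconciler
}

# Usage (against a live cluster):
#   reconciler_pods_gone && echo "reconciler pods removed"
```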
We will now transpose the removed items into net-attach-def (NetworkAttachmentDefinition) YAML. You need three parts from each item: the name, the namespace, and the rawCNIConfig JSON, which becomes spec.config.
The resulting YAML file will look like:
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ipvlan-dynamic
  namespace: example-item-ns
spec:
  config: '{ "cniVersion": "0.3.1", "name": "ipvlan-dynamic", "type": "ipvlan",
    "mode": "l2", "master": "bond1", "ipam": { "type": "whereabouts", "range": "[removed]::/64",
    "range_start": "[removed]:4438:39ff:feff:1101", "range_end": "[removed]:4438:39ff:feff:2125"
    } }'
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ipvlan-dynamic
  namespace: openshift-monitoring
spec:
  config: '{ "cniVersion": "0.3.1", "name": "ipvlan-dynamic", "type": "ipvlan",
    "mode": "l2", "master": "bond1", "ipam": { "type": "whereabouts", "range": "[removed]::/64",
    "range_start": "[removed]:4438:39ff:feff:3101", "range_end": "[removed]:4438:39ff:feff:3125"
    } }'
Save this as my-net-attach-defs.yml and then issue:
oc create -f my-net-attach-defs.yml
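When there are many entries to transpose, the three parts (name, namespace, rawCNIConfig JSON) can be turned into a manifest with a small helper instead of by hand. This is a minimal sketch, not part of the original procedure: the function name is hypothetical, and it assumes the JSON is on one line and contains no single quotes, since it is placed inside a single-quoted YAML string.

```shell
# Hypothetical helper: print one NetworkAttachmentDefinition document.
# $1 = name, $2 = namespace, $3 = rawCNIConfig JSON
# Assumption: $3 is single-line JSON without single quotes.
render_net_attach_def() {
  printf '%s\n' '---'
  printf '%s\n' 'apiVersion: "k8s.cni.cncf.io/v1"'
  printf '%s\n' 'kind: NetworkAttachmentDefinition'
  printf '%s\n' 'metadata:'
  printf '  name: %s\n' "$1"
  printf '  namespace: %s\n' "$2"
  printf '%s\n' 'spec:'
  printf "  config: '%s'\n" "$3"
}

# Usage: append one document per removed entry, then create them.
#   render_net_attach_def ipvlan-dynamic example-item-ns '{ ...saved rawCNIConfig... }' >> my-net-attach-defs.yml
```

Review the generated file against the entries you saved before running oc create on it.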
Re-enabling the reconciler is still a to-do. The process above should get an upgrade "un-stuck" as it is.