This process uses openshift-operators as the namespace. To use an alternate namespace, replace openshift-operators in the commands below; it is also necessary to create an OperatorGroup before installing operators:
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: dwo-test
- Install the DWO release catalog and create a subscription to DWO at an earlier version, so that an upgrade can take place:
    cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: devworkspace-operator-catalog
      namespace: openshift-operators
    spec:
      sourceType: grpc
      image: quay.io/devfile/devworkspace-operator-index:release
      publisher: Red Hat
      displayName: DevWorkspace Operator Catalog
      updateStrategy:
        registryPoll:
          interval: 5m
    ---
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: devworkspace-operator
      namespace: openshift-operators
    spec:
      channel: fast
      name: devworkspace-operator
      source: devworkspace-operator-catalog
      sourceNamespace: openshift-operators
      installPlanApproval: Manual
      startingCSV: devworkspace-operator.v0.15.1
    EOF
- Install the next Che Operator:
    cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: che-next-operator-catalog
      namespace: openshift-operators
    spec:
      image: 'quay.io/eclipse/eclipse-che-openshift-opm-catalog:next'
      sourceType: grpc
      updateStrategy:
        registryPoll:
          interval: 5m
    ---
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: che-operator
      namespace: openshift-operators
    spec:
      channel: next
      installPlanApproval: Automatic
      name: eclipse-che-preview-openshift
      source: che-next-operator-catalog
      sourceNamespace: openshift-operators
      startingCSV: eclipse-che-preview-openshift.v7.52.0-639.next
    EOF
- Since installPlanApproval for DWO is "Manual", approve the initial install, e.g. via OperatorHub. Once the initial install is complete, both Operators should have updates available.
- Create instances of the CheCluster and DevWorkspace CRs, as those are impacted by conversion webhooks:
    oc new-project eclipse-che
    cat <<EOF | oc apply -f -
    apiVersion: org.eclipse.che/v2
    kind: CheCluster
    metadata:
      name: eclipse-che
      namespace: eclipse-che
    spec: {}
    EOF
    oc apply -f https://raw.githubusercontent.com/devfile/devworkspace-operator/main/samples/plain.yaml -n eclipse-che
- Break conversion webhooks for DevWorkspaces in a similar way to what happens when Che attempts to deploy DWO:
    oc patch crds devworkspaces.workspace.devfile.io --patch '
    metadata:
      annotations:
        service.beta.openshift.io/inject-cabundle: "true"
    spec:
      conversion:
        webhook:
          clientConfig:
            caBundle: Cg== # base64 for a single newline, i.e. an effectively empty bundle
    '
- Verify that conversion webhooks for DWO are broken:
    oc get devworkspaces.v1alpha1.workspace.devfile.io --all-namespaces
    Error from server: conversion webhook for workspace.devfile.io/v1alpha2, Kind=DevWorkspace failed: Post "https://devworkspace-controller-manager-service.openshift-operators.svc:443/convert?timeout=30s": x509: certificate signed by unknown authority
- Allow the update for DevWorkspace Operator in OperatorHub (this should attempt to install DWO v0.15.2 and Che eclipse-che-preview-openshift.v7.52.0-639.next). This update will get stuck.
Congratulations, your cluster is broken.
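The manual approvals mentioned in the steps above (the initial DWO install and the later update) can also be done from the command line instead of OperatorHub. The following is a minimal sketch, not part of the original procedure: the function names `pending_installplans` and `approve_installplan` are hypothetical, and it assumes the standard OLM behaviour that an InstallPlan is approved by setting `spec.approved` to `true`.

```shell
# Print the names of InstallPlans in a namespace that still await manual approval.
# The namespace defaults to openshift-operators, as used throughout this document.
pending_installplans() {
  oc get installplan -n "${1:-openshift-operators}" \
    -o jsonpath='{range .items[?(@.spec.approved==false)]}{.metadata.name}{"\n"}{end}'
}

# Approve a single InstallPlan by name (second argument: namespace, optional).
approve_installplan() {
  oc patch installplan "$1" -n "${2:-openshift-operators}" \
    --type merge --patch '{"spec":{"approved":true}}'
}
```

Run `pending_installplans` to see what is waiting, then pass one of the printed names to `approve_installplan`. Approving one plan at a time keeps the initial install and the later update as separate, deliberate actions.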
Note: If this issue is encountered naturally (i.e. Dev Spaces / Eclipse Che attempts to install DevWorkspaces from YAML), delete the devworkspace-controller namespace. In this case, it is also necessary to apply the process below to the devworkspacetemplates.workspace.devfile.io CRD.
- Remove the inject-cabundle annotation from the DevWorkspaces CRD:
    oc annotate crd devworkspaces.workspace.devfile.io service.beta.openshift.io/inject-cabundle-
  This is necessary because leaving it in place will overwrite the changes below.
- Restore caBundle in the CRD to match what is served by the DevWorkspace Operator install:
    # Get serving cert
    CA_BUNDLE=$(oc get secret -n openshift-operators devworkspace-controller-manager-service-cert -o json | jq -r '.data.olmCAKey')
    # Set correct caBundle and service reference
    oc patch crds devworkspaces.workspace.devfile.io --patch "
    spec:
      conversion:
        webhook:
          clientConfig:
            caBundle: ${CA_BUNDLE}
            service:
              name: devworkspace-controller-manager-service
              namespace: openshift-operators
              path: /convert
              port: 443
    "
- Verify that conversion webhooks for DWO are restored; the command below should not show an error:
    oc get devworkspaces.v1alpha1.workspace.devfile.io --all-namespaces
    W0804 16:45:34.843990 87674 warnings.go:70] workspace.devfile.io/v1alpha1 DevWorkspace is deprecated; use workspace.devfile.io/v1alpha2 DevWorkspace
    NAMESPACE     NAME                 WORKSPACE ID                PHASE     URL
    eclipse-che   plain-devworkspace   workspace9b4ca9ef26124f9d   Running
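As an extra check on the restore, the caBundle stored in the CRD can be compared directly with the value in the serving-cert secret. This is a rough sketch under the same assumption as the restore step, namely that the secret carries the bundle under the `olmCAKey` key; the function name `cabundle_matches_secret` is hypothetical.

```shell
# Succeeds when the caBundle in a CRD's conversion webhook config is non-empty
# and equal to the olmCAKey entry of the given serving-cert secret.
# Arguments: CRD name, secret name, secret namespace (optional).
cabundle_matches_secret() {
  crd="$1"; secret="$2"; ns="${3:-openshift-operators}"
  crd_bundle=$(oc get crd "$crd" \
    -o jsonpath='{.spec.conversion.webhook.clientConfig.caBundle}')
  secret_bundle=$(oc get secret "$secret" -n "$ns" -o jsonpath='{.data.olmCAKey}')
  [ -n "$crd_bundle" ] && [ "$crd_bundle" = "$secret_bundle" ]
}
```

For example, `cabundle_matches_secret devworkspaces.workspace.devfile.io devworkspace-controller-manager-service-cert && echo "caBundle restored"`. If the inject-cabundle annotation is still present, the comparison may start failing again once the bundle is overwritten.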
The above process should result in breaking the OLM install of DWO as well. In some cases (e.g. if DWO is not updated after conversion webhooks are broken) this may not be the case, making this step and later steps unnecessary.
To check if your cluster has this problem, typical signs are:
- There are multiple CSVs for DWO in the openshift-operators namespace, with one having phase "Pending" and the other "Replacing":
    ❯ oc get csv | grep devworkspace
    devworkspace-operator.v0.15.1   DevWorkspace Operator   0.15.1   devworkspace-operator.v0.15.0   Replacing
    devworkspace-operator.v0.15.2   DevWorkspace Operator   0.15.2   devworkspace-operator.v0.15.1   Pending
- In the "Installed Operators" UI, multiple entries for DWO or one entry that is stuck in a loop of Installing and InstallSucceeded
- Inability to get new updates for DWO
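The first sign in the list above can be checked with a small script. The sketch below is illustrative only: `csv_phases` and `dwo_upgrade_stuck` are hypothetical helper names, and the check assumes DWO CSV names start with `devworkspace-operator`, as in the output shown above.

```shell
# List "<name> <phase>" for every CSV in a namespace (default: openshift-operators).
csv_phases() {
  oc get csv -n "${1:-openshift-operators}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}'
}

# Read "<name> <phase>" lines on stdin and succeed when the DWO CSVs show the
# stuck pattern: at least one in Pending and at least one in Replacing.
dwo_upgrade_stuck() {
  awk '$1 ~ /^devworkspace-operator/ && $2 == "Pending"   { pending = 1 }
       $1 ~ /^devworkspace-operator/ && $2 == "Replacing" { replacing = 1 }
       END { exit (pending && replacing) ? 0 : 1 }'
}
```

Typical use would be `csv_phases | dwo_upgrade_stuck && echo "DWO upgrade looks stuck"`. Splitting the query from the check keeps the check usable on saved output as well.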
To fix this, it's necessary to uninstall the operator but leave all workloads in place. If the devworkspace controller deployment is removed, nothing will be available to serve webhooks. Follow these steps carefully:
- Apply a temporary clusterrole and clusterrolebinding to allow the DevWorkspace deployment to continue to run while we reinstall the subscription and CSVs (the link below should point at the other file in this gist):
    oc apply -f https://gist.githubusercontent.com/amisevsk/1e8b14f7dcaa50727d35144133394747/raw/e53301b3f623bdeb2f3c6deaef9130640ab8916a/dwo-clusterrole-and-binding.yaml
  This clusterrole and clusterrolebinding are identical to the ones supplied by OLM for the operator. OLM will delete the existing clusterrole and binding when the CSV is deleted.
- Delete the DevWorkspace Operator subscription and any CSVs created for it. Note: the --cascade=orphan option is required to avoid deleting the controller deployment and service.
    oc delete sub devworkspace-operator --cascade=orphan
    for csv in $(oc get csvs | grep devworkspace | cut -f 1 -d ' '); do
      oc delete csv "$csv" --cascade=orphan
    done
- Verify that conversion webhooks are still functional (if not, see the steps above to repair them); the command should not show a TLS or service unavailable error:
    oc get devworkspaces.v1alpha1.workspace.devfile.io --all-namespaces
    W0804 16:45:34.843990 87674 warnings.go:70] workspace.devfile.io/v1alpha1 DevWorkspace is deprecated; use workspace.devfile.io/v1alpha2 DevWorkspace
    NAMESPACE     NAME                 WORKSPACE ID                PHASE     URL
    eclipse-che   plain-devworkspace   workspace9b4ca9ef26124f9d   Running
- Re-create the DWO subscription:
    cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: devworkspace-operator
      namespace: openshift-operators
    spec:
      channel: fast
      name: devworkspace-operator
      source: devworkspace-operator-catalog
      sourceNamespace: openshift-operators
      installPlanApproval: Manual
      startingCSV: devworkspace-operator.v0.15.1
    EOF
- Approve the manual install (or let it install automatically) and ensure that it succeeds, that the CSVs are as expected, and that the install functions normally. A successful update after installation is a good sign :)
- Delete the temporary clusterrole and clusterrolebinding created in step 1:
    oc delete clusterrole devworkspace-temporary-clusterrole
    oc delete clusterrolebinding devworkspace-temporary-clusterrolebinding
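To confirm that the reinstall in the steps above succeeded before removing the temporary roles, one option is to poll the CSV phase until it reports Succeeded. This is a minimal sketch; the function name `wait_for_csv_succeeded` and its default attempt count and delay are arbitrary choices, not taken from the original procedure.

```shell
# Poll a CSV until its phase is Succeeded, or give up after a number of attempts.
# Arguments: CSV name, namespace, attempts, seconds between attempts.
wait_for_csv_succeeded() {
  csv="$1"; ns="${2:-openshift-operators}"; attempts="${3:-30}"; delay="${4:-10}"
  i=0
  phase=""
  while [ "$i" -lt "$attempts" ]; do
    phase=$(oc get csv "$csv" -n "$ns" -o jsonpath='{.status.phase}' 2>/dev/null)
    if [ "$phase" = "Succeeded" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "CSV $csv did not reach Succeeded (last phase: ${phase:-unknown})" >&2
  return 1
}
```

For example, `wait_for_csv_succeeded devworkspace-operator.v0.15.1` waits up to roughly five minutes with the defaults. A CSV that keeps cycling between phases will make this time out, which matches the stuck-install symptom described earlier.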
In one case while following the process above, the Che Operator OLM installation was also broken. The general process to fix this should be the same as for fixing the DWO installation above (you may need the correct clusterrole and binding for the Che Operator instead of for DWO), but full documentation hasn't been tested.
To test that Che Operator conversion webhooks are working, you can use the command
❯ oc get checlusters.v1.org.eclipse.che --all-namespaces
W0804 17:04:58.752317 91479 warnings.go:70] org.eclipse.che/v1 CheCluster is deprecated and will be removed in future releases
NAMESPACE NAME AGE
eclipse-che eclipse-che 50m
Strange issues encountered during reinstalling Che Operator (to be investigated further)
- InstallPlan fails with InstallComponentFailed (install strategy failed: service che-operator-service not safe to replace: extraneous ownerreferences found)
  - Fix: Manually add correct ownerref to Che Operator service (???)
        # Generate ownerref patch for the service
        SVC_PATCH=$(oc get csv -o yaml | yq -y '.items[] | select(.metadata.name | match("eclipse-che*")) | {
          "metadata": {
            "ownerReferences": [{
              "apiVersion": "operators.coreos.com/v1alpha1",
              "blockOwnerDeletion": false,
              "controller": false,
              "kind": "ClusterServiceVersion",
              "name": "eclipse-che-preview-openshift.v7.52.0-642.next",
              "uid": .metadata.uid
            }]
          }
        }')
        oc patch svc che-operator-service -n openshift-operators --patch "$SVC_PATCH"
- Service che-operator-service is deleted somehow
  - Fix: Recreate it (this is liable to break; it's best to save the service from the cluster before attempting anything)
        cat <<EOF | oc apply -f -
        kind: Service
        apiVersion: v1
        metadata:
          name: che-operator-service
          namespace: openshift-operators
          labels:
            operators.coreos.com/eclipse-che-preview-openshift.openshift-operators: ''
        spec:
          ports:
            - name: '443'
              protocol: TCP
              port: 443
              targetPort: 9443
          type: ClusterIP
          selector:
            app: che-operator
        EOF
- Pod cannot be created for Che Operator controller, with CreateContainerSpecError
  - Fix: Remove securityContext from che-operator deployment podSpec (???)
- Conversion webhooks are broken by install, with no caBundle and service namespace eclipse-che
  - Fix conversion webhooks as for DWO:
        # Get serving cert
        CA_BUNDLE=$(oc get secret -n openshift-operators che-operator-service-cert -o json | jq -r '.data.olmCAKey')
        # Set correct caBundle and service reference
        oc patch crds checlusters.org.eclipse.che --patch "
        spec:
          conversion:
            webhook:
              clientConfig:
                caBundle: ${CA_BUNDLE}
                service:
                  name: che-operator-service
                  namespace: openshift-operators
                  path: /convert
                  port: 443
        "
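Since the notes above recommend saving che-operator-service from the cluster before attempting anything, here is one way that could look. It is only a sketch: the function name `backup_service` and the default output file name are hypothetical.

```shell
# Save a copy of a Service manifest to a file before modifying or recreating it.
# Arguments: service name, namespace (optional), output file (optional).
backup_service() {
  svc="$1"; ns="${2:-openshift-operators}"; out="${3:-${svc}-backup.yaml}"
  oc get svc "$svc" -n "$ns" -o yaml > "$out" && echo "saved $svc to $out"
}
```

For example, `backup_service che-operator-service` writes che-operator-service-backup.yaml to the current directory. The saved manifest includes cluster-managed fields such as resourceVersion, uid, and status, which will likely need to be removed before it can be re-applied.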