@acsulli
Last active November 18, 2019 23:23

A cluster admin perspective on OpenShift Cluster Operators

OpenShift 4.x relies heavily on operators to automate every aspect of its services. Discovering which operator controls which service can be confusing, but a few steps make the process easier and help turn guesses into educated ones.

This document focuses on OpenShift core services, not add-on operators. While much, if not all, of the information still applies to add-on operators, they are much easier to see, manage, and inspect due to their nature. Be sure to read about the Operator Framework for the most up-to-date information about them.

Some prerequisites:

  1. The concept of operators and how they work
  2. First-level operator(s), i.e. the Cluster Version Operator (CVO)
  3. Second-level operator(s), i.e. the other OpenShift services

Common questions

  • What operators exist? What is the status of the operators? Are they healthy?

    When the cluster-level operators start, they report their status through ClusterOperator objects (clusteroperators.config.openshift.io).

    # get the list of deployed cluster (not OLM) operators and their status
    oc get clusteroperators.config.openshift.io
    

    Sample output:

    NAME                                    VERSION                  AVAILABLE   PROGRESSING   FAILING   SINCE
    cluster-autoscaler-operator             v4.0.0-0.150.0.0-dirty   True        False         False     1d
    cluster-image-registry-operator         v4.0.0-0.150.0.0-dirty   True        False         False     1d
    cluster-monitoring-operator                                      True        True          False     1d
    cluster-node-tuning-operator                                     True        False         False     1d
    kube-controller-manager                                          True        False         False     1d
    machine-api-operator                    v4.0.0-0.150.0.0-dirty   True        False         False     3m
    machine-config-operator                 4.0.0-0.150.0.0-dirty    True        False         False     17s
    openshift-apiserver                                              True        False         False     10m
    openshift-cluster-samples-operator                               False       True          True      1m
    openshift-controller-manager-operator                            True        False         False     1d
    openshift-dns-operator                  0.0.1                    True        False         False     1d
    openshift-ingress-operator              0.0.1                    True        False         False     1d
    openshift-kube-apiserver-operator                                True        False         False     1d
    openshift-kube-scheduler-operator                                True        False         False     1d
    openshift-network-operator                                       True        False         False     1d
    operator-lifecycle-manager                                       True        False         False     1d
    

    We can view the status of a specific operator by describing it:

    # get the status of the cluster operator
    oc describe clusteroperators.config.openshift.io $OPERATOR
    
  • What operators should/might/could exist?

    # view the release software, along with the source repo, for the cluster version
    oc adm release info --commits | grep operator
    

    This command is particularly useful for finding out more about the operators. Most operators have a dedicated repository on GitHub which will have additional information about configuration, management, troubleshooting, etc.

  • How do I see a dependency tree for operators?

    Core operators do not have a mapped dependency tree. The repo for each operator will have a manifests directory which contains the objects created for the service and will most likely hold clues about services which it is relying on.

    For Operator Lifecycle Manager (OLM) managed operators, check the Cluster Service Version (CSV) definition for the operator.
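
The health check from the first question can be scripted. What follows is a minimal sketch rather than an official tool: `unhealthy_operators` is a hypothetical helper name, and it assumes the column layout shown in the sample output above, with AVAILABLE, PROGRESSING, FAILING, and SINCE as the last four columns. Other releases may name or order the columns differently, so adjust the field positions to match what your cluster prints.

```shell
# Hypothetical helper: list cluster operators that are unavailable or failing.
# Reads the tabular output of `oc get clusteroperators` on stdin.
# Fields are counted from the right because the VERSION column may be empty.
unhealthy_operators() {
  awk 'NR > 1 && NF >= 5 {
    available = $(NF - 3)   # AVAILABLE column
    failing   = $(NF - 1)   # FAILING column
    if (available != "True" || failing != "False") print $1
  }'
}

# Usage (requires a logged-in oc session):
#   oc get clusteroperators.config.openshift.io | unhealthy_operators
```

An operator that is merely progressing is not reported; only operators that are not available or are failing are printed, which keeps the output short enough to feed into `oc describe`.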

Finding the operator for what you care about

At this time, to my knowledge, there is no publicly documented list of cluster services which are provided by operators beyond what's described above. Unfortunately, the best way I have found is to leverage experience and intuition to make an educated guess about what the operator may be called, then start searching.

Let's assume I want to know about the image registry used by the OpenShift cluster. What are some steps we can take to discover more?

  1. Check the release for relevant operators

    oc adm release info --commits | grep operator
    

    Since we're looking for the image registry, it may be reasonable to add something like grep image or grep registry to the end of the above command. However, the list isn't unreasonably long, so it's possible to simply scan it manually to search for additional crumbs of information.

    If you find something above, great! Go check out the repo specified to find out much more information about the operator.

  2. Look for a CRD which matches the functionality

    oc get crd | grep -E 'image|registry'
    

    This can be helpful for discovering the resources which are created by the operator. For the above command the following is output:

    configs.imageregistry.operator.openshift.io                                                  2019-02-13T14:46:13Z
    images.config.openshift.io                                                                   2019-02-13T14:46:14Z
    

    CRDs are like any other object and can be described. This can provide valuable information going forward as well.

    oc describe crd configs.imageregistry.operator.openshift.io
    

    And the resulting output:

    Name:         configs.imageregistry.operator.openshift.io
    Namespace:
    Labels:       <none>
    Annotations:  <none>
    API Version:  apiextensions.k8s.io/v1beta1
    Kind:         CustomResourceDefinition
    Metadata:
      Creation Timestamp:  2019-02-13T14:46:13Z
      Generation:          1
      Resource Version:    7517
      Self Link:           /apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/configs.imageregistry.operator.openshift.io
      UID:                 1bda80c5-2f9e-11e9-9646-12c303c76f5e
    Spec:
      Additional Printer Columns:
        JSON Path:    .metadata.creationTimestamp
        Description:  CreationTimestamp is a timestamp representing the server time when this object was created. It is not guaranteed to be set in happens-before order across separate operations. Clients may not set this value. It is represented in RFC3339 form and is in UTC.
    
    Populated by the system. Read-only. Null for lists. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata
        Name:  Age
        Type:  date
      Group:   imageregistry.operator.openshift.io
      Names:
        Kind:       Config
        List Kind:  ConfigList
        Plural:     configs
        Singular:   config
      Scope:        Cluster
      Version:      v1
      Versions:
        Name:     v1
        Served:   true
        Storage:  true
    Status:
      Accepted Names:
        Kind:       Config
        List Kind:  ConfigList
        Plural:     configs
        Singular:   config
      Conditions:
        Last Transition Time:  2019-02-13T14:46:13Z
        Message:               no conflicts found
        Reason:                NoConflicts
        Status:                True
        Type:                  NamesAccepted
        Last Transition Time:  <nil>
        Message:               the initial names have been accepted
        Reason:                InitialNamesAccepted
        Status:                True
        Type:                  Established
      Stored Versions:
        v1
    Events:  <none>
    

    Some interesting information includes the names (Status -> Accepted Names), which show the kind along with the plural and singular forms used to refer to the resource.

  3. Assuming something was found, explore the likely CRDs

    This is more art than science, but we can follow a pattern for attempting to discover resources.

    # check for instances of the object in any namespace/project
    oc get configs.imageregistry.operator.openshift.io --all-namespaces
    

    This has the potential to output three things:

    • Nothing
    • More than one instance which may or may not be namespaced
    • One instance, which may or may not be namespaced

    For my cluster (4.0.0-0.3), the following is returned:

    NAME       AGE
    instance   1d
    

    Looking at this, notice that there is no namespace column. This indicates that it is a cluster-scoped (global) object, which matches the Scope: Cluster field in the CRD description above. Some objects will have a namespace; in that case it may be useful to explore that project and its objects.

  4. Check the contents of the object which was found

    The object, generally, has a lot of information about the operator and things it is managing.

    # don't forget to add the namespace if your object is namespaced
    oc get configs.imageregistry.operator.openshift.io/instance -o yaml
    

    For the image registry, this happens to print out a bunch of information, including configuration options and status.

  5. Check for projects which match the key terms in the operator

    As with the CRDs, there are often projects which hold components of the operator.

    oc get project | grep image
    

    Which results in:

    openshift-image-registry                                    Active
    
  6. Explore potential projects

    Check for ConfigMap, Secret, Deployment, Pod, and other objects which are related to the service you're interested in.

    # move to the project
    oc project openshift-image-registry
    
    # look for objects
    oc get pod
    oc get deployment
    oc get configmap
    
    # check logs and other info about them
    oc logs $POD_NAME
    oc describe pod $POD_NAME
    oc describe deployment $DEPLOYMENT_NAME
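
The search steps above (release payload, CRDs, projects) can be strung together for any set of keywords. This is a rough sketch under stated assumptions, not part of `oc` itself: `discover_operator` and `crd_scope` are hypothetical helper names, and the only commands used are the ones already shown in this document plus a `-o jsonpath` query against the CRD's `.spec.scope` field.

```shell
# Hypothetical helper: run the search steps for one or more keywords.
# Usage: discover_operator image registry
discover_operator() {
  if [ "$#" -eq 0 ]; then
    echo "usage: discover_operator KEYWORD..." >&2
    return 1
  fi
  # Join the keywords into an extended regex such as 'image|registry'.
  pattern=$(printf '%s|' "$@")
  pattern=${pattern%"|"}

  echo "== operators in the release payload =="
  oc adm release info --commits | grep operator | grep -E -e "$pattern" \
    || echo "  (nothing found)"

  echo "== matching CRDs =="
  oc get crd | grep -E -e "$pattern" || echo "  (nothing found)"

  echo "== matching projects =="
  oc get project | grep -E -e "$pattern" || echo "  (nothing found)"
}

# Hypothetical helper: print "Cluster" or "Namespaced" for a CRD, which
# tells you whether its instances need a namespace when queried.
# Usage: crd_scope configs.imageregistry.operator.openshift.io
crd_scope() {
  oc get crd "$1" -o jsonpath='{.spec.scope}'
}
```

The output is only a starting point. Each match still needs to be inspected by hand with `oc describe` or `oc get -o yaml`, as in steps 3 through 6.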
    
@rmahroua

Is there more documentation available regarding first-level and second-level operators? I have found conflicting information about the cluster version operator: some documentation mentions that this is a second-level operator, whereas you are indicating that this is a first-level operator.
Thanks for your help :)

@flozanorht

AFAIK first-level and second-level operators are concepts related to the OpenShift 4 cluster version operator and its "dependent" operators (the cluster operators). These concepts make sense when installing and upgrading an OpenShift 4 cluster. They are not concepts from the Operator SDK nor the OLM.
