darkowlzz/results-of-reconciliation.md

## results-of-reconciliation.md

      
            
          
    Results of Reconciliation

In a controller, a reconciliation loop execution performs some domain-specific
operations to eliminate any drift between the declared configuration and the
state of the world. Usually, the direct result of the core operations of the
reconciliation is a change in the world based on the declared configuration.
Other results of the reconciliation are the reported status on the object, the
ctrl.Result value and any error during the reconciliation. The ctrl.Result
and error are runtime results, and the reported status is an API change result
on the target object. Based on the controller-patterns(flux) document, the core
operations are handled in the sub-reconcilers. The ctrl.Result, error and
status API are handled in a deferred function called summarizeAndPatch().
This document describes in detail how these three types of results are
formulated and how they affect one another. It describes a generic model for
computing these results, independent of the domain of a reconciler. It takes
into consideration the kstatus standards for reporting the status of objects
using status conditions. However, this model can be used to create similar
models with other standards in other domains.
        ┌────────────────────────────────────────────────────────────────────┐
        │                             Runtime                                │
        │           ┌──────────────────────────────────────────────────────┐ │
        │           │                Reconciler                            │ │
        │           │                                                      │ │
        │ Reconcile │          ┌─────────────────┐                         │ │
 Event  │  Request  │  object  │ Domain specific │ Intermediate expression │ │
───────►├──────────►│─────────►│    operations   ├──────────┐of results    │ │
   1    │    2      │     3    │       4         │       5  │              │ │
        │           │          └─────────────────┘          │              │ │
        │           │                                       ▼              │ │
        │           │           Runtime results     ┌────────────────┐     │ │
        │           │        (ctrl.Result + error)  │  Final result  │     │ │
        │           │◄──────────────────────────────┤   computation  │     │ │
        │           │      Update object status API │       6        │     │ │
        │           │                 7             └────────────────┘     │ │
        │           └──────────────────────────────────────────────────────┘ │
        │                                                                    │
        └────────────────────────────────────────────────────────────────────┘
As described in the controller-patterns document, the results of reconciliation
can be abstracted into three constants, ResultEmpty, ResultRequeue and
ResultSuccess, and a contextual error, to express the result of reconciliation
more clearly. The results expressed in these forms are used as input to the
generic model that computes the final result, which is then used to patch the
object status and returned to the runtime.
We'll go through the details of how the final result is computed and then
discuss the user-facing APIs for tuning the result computation based on
the needs of a reconciler.
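As a rough illustration of the abstraction described above, the three constants could be modelled as a small enumerated type. This is a minimal sketch: the constant names come from the text, while the underlying `int` type, the ordering and the `String()` helper are assumptions made for the example.

```go
package main

// Result is an abstracted reconciliation result. The int representation and
// the ordering of the constants are assumptions made for this sketch.
type Result int

const (
	// ResultEmpty requests no requeue. It does not imply success.
	ResultEmpty Result = iota
	// ResultRequeue requests an immediate requeue.
	ResultRequeue
	// ResultSuccess indicates that the reconciliation succeeded.
	ResultSuccess
)

// String returns a readable name, useful in logs (hypothetical helper).
func (r Result) String() string {
	switch r {
	case ResultEmpty:
		return "ResultEmpty"
	case ResultRequeue:
		return "ResultRequeue"
	case ResultSuccess:
		return "ResultSuccess"
	default:
		return "ResultUnknown"
	}
}
```

Keeping the abstraction separate from ctrl.Result lets the sub-reconcilers state intent (success, retry, nothing to say) without knowing how the runtime result is eventually built.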
Runtime Result (ctrl.Result)


In the result abstraction, ResultEmpty and ResultRequeue have a clear, fixed
meaning, but ResultSuccess may have a different meaning in different domains.
For a reconciler that always reconciles at a fixed interval, ResultSuccess is
a ctrl.Result with RequeueAfter set to that interval. However, for a
reconciler that reconciles only when there's an event related to the objects
it watches, ResultSuccess is an empty ctrl.Result value, which is equivalent
to ResultEmpty. In this case,
although ResultSuccess and ResultEmpty have the same underlying value, they
have different meanings. ResultEmpty doesn't mean that the reconciliation was
successful. It can be returned along with a failure error, or a stalling error.
But ResultSuccess is explicitly used to indicate that the reconciler has
succeeded in its operations.
The BuildRuntimeResult() introduced in the controller-patterns document is an
example of converting these intermediate results into runtime results. To make
it customizable, we can define a RuntimeResultBuilder interface that can be
used to implement custom result conversion.
// RuntimeResultBuilder defines an interface for runtime result builders. This
// can be implemented to build custom results based on the context of the
// reconciler.
type RuntimeResultBuilder interface {
	BuildRuntimeResult(rr Result, err error) ctrl.Result
}
In the above, Result is the abstracted result and error is the reconciliation
error. Based on the domain, if an error affects the returned runtime result,
this can be specified in a custom RuntimeResultBuilder implementation. For
example, in the case of reconcilers that always requeue at a specific period,
when there's a waiting error, which indicates that the reconciler should wait
for some period of time before retrying, the RuntimeResultBuilder
implementation would look like:
// AlwaysRequeueResultBuilder implements a RuntimeResultBuilder for always
// requeuing reconcilers. A successful reconciliation result for such
// reconcilers contains a fixed RequeueAfter value.
type AlwaysRequeueResultBuilder struct {
	// RequeueAfter is the fixed period at which the reconciler requeues on
	// successful execution.
	RequeueAfter time.Duration
}

// BuildRuntimeResult converts a given Result and error into the
// return result of a controller's Reconcile function.
func (r AlwaysRequeueResultBuilder) BuildRuntimeResult(rr Result, err error) ctrl.Result {
	// Handle special errors that contribute to expressing the result.
	if e, ok := err.(*serror.Waiting); ok {
		return ctrl.Result{RequeueAfter: e.RequeueAfter}
	}

	switch rr {
	case ResultRequeue:
		return ctrl.Result{Requeue: true}
	case ResultSuccess:
		return ctrl.Result{RequeueAfter: r.RequeueAfter}
	default:
		return ctrl.Result{}
	}
}
This shows how the runtime result, ctrl.Result, can be computed, taking into
consideration the domain-specific factors.
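For contrast, a builder for a reconciler that only runs on watch events might look like the sketch below. `EventDrivenResultBuilder` is a hypothetical name that does not appear in the original document, and `CtrlResult` is a local stand-in for controller-runtime's ctrl.Result (which has the same `Requeue` and `RequeueAfter` fields) so that the example compiles without external dependencies.

```go
package main

import "time"

// CtrlResult is a local stand-in for controller-runtime's ctrl.Result.
type CtrlResult struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// Result mirrors the abstracted result; the int representation is assumed.
type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// EventDrivenResultBuilder is a hypothetical builder for reconcilers that
// reconcile only in response to events on the objects they watch.
type EventDrivenResultBuilder struct{}

// BuildRuntimeResult maps ResultSuccess and ResultEmpty to the same empty
// runtime result. Only ResultRequeue asks the runtime to retry. The error is
// ignored here because this sketch has no error types that carry a delay.
func (EventDrivenResultBuilder) BuildRuntimeResult(rr Result, err error) CtrlResult {
	if rr == ResultRequeue {
		return CtrlResult{Requeue: true}
	}
	return CtrlResult{}
}
```

With this builder, ResultSuccess and ResultEmpty produce identical runtime values, so the distinction between them only matters to the status computation, not to the runtime.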
ComputeReconcileResult

ComputeReconcileResult() was introduced in the controller-patterns document
to consolidate all the computation of results. It can use the
RuntimeResultBuilder described above to compute the ctrl.Result.
For computing the runtime error and status conditions, we need to consider
the kstatus conditions, particularly the Reconciling and Stalled
conditions. They depend on the Result and error of reconciliation and
also affect the runtime error that's computed.
Reconciling status

When the reconciler has detected a drift between the declared configuration and
the state of the world, a Reconciling status can be added on the object while
the reconciler is working on eliminating the drift. An example of this would be
a new configuration. When a new object generation is observed, the reconciler
can add a Reconciling status condition on the object (not persisted in the API
server yet, only in memory). By the end of the loop, if the reconciliation was
successful, the Reconciling status can be removed. But if the reconciliation
wasn't successful in eliminating the drift, for example because the reconciler
needs to wait and retry or encountered an error, the Reconciling status remains
on the object status across the reconciliation loop runs. In this case, the
status condition value is persisted in the API server at the end of a
reconciliation.
In this scenario, the reconciliation result and the error affect the object
status API. The following information is needed to compute the results:

Was the reconciliation successful?
Did the reconciliation fail due to some error?
Was the reconciliation unsuccessful due to some unmet preconditions and it
may be resolved with some retries?

Using the result abstraction, we can determine the reconciling status condition
and update the object as:
// Remove reconciling condition on successful reconciliation.
if recErr == nil && res == ResultSuccess {
	conditions.Delete(obj, meta.ReconcilingCondition)
}
where recErr is the reconciliation error and res is the abstracted result.
The abstracted result makes a clear distinction between a successful result,
empty result and immediate requeue result. When the reconciliation error is nil
and the reconciliation was successful, the Reconciling condition can be
removed.
Note that the Reconciling condition is added by the core operations code. In
this section, we only evaluate whether it should stay or be removed.
If the res value is ResultRequeue, which means we need to retry, the
Reconciling condition should not be removed.
In other words, the Reconciling condition can be removed only when the
reconciliation was successful and there's no error.
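The rule above can be checked against each combination of result and error with a small predicate. This is a sketch only: `RemoveReconciling` and `errFetch` are hypothetical names introduced for illustration, and the `Result` type uses an assumed `int` representation.

```go
package main

import "errors"

// Result mirrors the abstracted result; the int representation is assumed.
type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// errFetch is a made-up reconciliation error used only for illustration.
var errFetch = errors.New("fetch failed")

// RemoveReconciling reports whether the Reconciling condition can be removed:
// only when there is no error and the abstracted result is ResultSuccess.
func RemoveReconciling(res Result, recErr error) bool {
	return recErr == nil && res == ResultSuccess
}
```

Under this predicate, ResultEmpty with a nil error keeps the condition in place, which matches the point made earlier that an empty result does not imply success.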
Stalled status

When the reconciler detects a configuration that can't be used to reconcile
successfully, even if retried, it can enter a Stalled state. This state
requires human intervention to fix the provided configuration. The error
result in this situation would be a stalling error.
The Reconciling and Stalled status conditions are mutually exclusive.
Reconciling exists when there's no error and a requeue is requested, but
Stalled exists only when a Stalling error is encountered, along with an
empty result, ResultEmpty.
Considering these, we can analyze the error and Result as:
// Analyze the reconcile error.
switch t := recErr.(type) {
case *serror.Stalling:
	if res == ResultEmpty {
		// The current generation has been reconciled successfully and it
		// has resulted in a stalled state. Return no error to stop further
		// requeuing.
		pOpts = append(pOpts, patch.WithStatusObservedGeneration{})
		conditions.MarkStalled(obj, t.Reason, t.Error())
		return pOpts, result, nil
	}
case *serror.Waiting:
	// The reconcile resulted in waiting error, but we are not in stalled
	// state.
	conditions.Delete(obj, meta.StalledCondition)
	// The reconciler needs to wait and retry. Return no error. The result
	// contains the requeue after value.
	return pOpts, result, nil
case nil:
	// The reconcile didn't result in any error, we are not in stalled
	// state. If a requeue is requested, the current generation has not been
	// reconciled successfully.
	if res != ResultRequeue {
		pOpts = append(pOpts, patch.WithStatusObservedGeneration{})
	}
	conditions.Delete(obj, meta.StalledCondition)
default:
	// The reconcile resulted in some error, but we are not in stalled
	// state.
	conditions.Delete(obj, meta.StalledCondition)
}
The Stalled status condition is added to the object when a Stalling error is
encountered with an empty result.
The Stalled status condition is removed when there is no error or the error is
not a Stalling error.
The patch options set in the variable pOpts are used to configure when to
update the status.observedGeneration of an object to indicate the status of the
object. A stalled state means that the current object generation has been
reconciled, so patch.WithStatusObservedGeneration{} is set in the patch
options.
With the above, we have computed all three results of reconciliation using
BuildRuntimeResult and the analysis of the error and result. The next section
discusses processing, summarizing and patching the object status.
Summarize and Patch

The controller-patterns document introduced a version of summarize and patch
with some details left out for simplicity. In this section, we discuss it in
more detail, along with some advanced usage of SummarizeAndPatch() that
configures how it functions. This version of SummarizeAndPatch() has a
different API which provides more control over the process.
The required arguments for SummarizeAndPatch() are an event recorder, a patch
helper and the target object. The event recorder can be any event recorder
that adheres to the K8s event recorder interface. The patch helper is based on
the Go package github.com/fluxcd/pkg/runtime/patch, which helps patch the
final object.
The event recorder and the patch helper are passed to the summarize and patch
helper constructor:
func NewHelper(recorder kuberecorder.EventRecorder, patchHelper *patch.Helper) *Helper
The SummarizeAndPatch() method takes a context, a target object and some
helper options and returns the runtime result and error:
func (h *Helper) SummarizeAndPatch(ctx context.Context, obj conditions.Setter, options ...Option) (ctrl.Result, error)
The default behavior of SummarizeAndPatch() with only the required arguments
is to patch the provided object and nothing else.
Summarizing the Conditions

The summarization in SummarizeAndPatch() refers to the condition status
summary of the conditions of an object. Usually, when using kstatus, the
Ready condition is expected to be present. The Ready condition summary
depends on the values of other conditions. In addition to Ready, other
conditions or even any custom conditions can be summarized. For this, a new
Conditions type can be defined in the context of summarization.
// Conditions contains all the conditions information needed to summarize the
// target condition.
type Conditions struct {
	// Target is the target condition, e.g.: Ready.
	Target string
	// Owned conditions are the conditions owned by the reconciler for this
	// target condition.
	Owned []string
	// Summarize conditions are the conditions that the target condition depends
	// on.
	Summarize []string
	// NegativePolarity conditions are the conditions in Summarize with negative
	// polarity.
	NegativePolarity []string
}
An example instance of this for Ready condition looks like:
var gitRepoReadyConditions = Conditions{
	Target: meta.ReadyCondition,
	Owned: []string{
		sourcev1.SourceVerifiedCondition,
		sourcev1.FetchFailedCondition,
		sourcev1.IncludeUnavailableCondition,
		sourcev1.ArtifactOutdatedCondition,
		meta.ReadyCondition,
		meta.ReconcilingCondition,
		meta.StalledCondition,
	},
	Summarize: []string{
		sourcev1.IncludeUnavailableCondition,
		sourcev1.SourceVerifiedCondition,
		sourcev1.FetchFailedCondition,
		sourcev1.ArtifactOutdatedCondition,
		meta.StalledCondition,
		meta.ReconcilingCondition,
	},
	NegativePolarity: []string{
		sourcev1.FetchFailedCondition,
		sourcev1.IncludeUnavailableCondition,
		sourcev1.ArtifactOutdatedCondition,
		meta.StalledCondition,
		meta.ReconcilingCondition,
	},
}
In this case, the Target condition to be summarized is the Ready condition.
The Owned conditions are conditions that are related to this target condition
and will be patched along with the target condition. Owned is used to
configure the patch helper to resolve any conflict by making the patcher the
owner of those conditions.
The Summarize conditions are the conditions the target condition depends on.
The NegativePolarity conditions are the conditions from the summarize
conditions that have a negative polarity.
Similarly, other Conditions can be configured for computing their summary in
SummarizeAndPatch(), passed as an option. SummarizeAndPatch() iterates
through all the conditions and adds all the summaries on the object, which is
patched at the end.
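To make the role of polarity concrete, the sketch below reduces summarization to a single boolean. It is a deliberate simplification and not the algorithm used by conditions.SetSummary, which merges full condition objects including reasons and messages. The `Summarize` function and the map-based status representation are assumptions made for this example, and the `Owned` field is omitted because it only affects patching.

```go
package main

// Conditions is a trimmed copy of the type described above.
type Conditions struct {
	Target           string
	Summarize        []string
	NegativePolarity []string
}

// Summarize returns the boolean status of the target condition given the
// current status of the conditions present on the object. Absent conditions
// are ignored. A condition counts against the target when it is True with
// negative polarity, or False with positive polarity.
func Summarize(c Conditions, status map[string]bool) bool {
	negative := make(map[string]bool, len(c.NegativePolarity))
	for _, name := range c.NegativePolarity {
		negative[name] = true
	}
	for _, name := range c.Summarize {
		value, present := status[name]
		if !present {
			continue
		}
		// value == negative[name] covers both unhealthy combinations:
		// (True, negative polarity) and (False, positive polarity).
		if value == negative[name] {
			return false
		}
	}
	return true
}
```

For instance, with FetchFailed listed as a negative polarity condition, a True FetchFailed pulls the target down, while a True SourceVerified (positive polarity) does not.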
Result Processing

In SummarizeAndPatch(), the final runtime result of reconciliation can be
calculated by passing the Result, error and a RuntimeResultBuilder. The
RuntimeResultBuilder is the same result builder discussed in the Runtime
Result section above. The Result and error are passed to the result builder
and ComputeReconcileResult to compute the final result as described above.
If no RuntimeResultBuilder is passed, SummarizeAndPatch() skips computing
the result. In that case, the returned results should be ignored by the caller.
To perform any pre-processing on the results (target object, Result and
error) before passing them to ComputeReconcileResult(), custom result
processors can be injected, which act as middleware in SummarizeAndPatch(). The
result processors are defined as:
// ResultProcessor processes the results of reconciliation (the object, result
// and error). Any errors during processing need not result in the
// reconciliation failure. The errors can be recorded as logs and events.
type ResultProcessor func(context.Context, kuberecorder.EventRecorder, client.Object, reconcile.Result, error)
These result processors are useful for logging and emitting events based on the
results. They can also be used to perform any final modifications to the object
before it's used to compute the result.
Patching

Patching is the final step of SummarizeAndPatch(). It uses the patch helper
to patch the final form of the object. In cases where patching may be applied
to an object being deleted, to ignore the resource not found error, an option
IgnoreNotFound can be passed to SummarizeAndPatch().
The following is an implementation of the SummarizeAndPatch() method,
incorporating the details described above:
func (h *Helper) SummarizeAndPatch(ctx context.Context, obj conditions.Setter, options ...Option) (ctrl.Result, error) {
	// Calculate the options.
	opts := &HelperOptions{}
	for _, o := range options {
		o(opts)
	}
	// Combine the owned conditions of all the conditions for the patcher.
	ownedConditions := []string{}
	for _, c := range opts.Conditions {
		ownedConditions = append(ownedConditions, c.Owned...)
	}
	// Patch the object, prioritizing the conditions owned by the controller in
	// case of any conflicts.
	patchOpts := []patch.Option{
		patch.WithOwnedConditions{
			Conditions: ownedConditions,
		},
	}

	// Process the results of reconciliation.
	for _, processor := range opts.Processors {
		processor(ctx, h.recorder, obj, opts.ReconcileResult, opts.ReconcileError)
	}

	var result ctrl.Result
	var recErr error
	if opts.ResultBuilder != nil {
		// Compute the reconcile results, obtain patch options and reconcile error.
		var pOpts []patch.Option
		pOpts, result, recErr = ComputeReconcileResult(obj, opts.ReconcileResult, opts.ReconcileError, opts.ResultBuilder)
		patchOpts = append(patchOpts, pOpts...)
	}

	// Summarize conditions. This must be performed only after computing the
	// reconcile result, since the object status is adjusted based on the
	// reconcile result and error.
	for _, c := range opts.Conditions {
		conditions.SetSummary(obj,
			c.Target,
			conditions.WithConditions(
				c.Summarize...,
			),
			conditions.WithNegativePolarityConditions(
				c.NegativePolarity...,
			),
		)
	}

	// Finally, patch the resource.
	if err := h.patchHelper.Patch(ctx, obj, patchOpts...); err != nil {
		// Ignore patch error "not found" when the object is being deleted.
		if opts.IgnoreNotFound && !obj.GetDeletionTimestamp().IsZero() {
			err = kerrors.FilterOut(err, func(e error) bool { return apierrors.IsNotFound(e) })
		}
		recErr = kerrors.NewAggregate([]error{recErr, err})
	}

	return result, recErr
}
An example usage of this in the GitRepository reconciler looks like:
summarizeHelper := summarize.NewHelper(r.EventRecorder, patchHelper)
summarizeOpts := []summarize.Option{
	summarize.WithConditions(gitRepoReadyConditions),
	summarize.WithReconcileResult(recResult),
	summarize.WithReconcileError(retErr),
	summarize.WithIgnoreNotFound(),
	summarize.WithProcessors(
		summarize.RecordContextualError,
		summarize.RecordReconcileReq,
	),
	summarize.WithResultBuilder(sreconcile.AlwaysRequeueResultBuilder{RequeueAfter: obj.GetInterval().Duration}),
}
result, retErr = summarizeHelper.SummarizeAndPatch(ctx, obj, summarizeOpts...)
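The `summarize.With...` options used above follow the functional options pattern, which is also visible in the `o(opts)` loop at the start of SummarizeAndPatch(). A minimal sketch of how two of them might be defined is shown below. The option names come from the usage example, but their bodies, the trimmed `HelperOptions` fields and the `ApplyOptions` helper are assumptions.

```go
package main

type Result int

const (
	ResultEmpty Result = iota
	ResultRequeue
	ResultSuccess
)

// HelperOptions is a trimmed version of the options consumed by
// SummarizeAndPatch. Only two of the fields are modelled here.
type HelperOptions struct {
	ReconcileResult Result
	IgnoreNotFound  bool
}

// Option mutates HelperOptions.
type Option func(*HelperOptions)

// WithReconcileResult records the abstracted result of the reconciliation.
func WithReconcileResult(res Result) Option {
	return func(o *HelperOptions) { o.ReconcileResult = res }
}

// WithIgnoreNotFound asks the helper to ignore not-found patch errors.
func WithIgnoreNotFound() Option {
	return func(o *HelperOptions) { o.IgnoreNotFound = true }
}

// ApplyOptions folds the options into a fresh HelperOptions value, in the
// same way as the first loop in SummarizeAndPatch (hypothetical helper).
func ApplyOptions(options ...Option) *HelperOptions {
	opts := &HelperOptions{}
	for _, o := range options {
		o(opts)
	}
	return opts
}
```

One benefit of this pattern is that the zero value of the options is a sensible default, so a call with no options only patches the object.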
This covers all the details involved in the computation of the result of a
reconciler. The result computation and patching model described above can be
applied to any reconciler, independent of its core business logic. The
model provides options to define any domain-specific modifications that may be
needed. It can be developed further to cover cases that are not addressed
in this document.