Skip to content

Instantly share code, notes, and snippets.

@tarekgh
Last active February 11, 2020 17:36
Show Gist options
  • Save tarekgh/01cdf2036e2908db44a589dac9792d96 to your computer and use it in GitHub Desktop.
Save tarekgh/01cdf2036e2908db44a589dac9792d96 to your computer and use it in GitHub Desktop.
Activity Proposed Additions

Activity Proposed Additions

OpenTelemetry has introduced the Span type which is very similar to System.Diagnostics.Activity. In general .NET libraries expected to use System.Diagnostics APIs to publish the tracing data. We need to support the scenario of allowing the .NET libraries exporting the published data through OpenTelemetry. As the OpenTelemetry exporting APIs work with the Span class, we need to ensure all features supported by the Span class can be achieved by the Activity class too and fill any gap between the Span and Activity.

Although Activity has some more properties than Span (e.g. Parent property), Span also support some properties which not exist in Activity. This document is listing these missing properties and proposing the APIs we need to add to Activity. Most of the proposed additions here are almost identical to what OpenTelemetry has.

Activity Missing Properties

Note

  • All proposed types will implement Equals, GetHashCode and ToString methods which not explicitly mentioned in the proposal and possibly we may override the == and != operators on some of them.

Events

Event is a type carry event name, optional event timestamp and optional attributes (key-value pair list). Activity will have a property called Events which will return the list of the attached events. Activity will provide a method called AddEvent to allow adding a new event to the Activity.

namespace System.Diagnostics
{
    public readonly struct ActivityEvent
    {
        // Constructors
        public ActivityEvent(string name) {}
        public ActivityEvent(string name, DateTimeOffset timestamp) {}
        public ActivityEvent(string name, IDictionary<string, object> attributes) {}
        public ActivityEvent(string name, DateTimeOffset timestamp, IDictionary<string, object> attributes) {}

        // Properties
        public string Name { get; }
        public DateTimeOffset Timestamp { get; }
        public IDictionary<string, object> Attributes { get; }
    }
}

Links

Link contains ActivityContext object which can be linked to the activity which being processed in the batch. Activity will provide a method called AddLink to allow adding a new link to the context.

namespace System.Diagnostics
{
    public readonly struct ActivityContext
    {
        public ActivityContext(in ActivityTraceId traceId, in ActivitySpanId Id, ActivityTraceFlags traceOptions, bool isRemote = false, IEnumerable<KeyValuePair<string, string>> tracestate = null) {}

        public ActivityTraceId TraceId { get; }
        public ActivitySpanId SpanId { get; }
        public ActivityTraceFlags TraceOptions { get; }

        // is a boolean flag which returns true if the ActivityContext was propagated from a remote parent.
        public bool IsRemote { get; }

        // is a boolean flag which returns true if the ActivityContext has a non-zero TraceID and a non-zero SpanID.
        public bool IsValid { get; }
    }

    public readonly struct ActivityLink
    {
        public ActivityLink(ActivityContext context) {}
        public ActivityLink(ActivityContext context, IDictionary<string, object> attributes) {}

        public ActivityContext Context { get; }
        public IDictionary<string, object> Attributes { get; }
    }
}

Status

Status is representing the activity execution status code. It contains the canonical code and optionally a description of the code. The Status type doesn't expose any public constructor and instead it expose a static properties which return a well defined Status objects for specific supported canonical codes.

namespace System.Diagnostics
{
    public enum ActivityCanonicalCode
    {
        Ok = 0,
        Cancelled = 1,
        Unknown = 2,
        InvalidArgument = 3,
        DeadlineExceeded = 4,
        NotFound = 5,
        AlreadyExists = 6,
        PermissionDenied = 7,
        ResourceExhausted = 8,
        FailedPrecondition = 9,
        Aborted = 10,
        OutOfRange = 11,
        Unimplemented = 12,
        Internal = 13,
        Unavailable = 14,
        DataLoss = 15,
        Unauthenticated = 16,
    }

    public readonly struct ActivityStatus
    {
        public ActivityCanonicalCode CanonicalCode { get; }
        public string Description { get; }

        // Convenient property which expected to be frequently used.
        public bool IsOk => this.CanonicalCode == CanonicalCode.Ok;

        // Create a new status object with CanonicalCode from this object and using a different description.
        public ActivityStatus CreateWithDescription(string description) {}

        public static readonly Status Ok => ...;
        public static readonly Status Cancelled => ...;
        public static readonly Status Unknown => ...;
        public static readonly Status InvalidArgument => ...;
        public static readonly Status DeadlineExceeded => ...;
        public static readonly Status NotFound => ...;
        public static readonly Status AlreadyExists => ...;
        public static readonly Status PermissionDenied => ...;
        public static readonly Status Unauthenticated => ...;
        public static readonly Status ResourceExhausted => ...;
        public static readonly Status FailedPrecondition => ...;
        public static readonly Status Aborted => ...;
        public static readonly Status OutOfRange => ...;
        public static readonly Status Unimplemented => ...;
        public static readonly Status Internal => ...;
        public static readonly Status Unavailable => ...;
        public static readonly Status DataLoss => ...;
    }
}

Open Questions

  • In OpenTelemetry side Status has IsValid property which can tell if the Status is one of the predefined canonical code status while Status public interface doesn't expose any public constructor and instead it expose static property returning specif status. Theoretically IsValid is not really needed.
  • OpenTelemetry has a method which allow creating a status object with one of the existing canonical codes but attaching a description. I don't think we need to do that and maybe we can has the description of all codes in our resources which will allow localization too at some point.
  • Do we need to have a public constructor allow creating some non standard status objects? I am thinking in the future we may have some codes that is not defined yet and having such constructor will allow using such code.

Kind

Describes the relationship between the Activity, its parents, and its children in a Trace. Activity will have a property that can set or get the activity kind.

    public enum ActivityKind
    {
        Internal = 1,   // Default value. Indicates that the span represents an internal operation within an application, as opposed to an operations with remote parents or children.
        Server = 2,     // Server activity represents request incoming from external component.
        Client = 3,     // Client activity represents outgoing request to the external component.
        Producer = 4,   // Producer activity represents output provided to external components.
        Consumer = 5,   // Consumer span represents output received from an external component.
    }

Activity Changes

namespace System.Diagnostics
{
    public partial class Activity
    {
        // Event Support
        public void AddEvent(in ActivityEvent event) {}
        public IEnumerable<ActivityEvent> Events {get;}

        // Link Support
        public void AddLink(in ActivityLink link) {}
        public IEnumerable<ActivityLink> Links {get;}

        // Status
        public ActivityStatus Status {get; set;}

        // Kind
        public ActivityKind Kind {get; set;}
    }
}
@lmolkova
Copy link

lmolkova commented Feb 11, 2020

Events

From the current scenarios we have, we have some need to attach rich exceptions to spans (this is Azure SDK scenario) and we might have some need to track some other events (e.g. chunks of data/messages during communications that are not worth creating a new span - gRPC scenario).

The only problem with that is dependency on Microsoft.Extensions.Logging and if library doesn't have it, it has to leverage DiagnosticsSource/EventSource. At this point, we either have to come up with complicated convention or won't be able to have generic solution.

As far as I understand this is library concern only - services/application can use logging library of choice and we can support them with ILogger and give guidance how to correlate logs to Activities for other loggers.

I agree that removing events (with possibility to return them back later) is the most reasonable option at this point. Thanks for bringing it, @noahfalk

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment