
@rzikm
Last active November 7, 2023 16:50
QUIC Datagram API

Background and Motivation

RFC 9221 defines an optional extension to QUIC that allows unreliable sending of arbitrary user data.

The QUIC transport protocol (RFC 9000) provides a secure, multiplexed connection for transmitting reliable streams of application data. QUIC uses various frame types to transmit data within packets, and each frame type defines whether the data it contains will be retransmitted on packet loss. [...]

Some applications, particularly those that need to transmit real-time data, prefer to transmit data unreliably. In the past, these applications have built directly upon UDP as a transport and have often added security with DTLS. Extending QUIC to support transmitting unreliable application data provides another option for secure datagrams with the added benefit of sharing the cryptographic and authentication context used for reliable streams.

MsQuic, the underlying library used for QUIC, already supports the Datagram extension.

Relevant passages from the RFC for the API design

Application protocols that use datagrams MUST define how they react to the absence of the max_datagram_frame_size transport parameter. If datagram support is integral to the application, the application protocol can fail the handshake if the max_datagram_frame_size transport parameter is not present.

=> The proposed API must allow user code to decide whether the connection should be aborted when the peer does not advertise Datagram support.

Note that while the max_datagram_frame_size transport parameter places a limit on the maximum size of DATAGRAM frames, that limit can be further reduced by the max_udp_payload_size transport parameter and the Maximum Transmission Unit (MTU) of the path between endpoints. DATAGRAM frames cannot be fragmented; therefore, application protocols need to handle cases where the maximum datagram size is limited by other factors.

MsQuic exposes the current maximum size of a Datagram payload which can be sent, but that size can potentially change over the lifetime of the connection as MTU changes (e.g. due to connection migration). Hence a size read from a simple getter is potentially immediately outdated.

QUIC implementations SHOULD present an API to applications to assign relative priorities to DATAGRAM frames with respect to each other and to QUIC streams.

MsQuic supports setting relative priority between QUIC Streams, but currently, DATAGRAM frames always have higher priority than QUIC Streams. System.Net.Quic currently does not expose an API to set QuicStream priorities, but we may add one in the future.

If a sender detects that a packet containing a specific DATAGRAM frame might have been lost, the implementation MAY notify the application that it believes the datagram was lost.

Similarly, if a packet containing a DATAGRAM frame is acknowledged, the implementation MAY notify the sender application that the datagram was successfully transmitted and received. Due to reordering, this can include a DATAGRAM frame that was thought to be lost but, at a later point, was received and acknowledged.

Exposing this information is not necessary and not all implementations do it (currently, only MsQuic seems to support it). In the initial design we will not expose this information to the user.

How other QUIC implementations expose API for datagrams

| Library | Receiving | Sending | Behavior when queue full | Datagram state tracking |
|---|---|---|---|---|
| MsQuic | event callback | no internal buffering, event callback after send and other state changes | N/A (up to user) | yes |
| lsquic | callback | callback | N/A (up to user) | no |
| mvfst | event callback? | queueing datagrams | buffer size and drop policy configurable | no |
| cloudflare/quiche | queueing datagrams | queueing datagrams | buffer size configurable | no |
| google/quiche | N/A | N/A | N/A | N/A |
| aioquic | event callback | queueing datagrams | no buffer configuration, no dropping | no |

Details for individual libraries

MsQuic

MsQuic is the only implementation that supports datagram frame tracking. Changes in the state of a datagram are reported via an event callback, which receives the client context pointer and the new datagram state.

MsQuic reports the following state changes for sent Datagrams:

typedef enum QUIC_DATAGRAM_SEND_STATE {
    QUIC_DATAGRAM_SEND_UNKNOWN,                         // Not yet sent.
    QUIC_DATAGRAM_SEND_SENT,                            // Sent and awaiting acknowledgment
    QUIC_DATAGRAM_SEND_LOST_SUSPECT,                    // Suspected as lost, but still tracked
    QUIC_DATAGRAM_SEND_LOST_DISCARDED,                  // Lost and no longer being tracked
    QUIC_DATAGRAM_SEND_ACKNOWLEDGED,                    // Acknowledged
    QUIC_DATAGRAM_SEND_ACKNOWLEDGED_SPURIOUS,           // Acknowledged after being suspected lost
    QUIC_DATAGRAM_SEND_CANCELED,                        // Canceled before send
} QUIC_DATAGRAM_SEND_STATE;
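
To make the state transitions easier to reason about, here is a minimal C# sketch that mirrors the enum above and classifies states as final or not. The C# type and member names are hypothetical and not part of System.Net.Quic or of this proposal. The classification assumes, as the comments in the enum suggest, that no further notification follows once a datagram is discarded, acknowledged, or canceled.

```csharp
// Hypothetical C# mirror of QUIC_DATAGRAM_SEND_STATE; names are illustrative only.
public enum DatagramSendState
{
    Unknown,               // Not yet sent
    Sent,                  // Sent and awaiting acknowledgment
    LostSuspect,           // Suspected as lost, but still tracked
    LostDiscarded,         // Lost and no longer being tracked
    Acknowledged,          // Acknowledged
    AcknowledgedSpurious,  // Acknowledged after being suspected lost
    Canceled,              // Canceled before send
}

public static class DatagramSendStateExtensions
{
    // Assumption: every state from LostDiscarded onwards is terminal, so the
    // send buffer could be released once one of them is reported.
    public static bool IsFinal(this DatagramSendState state) =>
        state >= DatagramSendState.LostDiscarded;
}
```

Under this assumption, Sent and LostSuspect are the only reported states after which another notification for the same datagram can still arrive.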

lsquic

lsquic requires an on_datagram(lsquic_conn_t*, const void* buf, size_t) callback for receiving. For writing, the user supplies an on_dg_write(conn, buf, sz) callback and writes directly into the provided buffer.

No datagram tracking.

mvfst

mvfst has a configuration struct that specifies how many datagrams are buffered (separately for send and receive) and whether to drop the oldest or the newest datagrams when the buffer is full.

struct DatagramConfig {
  bool enabled{false};
  bool framePerPacket{true};
  bool recvDropOldDataFirst{false};
  bool sendDropOldDataFirst{false};
  uint32_t readBufSize{kDefaultMaxDatagramsBuffered};
  uint32_t writeBufSize{kDefaultMaxDatagramsBuffered};
};

Receiving itself is again done via callback (onDatagramsAvailable), and sending is via queueing datagrams (writeDatagram).

AFAICT, no datagram tracking.

cloudflare/quiche

quiche has internal send and receive queues for datagrams with a limit on the number of bytes stored. There are no policy settings for dropping datagrams once a queue is full.

The API is shown below for illustration; the C API is copied for brevity.

// Returns the maximum DATAGRAM payload that can be sent.
ssize_t quiche_conn_dgram_max_writable_len(const quiche_conn *conn);

// Returns the length of the first stored DATAGRAM.
ssize_t quiche_conn_dgram_recv_front_len(const quiche_conn *conn);

// Returns the number of items in the DATAGRAM receive queue.
ssize_t quiche_conn_dgram_recv_queue_len(const quiche_conn *conn);

// Returns the total size of all items in the DATAGRAM receive queue.
ssize_t quiche_conn_dgram_recv_queue_byte_size(const quiche_conn *conn);

// Returns the number of items in the DATAGRAM send queue.
ssize_t quiche_conn_dgram_send_queue_len(const quiche_conn *conn);

// Returns the total size of all items in the DATAGRAM send queue.
ssize_t quiche_conn_dgram_send_queue_byte_size(const quiche_conn *conn);

// Reads the first received DATAGRAM.
ssize_t quiche_conn_dgram_recv(quiche_conn *conn, uint8_t *buf,
                               size_t buf_len);

// Sends data in a DATAGRAM frame.
ssize_t quiche_conn_dgram_send(quiche_conn *conn, const uint8_t *buf,
                               size_t buf_len);

google/quiche

quiche does not seem to have datagram support yet.

aioquic

aioquic has an event-callback architecture; datagrams are received via the DatagramFrameReceived event. Sending is done via the send_datagram_frame function on the connection, which queues the datagram for sending and returns immediately.

There is no prioritization scheme or configuration for dropping datagrams.

No datagram tracking.

Proposed API

The API design has several independent parts: sending, receiving, and requiring datagram support from the peer. The proposed APIs and their alternatives for each part can be reviewed independently, so they are grouped accordingly for readability.

Sending datagrams

The proposed sending API uses internal buffering so that the user can queue multiple datagrams for sending without waiting for the previous datagram to be sent. A similar strategy is used to send data over QuicStreams.

namespace System.Net.Quic
{
    public abstract class QuicConnectionOptions
    {
        // maximum number of bytes used to buffer outgoing datagrams. If the
        // queue is full, further datagrams are dropped.
        //
        // TODO: what should be the default value? MsQuic does not buffer the data but holds onto a pointer
        // to the buffer we give it.
+       public int DatagramSendQueueLength { get { throw null; } set { } }

        // If set, will be invoked when the value of `MaxDatagramSendLength` changes.
        // The user can query the updated `MaxDatagramSendLength`, or the callback type can be changed to
        // pass it as a parameter.
+       public Action<QuicConnection>? MaxDatagramSendLengthChanged { get { throw null; } set { } }
    }

    public sealed class QuicConnection
    {
        // Returns true if peer advertised QUIC DATAGRAM support and this
        // connection can send datagrams. This value is constant over the
        // lifetime of the connection
+       public bool DatagramSendEnabled { get { throw null; } }

        // Gets the maximum amount of data that can be sent in a single Datagram.
        // Note that this may change without user interaction due to MTU changes
        // on the underlying network.
        //
        // Returns 0 if DatagramSendEnabled is false.
+       public int MaxDatagramSendLength { get { throw null; } }

        // Queues datagram for sending to the peer, data is internally buffered.
        // (reminder that datagrams are unreliable and may get lost).
        // Returns:
        //   - true if the datagram was queued successfully for sending. Note that if
        //     the MTU decreases below the datagram size while the datagram is
        //     waiting to be sent, it will still get discarded.
        //   - false otherwise (datagram too big or queue is full)
        // Throws:
        //   - ObjectDisposedException if QuicConnection was disposed
        //   - QuicException if QuicConnection was aborted
        //   - InvalidOperationException if DatagramSendEnabled is false
+       bool SendDatagram(ReadOnlySpan<byte> buffer);
    }
}
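
The following sketch illustrates how calling code might use the proposed sending members. Since the API does not exist yet, the `IDatagramSender` interface and the `FakeSender` class are hypothetical stand-ins introduced only so the example compiles on its own; only the three member names come from the proposal above. The point it tries to show is that `MaxDatagramSendLength` is a hint and the return value of `SendDatagram` is what the caller should act on.

```csharp
using System;

// Hypothetical stand-in for the proposed QuicConnection members.
public interface IDatagramSender
{
    bool DatagramSendEnabled { get; }
    int MaxDatagramSendLength { get; }
    bool SendDatagram(ReadOnlySpan<byte> buffer);
}

public static class DatagramSendExample
{
    // Tries to send one payload as a datagram. Returns false when the caller
    // should fall back (for example to a stream) or simply drop the payload.
    public static bool TrySendUnreliable(IDatagramSender connection, ReadOnlySpan<byte> payload)
    {
        if (!connection.DatagramSendEnabled)
        {
            return false; // per the proposal, SendDatagram would throw InvalidOperationException
        }

        // The getter is only a hint: the limit may shrink between this check and the send.
        if (payload.Length > connection.MaxDatagramSendLength)
        {
            return false;
        }

        // false here means the datagram was too big after all or the queue was full.
        return connection.SendDatagram(payload);
    }
}

// Minimal in-memory fake, used only to exercise the sketch.
public sealed class FakeSender : IDatagramSender
{
    public bool DatagramSendEnabled { get; set; } = true;
    public int MaxDatagramSendLength { get; set; } = 1200; // illustrative value
    public int SentCount { get; private set; }

    public bool SendDatagram(ReadOnlySpan<byte> buffer)
    {
        if (buffer.Length > MaxDatagramSendLength) return false;
        SentCount++;
        return true;
    }
}
```

Checking `DatagramSendEnabled` first avoids the exception path for peers that did not advertise datagram support, which matters for protocols where datagrams are an optional optimization.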

Alternative: no internal queue

By making SendDatagramAsync complete only after the datagram has been sent, we can avoid internal buffering and the associated copy (the provided buffer is pinned and its pointer is passed to MsQuic).

The straightforward implementation would support queueing only one datagram frame at a time, but it should be possible to support overlapping sends as well if desired.

    public abstract class QuicConnectionOptions
    {
        // not necessary for this alternative
-       public int DatagramSendQueueLength { get { throw null; } set { } }
    }

    public sealed class QuicConnection
    {
-       bool SendDatagram(ReadOnlySpan<byte> buffer);

        // Queues datagram for sending to the peer.
        // Returns:
        //   - true if datagram was sent successfully (note that it may still get lost)
        //   - false otherwise (the result may not be available synchronously if the MTU drops while the datagram is queued)
        // Throws:
        //   - ObjectDisposedException if QuicConnection was disposed
        //   - QuicException if QuicConnection was aborted
        //   - InvalidOperationException if DatagramSendEnabled is false
+       ValueTask<bool> SendDatagramAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default);
    }

Alternative: returning enum instead of bool

+   enum SendDatagramResult
    {
        // successfully sent
        Sent,

        // returns synchronously
        DatagramTooLarge,

        // returns asynchronously if MTU drops after datagram was queued
        Cancelled,
    }

    public sealed class QuicConnection
    {
-       ValueTask<bool> SendDatagramAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default);
+       ValueTask<SendDatagramResult> SendDatagramAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default);
    }

I don't think there is value for the user in knowing the distinction between DatagramTooLarge and Cancelled. In both cases the user will likely need to check MaxDatagramSendLength before constructing future datagrams.

Receiving Datagrams

The suggested design employs a callback on the user's side which is called each time a QUIC Datagram frame is received (in other words, a push model). Since the callback is called from a background thread servicing the connection, the user is expected to process the frame quickly (or queue it elsewhere for more intensive processing).

namespace System.Net.Quic
{
    public abstract class QuicConnectionOptions
    {
        // alternative name: DatagramReceiveCallback
+       public delegate void ReceiveDatagramCallback(QuicConnection connection, ReadOnlySpan<byte> buffer);
        // Invoked when a Datagram is received from the peer. Setting this to non-null enables
        // receiving Datagrams.
+       public ReceiveDatagramCallback? ReceiveDatagramCallback { get { throw null; } set { } }
    }

    public sealed class QuicConnection
    {
        // For symmetry with DatagramSendEnabled only, not necessary to use the API.
+       public bool DatagramReceiveEnabled { get { throw null; } }
    }
}

While this alternative is the most versatile, it may be difficult to set up the right callback on the server so that it includes the context relevant for the QuicConnection. E.g. AspNetCore creates the necessary context in QuicListenerOptions.ConnectionOptionsCallback and relies on ConditionalWeakTable to look up the context after the connection is returned from AcceptConnectionAsync. Source code is at https://github.com/dotnet/aspnetcore/blob/release/8.0/src/Servers/Kestrel/Transport.Quic/src/Internal/QuicConnectionListener.cs#L62C20-L96.
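
A rough sketch of that lookup pattern, combined with handing the datagram off for later processing, could look like the following. It is not the AspNetCore code: `FakeConnection` is a hypothetical stand-in for QuicConnection, `DatagramContext` and `DatagramContextRegistry` are made-up names, and the channel capacity of 64 is an arbitrary illustrative value. `ConditionalWeakTable` and `System.Threading.Channels` are existing .NET types.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Channels;

// Hypothetical stand-in for QuicConnection so the sketch compiles on its own.
public sealed class FakeConnection { }

// Per-connection state that the application wants to reach from the callback.
public sealed class DatagramContext
{
    // Bounded so that a slow consumer cannot grow memory without limit;
    // when full, the oldest queued datagram is dropped.
    public Channel<byte[]> Incoming { get; } = Channel.CreateBounded<byte[]>(
        new BoundedChannelOptions(64) { FullMode = BoundedChannelFullMode.DropOldest });
}

public static class DatagramContextRegistry
{
    // Entries stay alive only as long as the key (the connection) is reachable.
    private static readonly ConditionalWeakTable<FakeConnection, DatagramContext> s_contexts = new();

    // Called once the connection object is available, e.g. after it has been accepted.
    public static DatagramContext Attach(FakeConnection connection) =>
        s_contexts.GetValue(connection, static _ => new DatagramContext());

    // Same shape as the proposed callback: (connection, buffer).
    public static void OnDatagramReceived(FakeConnection connection, ReadOnlySpan<byte> buffer)
    {
        if (!s_contexts.TryGetValue(connection, out var context))
        {
            return; // no context registered yet: drop the datagram
        }

        // A span cannot be stored, so copy the payload before handing it off.
        context.Incoming.Writer.TryWrite(buffer.ToArray());
    }
}
```

The callback does only a lookup and a copy, which keeps the work on the connection's background thread short; a separate consumer can read from `Incoming.Reader` at its own pace.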

An alternative design involving an internal queue and an explicit Receive method requires more configuration knobs and is described later below.

Alternative: Separate member for DatagramReceiveEnabled

May help readability but is technically redundant.

    public abstract class QuicConnectionOptions
    {
        // existing. Validation will throw if DatagramReceiveEnabled is true and no callback was provided
        public ReceiveDatagramCallback? ReceiveDatagramCallback { get { throw null; } set { } }

        // If true, the connection advertises datagram support
+       public bool DatagramReceiveEnabled { get { throw null; } set { } }
    }

Alternative: Receiving via pull-model (method)

An API that stores incoming datagrams in an internal queue which the user polls is more complicated and requires more options to suit application needs.

namespace System.Net.Quic
{
    public abstract class QuicConnectionOptions
    {
-       public delegate void ReceiveDatagramCallback(QuicConnection connection, ReadOnlySpan<byte> buffer);
-       public ReceiveDatagramCallback? ReceiveDatagramCallback { get { throw null; } set { } }

        // If true, the connection advertises datagram support
+       public bool DatagramReceiveEnabled { get { throw null; } set { } }

        // maximum number of **bytes** used to buffer incoming datagrams
+       public int DatagramReceiveQueueLength { get { throw null; } set { } }

        // determines behavior when the internal buffer is full. If true
        // the oldest datagrams are dropped until space is made for the new incoming datagram.
        // If false, new datagrams are discarded if internal buffer is full.
        //  - Alternatively, can be a two-member enum like: DatagramReceiveDropPolicy.Drop(Oldest|Newest)First.
+       public bool DropOldestDatagramsFirst { get { throw null; } set { } }
    }

    public sealed class QuicConnection
    {
        // Returns true if datagram was received, false otherwise.
        // if buffer was too small, returns false and sets bytesReceived to the required size.
        // if no datagram was available, returns false and sets bytesReceived to 0.
        //
        // intentionally does not return ReadOnlyMemory<byte> instances to allow pooling internal memory.
        // Throws
        //   - ObjectDisposedException if QuicConnection was disposed
        //   - QuicException if QuicConnection was aborted
        //   - InvalidOperationException if DatagramReceiveEnabled was false
+       public bool TryReceiveDatagram(Span<byte> buffer, out int bytesReceived) { throw null; }

        // completes once there is an incoming datagram available.
        // suggestions for better name are welcome
        // Throws
        //   - ObjectDisposedException if QuicConnection was disposed
        //   - QuicException if QuicConnection was aborted
        //   - InvalidOperationException if DatagramReceiveEnabled was false
+       public ValueTask WaitForIncomingDatagramAsync(CancellationToken cancellationToken = default) { throw null; }
    }
}
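
To illustrate why the pull model needs the extra knobs, here is a minimal sketch of the kind of internal queue it implies. The `DatagramReceiveQueue` class is hypothetical and not part of the proposal; its constructor parameters play the roles of `DatagramReceiveQueueLength` and `DropOldestDatagramsFirst`, and its `TryReceiveDatagram` follows the contract described in the comments above. It is not thread-safe, which a real implementation would have to be.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a byte-bounded datagram queue with the two drop policies.
public sealed class DatagramReceiveQueue
{
    private readonly Queue<byte[]> _queue = new();
    private readonly int _maxBytes;          // role of DatagramReceiveQueueLength
    private readonly bool _dropOldestFirst;  // role of DropOldestDatagramsFirst
    private int _queuedBytes;

    public DatagramReceiveQueue(int maxBytes, bool dropOldestFirst)
    {
        _maxBytes = maxBytes;
        _dropOldestFirst = dropOldestFirst;
    }

    // Returns false if the incoming datagram was discarded.
    public bool Enqueue(ReadOnlySpan<byte> datagram)
    {
        if (datagram.Length > _maxBytes)
        {
            return false; // can never fit, regardless of policy
        }
        if (!_dropOldestFirst && _queuedBytes + datagram.Length > _maxBytes)
        {
            return false; // drop-newest policy: discard the incoming datagram
        }
        while (_queuedBytes + datagram.Length > _maxBytes)
        {
            _queuedBytes -= _queue.Dequeue().Length; // drop-oldest policy: make room
        }
        _queue.Enqueue(datagram.ToArray());
        _queuedBytes += datagram.Length;
        return true;
    }

    // Returns true and copies the oldest datagram if it fits into the buffer.
    // Returns false with bytesReceived == required size if the buffer is too small,
    // or false with bytesReceived == 0 if the queue is empty.
    public bool TryReceiveDatagram(Span<byte> buffer, out int bytesReceived)
    {
        if (!_queue.TryPeek(out var next))
        {
            bytesReceived = 0;
            return false;
        }
        if (next.Length > buffer.Length)
        {
            bytesReceived = next.Length; // the datagram stays queued
            return false;
        }
        _queue.Dequeue();
        _queuedBytes -= next.Length;
        next.AsSpan().CopyTo(buffer);
        bytesReceived = next.Length;
        return true;
    }
}
```

Every incoming datagram is copied into the queue, which is a cost the callback-based design avoids when the user can process the data in place.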

Requiring peer datagram support

From RFC: If datagram support is integral to the application, the application protocol can fail the handshake if the max_datagram_frame_size transport parameter is not present.

namespace System.Net.Quic
{
    public abstract class QuicConnectionOptions
    {
        // Gets or sets whether Datagram support is required from the peer.
        // If true and the peer does not advertise willingness to receive
        // Datagrams, the connection will get terminated during the handshake.
        //
        // Note that since the handshake has not completed, an application-level
        // error code is not required. The connection is closed with the
        // UserCanceled TLS alert.
        // TODO: verify the above paragraph
+       public bool RequireDatagramSend { get { throw null; } set { } }
    }
}

Alternative: Callback

An alternative approach is to inform the user once it is known for sure whether datagram support is enabled.

    public abstract class QuicConnectionOptions
    {
-       public bool RequireDatagramSend { get { throw null; } set { } }

        // If set, will be invoked once it is known whether the peer supports datagrams.
        // If the callback returns false, the handshake is aborted.
+       public Func<QuicConnection, bool, bool>? RemoteDatagramSendValidationCallback { get { throw null; } set { } }
    }

At the time of writing this proposal, the only reasonable values seem to be either null (default) for "don't care" or (c, enabled) => enabled for "required". The version with boolean property is more readable.

Open Questions

Priorities: The above API does not allow expressing priorities between datagrams. MsQuic supports two priorities: a datagram is queued either at the front or at the back of the send queue. An API for prioritization could be added in the future as another overload of the send method accepting a priority value.

Usage Examples

TODO
