
@saurabh500
Last active July 19, 2024 23:14
Streams in SQL Client

Purpose:

Starting this discussion thread to brainstorm the future of SqlClientX I/O to be performant, while adhering to the various requirements

Transport requirements

  1. Named Pipes
  2. Shared Memory (essentially a named pipe with a special NP path)
  3. TCP

TDS protocol basics

SQL Server clients and servers exchange information as messages. A message consists of a series of packets.

Messages are a logical concept: the last packet of a message carries a status field with the EOM (end of message) value, which marks it as the final packet of the message.

Message

```
Packet 1             Packet 2             Packet 3             ...  Last Packet
[x, 0x04, x,x,x,x,x] [x, 0x04, x,x,x,x,x] [x, 0x04, x,x,x,x,x] ...  [x, 0x01, x,x,x,x,x]
```

Outgoing Packet Header

| Field | Size | Example / notes |
| --- | --- | --- |
| Message type | 1 byte | Prelogin/Login etc. (1/2/3/4, ...) |
| Status | 1 byte | 0x01 signifies the last packet of the message (e.g. 0x01 / 0x04 / 0x02) |
| Length | 2 bytes | Length including the 8 bytes of header |
| Unused | 1 byte | ... |
| Unused | 1 byte | ... |
| Packet number | 1 byte | 1...Count |
| Unused (window) | 1 byte | ... |

The rest of the bytes in the packet follow the TDS protocol spec, and their structure is dependent on the message being transmitted.

Incoming packet header

The incoming packet header follows a similar structure, but some bytes that are unused in the outgoing packet become meaningful.

| Field | Size | Example / notes |
| --- | --- | --- |
| Message type | 1 byte | Prelogin/Login etc. (1/2/3/4, ...) |
| Status | 1 byte | 0x01 signifies the last packet of the message |
| Length | 2 bytes | Length including the 8 bytes of header |
| SPID | 2 bytes | SPID of the connection |
| Packet number | 1 byte | 1...Count |
| Window | 1 byte | Unused |
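The 8-byte header layout above can be sketched with a quick pack/parse round trip (Python here purely for illustration; the 0x12 Prelogin type code and big-endian Length/SPID follow the TDS spec):

```python
import struct

TDS_HEADER_LEN = 8

def build_header(msg_type: int, status: int, length: int,
                 spid: int, packet_num: int) -> bytes:
    """Type(1) Status(1) Length(2, big-endian) SPID(2, big-endian)
    PacketNum(1) Window(1)."""
    return struct.pack(">BBHHBB", msg_type, status, length, spid, packet_num, 0)

def parse_header(header: bytes) -> dict:
    msg_type, status, length, spid, packet_num, window = struct.unpack(
        ">BBHHBB", header)
    return {"type": msg_type, "status": status, "length": length,
            "spid": spid, "packet_num": packet_num, "window": window}

# A Prelogin (0x12) EOM packet carrying 20 payload bytes.
hdr = build_header(msg_type=0x12, status=0x01, length=TDS_HEADER_LEN + 20,
                   spid=0, packet_num=1)
parsed = parse_header(hdr)
```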

Packets are buffers with a size pre-negotiated with the server. The packet size is 4096 bytes until it is renegotiated at the end of login. The default packet size requested via the connection string is 8000 bytes.

After the login negotiation, all the packets on the connection, except the last packet of a message must be the negotiated size.

For simplicity of discussion we will use a packet size of 8K, but this is modifiable in the connection string. Details of the allowed values are at https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlconnection.packetsize?view=netframework-4.8.1

Writes

While sending data out over TDS, the client needs to fill the payload into the TDS packet, set the appropriate header bytes, and flush the packet to the network.

In case of writes, the message type, status, data length, and packet number need to be adjusted for every packet.
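As a sketch of that per-packet bookkeeping, the following splits a message payload into TDS packets, using the 0x04/0x01 status convention described above (an 8K negotiated packet size is assumed):

```python
import struct

PACKET_SIZE = 8192            # assumed negotiated packet size (8K)
HEADER_LEN = 8
MAX_PAYLOAD = PACKET_SIZE - HEADER_LEN

def packetize(msg_type: int, payload: bytes) -> list[bytes]:
    """Split a message payload into TDS packets: intermediate packets carry
    status 0x04 and are exactly PACKET_SIZE bytes; the last carries EOM (0x01)."""
    chunks = [payload[i:i + MAX_PAYLOAD]
              for i in range(0, len(payload), MAX_PAYLOAD)] or [b""]
    packets = []
    for num, chunk in enumerate(chunks, start=1):
        status = 0x01 if num == len(chunks) else 0x04
        header = struct.pack(">BBHHBB", msg_type, status,
                             HEADER_LEN + len(chunk), 0, num & 0xFF, 0)
        packets.append(header + chunk)
    return packets

pkts = packetize(0x01, b"x" * 20000)   # payload spans three packets
```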

Reads

In case of reads, the client needs to read the packet and parse out the header to understand how much data to expect. Though there is a status bit, the client relies primarily on the data length specified in the header to extract the payload and parse it according to the TDS protocol.

However, when reading packets, the client needs to assume that the complete TDS packet may not be readable in a single transport read, and needs to account for a partial packet being read.

  1. Partial packet header read: the number of bytes read is less than 8. The client needs at least the 8-byte packet header to understand how much more data to expect in the packet.

  2. Full header read, but partial payload read: in this case the client may have enough information to respond to some APIs, but for the most part it needs the rest of the packet.
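Both partial-read cases reduce to "keep reading until N bytes have arrived". A minimal sketch (the flaky transport below is simulated, and the helper names are illustrative):

```python
import io

def read_exact(transport_read, n: int) -> bytes:
    """Accumulate exactly n bytes from a transport whose read()
    may return fewer bytes than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = transport_read(n - len(buf))
        if not chunk:
            raise EOFError("connection closed mid-packet")
        buf += chunk
    return bytes(buf)

def read_packet(transport_read) -> tuple[bytes, bytes]:
    header = read_exact(transport_read, 8)               # case 1: partial header
    total_len = int.from_bytes(header[2:4], "big")       # length includes header
    payload = read_exact(transport_read, total_len - 8)  # case 2: partial payload
    return header, payload

# Simulated transport that never returns more than 3 bytes per read.
data = bytes([0x04, 0x01, 0x00, 0x0C, 0, 0, 1, 0]) + b"abcd"
src = io.BytesIO(data)
header, payload = read_packet(lambda n: src.read(min(n, 3)))
```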

Streams

TDS Write stream

In the current implementation of ClientX, writing TDS packets is modeled as a TdsWriteStream. This stream ingests the TDS packet payload and needs to know the type of message being sent. If the data being written spans TDS packets, the stream flushes the filled packet to the network. While writing data spanning multiple TDS packets, the stream calculates the packet number, the status and the data length, and sends out each intermediate packet with status 0x04.

When Flush / FlushAsync is called on the stream, it assumes that no more data needs to be sent in this message, sets the EOM status (0x01) on the packet, computes the packet number and the length, and flushes the buffer to the network.
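A minimal sketch of this write-stream behavior (a Python stand-in, not the actual TdsWriteStream API): payload bytes accumulate behind a reserved 8-byte header slot, full packets go out with status 0x04, and flush() stamps EOM.

```python
import struct

class TdsWriteStreamSketch:
    """Illustrative sketch only: buffers payload, auto-flushes full packets
    with status 0x04, and marks the final packet with EOM (0x01) on flush()."""

    HEADER_LEN = 8

    def __init__(self, transport, packet_size: int, msg_type: int):
        self._transport = transport              # anything with .write(bytes)
        self._packet_size = packet_size
        self._msg_type = msg_type
        self._buf = bytearray(self.HEADER_LEN)   # reserve the header slot
        self._packet_num = 0

    def write(self, data: bytes) -> None:
        # Byte-by-byte for clarity; a real implementation would copy slices.
        for b in data:
            if len(self._buf) == self._packet_size:
                self._send(status=0x04)          # packet full: flush mid-message
            self._buf.append(b)

    def flush(self) -> None:
        self._send(status=0x01)                  # EOM: last packet of the message

    def _send(self, status: int) -> None:
        self._packet_num = (self._packet_num + 1) & 0xFF
        struct.pack_into(">BBHHBB", self._buf, 0, self._msg_type, status,
                         len(self._buf), 0, self._packet_num, 0)
        self._transport.write(bytes(self._buf))
        del self._buf[self.HEADER_LEN:]          # keep header slot, drop payload
```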

TDS Read Stream

In the current implementation of ClientX, consumers of network data call into the stream to get bytes. The consumers follow the TDS protocol and are expected to request the right data; the stream takes care of accounting for split packets. If the requested data is available in its buffer, the stream returns it; otherwise it reads the requested number of bytes from the underlying transport and returns them to the consumer.

The consumer is expected to read the data and use it.
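A minimal sketch of the read side (illustrative names, not ClientX's API): read(n) serves bytes out of the current packet's payload and pulls the next packet from the transport as needed, so the consumer never sees packet boundaries.

```python
import io

class TdsReadStreamSketch:
    """Illustrative sketch: hides packet boundaries behind read(n)."""

    def __init__(self, transport_read):
        self._read = transport_read   # callable(n) -> bytes, may return short
        self._payload = b""
        self._pos = 0

    def read(self, n: int) -> bytes:
        out = bytearray()
        while len(out) < n:
            if self._pos == len(self._payload):
                self._next_packet()   # transparently cross a packet boundary
            take = min(n - len(out), len(self._payload) - self._pos)
            out += self._payload[self._pos:self._pos + take]
            self._pos += take
        return bytes(out)

    def _next_packet(self) -> None:
        header = self._read_exact(8)
        total = int.from_bytes(header[2:4], "big")   # length includes header
        self._payload = self._read_exact(total - 8)
        self._pos = 0

    def _read_exact(self, n: int) -> bytes:
        buf = bytearray()
        while len(buf) < n:
            chunk = self._read(n - len(buf))
            if not chunk:
                raise EOFError("connection closed mid-packet")
            buf += chunk
        return bytes(buf)

def _pkt(payload: bytes, status: int) -> bytes:
    return (bytes([0x04, status]) + (8 + len(payload)).to_bytes(2, "big")
            + bytes(4) + payload)

stream = TdsReadStreamSketch(io.BytesIO(_pkt(b"abcd", 0x04) + _pkt(b"efgh", 0x01)).read)
spanning = stream.read(6)   # crosses the packet boundary transparently
```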

Motivation

The idea of using streams was to make the consumers unaware of the nuances of the TDS packet header (for writes) and to free them from worrying about split packets while reading from the network (for reads).

Streams also offer a nice layering mechanism: the TDS stream can be layered over an SSL stream, which can be layered over the SSL-over-TDS stream, and then the actual transport stream. This separates the responsibilities of each stream. There is also an intention to add a MARS stream layer under the TDS stream, backed by a muxer and demuxer that has a 1:1 relationship with the SSL/transport stream.

Each TDS stream would get its own MARS stream, which will write / read to/from muxer-demuxer which will understand the nuances of MARS over a single physical transport stream.

This was to further separate the concerns and have a layered approach in adding features to ClientX.
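A toy illustration of the layering idea (class names and the fake `<enc>` framing are purely illustrative, not ClientX's API): each layer wraps the one below it and transforms data on the way down.

```python
class TransportLayer:
    """Stand-in for TCP / named pipe / shared memory."""
    def __init__(self):
        self.sent = []
    def write(self, data: bytes) -> None:
        self.sent.append(data)

class SslLayer:
    """Stand-in for TLS: wraps the layer below and 'encrypts' on the way down."""
    def __init__(self, inner):
        self._inner = inner
    def write(self, data: bytes) -> None:
        self._inner.write(b"<enc>" + data + b"</enc>")   # fake encryption

class TdsLayer:
    """TDS framing on top of whatever sits below it."""
    def __init__(self, inner):
        self._inner = inner
    def write(self, payload: bytes) -> None:
        header = (bytes([0x01, 0x01]) + (8 + len(payload)).to_bytes(2, "big")
                  + bytes(4))
        self._inner.write(header + payload)

transport = TransportLayer()
stack = TdsLayer(SslLayer(transport))   # TDS over SSL over transport
stack.write(b"hello")
```

A MARS layer would slot between TdsLayer and SslLayer in the same fashion, with one muxer/demuxer shared by all session streams.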


Deviation from stream

TdsStream has new methods in addition to what System.IO.Stream offers, such as peek byte and read byte async, since each of these methods could result in an I/O operation.


Fig 1: Current implementation in ClientX


Fig 2: Potential improvement

Problem

  1. While the streams are successful at abstracting away the complexities of the TDS packet header, they rely on the consumers to bring their own buffer to copy the data into before consuming it. Alternatively, the writers need a smart way of going from CLR types to byte[], which may need another layer of buffer management. E.g. consider writing an int value to the stream: it must be converted to a byte[] before it can be handed down to the stream for writing, which means some kind of buffer management needs to be exposed above the streams. Similarly, writing a string to the TDS stream requires converting the string to a byte[] first. This can be done with chunked buffers, or it could be made better by encoding the string directly into the available buffer in the stream itself, flushing it, and repeating with the rest of the string until the whole string is flushed out, without needing an intermediate buffer.

It would benefit ClientX to expose the buffers used in the write/read streams directly, so that going from SQL types or CLR types to bytes does not require an additional layer of buffer management.
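A sketch of what exposing the buffer could look like (a hypothetical API, not ClientX's): values are serialized directly into the stream's packet buffer, and a string is encoded chunk-by-chunk into whatever space remains, flushing as needed, with no intermediate byte[].

```python
import struct

class ExposedBufferWriter:
    """Illustrative sketch: callers serialize straight into the packet buffer.
    Assumes each single write fits within the buffer capacity."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)
        self._pos = 0
        self.flushed = []            # stand-in for the network

    def ensure(self, n: int) -> None:
        if self._pos + n > len(self._buf):
            self.flush()             # make room by flushing the packet

    def flush(self) -> None:
        self.flushed.append(bytes(self._buf[:self._pos]))
        self._pos = 0

    def write_int32(self, value: int) -> None:
        self.ensure(4)
        struct.pack_into("<i", self._buf, self._pos, value)   # no temp byte[]
        self._pos += 4

    def write_string(self, s: str) -> None:
        # Encode character-by-character into the remaining buffer space
        # instead of materializing the whole encoded string first.
        for ch in s:
            encoded = ch.encode("utf-16-le")   # TDS strings are UTF-16
            self.ensure(len(encoded))
            self._buf[self._pos:self._pos + len(encoded)] = encoded
            self._pos += len(encoded)

w = ExposedBufferWriter(capacity=8)
w.write_int32(42)
w.write_string("abc")   # overflows the 8-byte buffer and triggers a flush
w.flush()
```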

  2. Async: While doing any read/write operation, due to the nature of the protocol, the readers/writers may have to do network I/O calls, based on the space available in the buffer while writing and the data available in the buffer while reading. We started ClientX with a "written for async but usable for sync" philosophy. While streams lend themselves well to this philosophy, async/await almost always causes a state machine to be created and executed. The alternative is to use ValueTask and check for its completion, or set up continuations. Streams also add one more level of depth to the call stack, likely causing another state machine to be generated for every async operation, even when the data may just need to be copied into the stream's buffer.

Potential solution

  1. Expose the buffers of the streams directly. Allow the writers/readers of CLR types to manipulate/read the buffer when it has space/data available. When the buffer is full, use the stream to flush it. Otherwise, filling the buffer, or reading from an already filled buffer, always completes synchronously, without unnecessary state machines.

  2. For the "async first but usable for sync" philosophy, the readers and writers will almost always have to expose a return type of ValueTask / Task. However, the readers/writers and their consumers need not always use async/await, and can instead intelligently manage the ValueTask / Task returned from network reads/writes.
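The ValueTask-style pattern can be sketched in Python terms (illustrative only): the hot path returns the value synchronously when data is already buffered, and only allocates an awaitable when real I/O is required.

```python
import asyncio

class BufferedReader:
    """Illustrative sketch of a sync fast path over an async slow path."""

    def __init__(self, fill_async):
        self._fill_async = fill_async   # async callable() -> bytes
        self._buf = b""
        self._pos = 0

    def read_byte(self):
        """Returns an int when data is buffered (no awaitable, no state
        machine); otherwise returns a coroutine the caller must await."""
        if self._pos < len(self._buf):
            b = self._buf[self._pos]
            self._pos += 1
            return b
        return self._read_byte_slow()

    async def _read_byte_slow(self):
        # Assumes fill_async() returns at least one byte; a real
        # implementation would handle an empty read.
        self._buf = await self._fill_async()
        self._pos = 1
        return self._buf[0]
```

The caller checks whether the result is already a value before awaiting, mirroring `ValueTask.IsCompleted` in .NET.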

Pipelines

Pipelines are, in principle, a great fit for SqlClient: they let consumers use the buffers directly, which is an important takeaway from the above. However, we have the following hurdles with pipelines, for which solutions are needed:

  1. Sync support. Pipelines are async
  2. Support on NetFx. SqlClient needs to cater to the customers on .NET Framework as well.
  3. Named Pipe support. Do pipelines lend themselves well to non-TCP transport?

Protocols to be supported

  1. TCP
  2. Named Pipes
  3. Shared Memory

Named pipes are supported on Windows. Shared memory is a named pipe with a special pipe name. [Fill in the pipename here]

TDS Protocol

The TDS protocol allows exchange of data in TDS messages, where each message has a type.

Each message consists of packets. Cardinality: Message [1] ... TDS packets [n]

The TDS packet size is negotiated between the server and client during connection open (specifically during login).

The negotiated packet size is then used by the client to send information to the server. The client strictly adheres to the packet size, unless it is sending out the last packet in the message.

Whether or not the packet is the last packet of a message is indicated by the status field of the packet.

MARS (multiple active result sets)

In case of MARS, each connection can be associated with multiple commands. A connection can be used by multiple readers to read data, without having to finish the operation of an already open reader.

MARS requires control packets to be sent to the server when the connection is opened. This makes sure that

The information from each TDS writer is multiplexed to be written to the network, and the TDS reader requires demultiplexing of the packets on a connection. A TDS writer maps to a SqlCommand and its related DataReader.

MARS is interesting, because it has flow control built into it. The data can be written to the connection, only if the server has acknowledged that it is ready to receive more from the client.

Each TDS writer and reader is tied to a session ID, which is created on top of MARS. MARS requires that each packet be encapsulated in the MARS header and sent out to the server.
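As a sketch of that encapsulation (the 16-byte little-endian SMP header layout below — SMID 0x53, flags, session id, total length, sequence number, window — is my reading of the MC-SMP spec and should be verified against it):

```python
import struct

# Assumed MC-SMP header layout; verify against the spec before relying on it.
SMP_HEADER_LEN = 16
SMID = 0x53          # SMP packet identifier byte
FLAG_DATA = 0x08     # DATA flag (SYN/ACK/FIN carry other flag values)

def smp_encapsulate(session_id: int, seq_num: int, window: int,
                    tds_packet: bytes) -> bytes:
    """Wrap one TDS packet in a MARS (SMP) header for a given session."""
    length = SMP_HEADER_LEN + len(tds_packet)   # length includes the SMP header
    header = struct.pack("<BBHIII", SMID, FLAG_DATA, session_id,
                         length, seq_num, window)
    return header + tds_packet

wrapped = smp_encapsulate(session_id=5, seq_num=1, window=4,
                          tds_packet=b"\x01" * 10)
```

The muxer would apply this per session before handing the bytes to the shared transport, and the demuxer would strip it and route by session id on the way back.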
