@todoshcenko
Last active July 13, 2023 09:23
Chapter from "Linux Socket Programming by Example" by Warren Gay

Choosing a Socket Type

You already know that choosing a domain value for the socket(2) or socketpair(2) function chooses a protocol family to be used. For example, you know that

  • PF_LOCAL (which is the same as PF_UNIX) indicates that a local UNIX socket protocol family is being specified.
  • PF_INET indicates that the Internet family of protocols is used.

Consequently, you now have to learn about only two more input arguments.

The socket type argument in the socket(2) and socketpair(2) function calls indicates how a socket will interface with your program. But this is not the whole story, as this parameter also has implications for the protocol that is selected. (You'll understand this better as you progress through this chapter.)

The programmer typically chooses one of the following values for the socket type argument:

  • SOCK_STREAM *
  • SOCK_DGRAM *
  • SOCK_SEQPACKET
  • SOCK_RAW

The entries marked with an asterisk (*) are the two you'll normally use. The SOCK_SEQPACKET type is commonly used with non-Internet protocols such as X.25 or the amateur radio protocol AX.25. A few additional types could be listed here, but they are outside the scope of this text.
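As a minimal sketch of how the type argument is passed, the following uses Python's socket module, which wraps the same socket(2) call. The choice of Python and the variable names are assumptions made for illustration; they do not come from the original text. AF_INET is the address-family name that corresponds to PF_INET.

```python
import socket

# Create one socket of each commonly used type in the Internet family.
stream_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
dgram_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Record the type each socket reports before closing it.
stream_type = stream_sock.type
dgram_type = dgram_sock.type

stream_sock.close()
dgram_sock.close()

print(stream_type == socket.SOCK_STREAM)
print(dgram_type == socket.SOCK_DGRAM)
```

The protocol argument is left at its default of 0 here, so the kernel selects the default protocol for each family and type combination.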

NOTE

The SOCK_RAW macro specifies that the programmer wants a "raw" interface to the socket. This allows the programmer more direct control over the communications and its packets. However, it also requires an intimate knowledge of the protocol and its underlying packet structure. For this reason, the SOCK_RAW socket will not be studied in this book.

Understanding the SOCK_STREAM Socket Type

The SOCK_STREAM socket type is used when you want to perform stream I/O with a remote socket. A stream in the socket sense is the same concept that applies to a UNIX pipe. Bytes written to one end of the pipe (or socket) are received at the other end as one continuous stream of bytes. There are no dividing lines or boundaries. There is no record length, block size, or concept of a packet at the receiving end. Whatever data is currently available at the receiving end is returned in the caller's buffer.

An example to review might help illustrate the stream I/O concept. In this example, there is a local process on your host that has connected to a remote process on a remote host. The local host is going to send data to the remote host in two separate write(2) calls as follows:

  1. The local process writes 25 bytes of data to the socket, to be sent to the remote process. The Linux kernel might or might not choose to buffer this data. Buffering helps improve the performance of the kernel and the network facilities.
  2. Another 30 bytes are written by the local process to be sent to the remote process.
  3. The remote process executes a function designed to receive data from the socket. The receiving buffer in this example allows up to 256 bytes to be read. The remote process receives the 55 bytes that were written in steps 1 and 2.

Note what has happened. The local process has performed two separate writes to the socket. These could be two different messages or two different data structures. Yet, the remote process received all of the written data as one combined unit of 55 bytes.

Another way to look at this example is that the local process might have had to create one message in two partial writes. The receiving end received the message as one combined unit.

At other times, depending on timing and buffer availability, the remote process might first get the original piece of 25 bytes (or perhaps even less). Then, on a subsequent receive call, it obtains the remaining 30 bytes. In short, a stream socket does not preserve any message boundary. It simply returns the data it has to the receiving application.

The receiving end cannot tell what the original message boundaries were. In our example, it cannot tell that the first write(2) was for 25 bytes and the second was for 30. All it knows is the data bytes that it received and that they total 55 bytes.
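The two-write example can be sketched as follows. This is an illustration under stated assumptions rather than code from the original text: it uses Python's socket module and a local socketpair(2) pair instead of a remote host, and the names local, remote, first, and second are hypothetical. Because the number of bytes returned by each receive call depends on timing and buffering, the receiver loops until the sender closes its end.

```python
import socket

# socketpair() returns two stream sockets that are already connected.
local, remote = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

first = b"A" * 25     # first write: 25 bytes
second = b"B" * 30    # second write: 30 bytes
local.sendall(first)
local.sendall(second)
local.close()         # signals end of stream to the receiver

chunks = []
while True:
    chunk = remote.recv(256)   # accept up to 256 bytes per call
    if not chunk:              # an empty result means the peer closed
        break
    chunks.append(chunk)
remote.close()

received = b"".join(chunks)
print(len(received))                      # 55 bytes in total
print([len(chunk) for chunk in chunks])   # split depends on timing
```

Nothing in the received bytes marks where the first write ended. An application that needs message boundaries on a stream socket has to encode them itself, for example with a length prefix.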

A stream socket has one other important property. Like a UNIX pipe, the bytes written to a stream socket are guaranteed to arrive at the other end in exactly the same order in which they were written. With protocols such as IP, in which packets can take different routes to their destination, later packets can arrive ahead of earlier ones. The SOCK_STREAM socket ensures that your receiving application accepts data bytes in precisely the same sequence in which they were originally written.

Let's recap the properties of a SOCK_STREAM socket:

  • No message boundaries are preserved. The receiving end cannot determine how many write(2) calls were used to send the received data. Nor can it determine where the write(2) calls began or ended in the stream of bytes received.
  • The data bytes received are guaranteed to be in precisely the same order in which they were written.
  • All data written is guaranteed to be received by the remote end without error. If a failure occurs, an error is reported after all reasonable attempts at recovery have been made. Any recovery attempts are automatic and are not directed by your application program.

The last point presented is new to this discussion. A stream socket implies that every reasonable effort will be made to deliver data written to one socket to the socket at the other end. If this cannot be done, the error will be made known to the receiving end as well as the writing end. In this respect, a SOCK_STREAM socket is a reliable data transport. This feature makes it a very popular socket type.

There is one more property of the SOCK_STREAM type of socket. It is

  • The data is transported over a pair of connected sockets.

In order to guarantee delivery of data, and to enforce the ordering of the bytes in the stream, the underlying protocols use a connected pair of sockets. For the moment, simply know that the SOCK_STREAM type implies that a connection must be established before communications can proceed.
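The connection requirement can be observed with a short sketch. It is an illustration only, written with Python's socket module and local (PF_LOCAL) sockets as assumptions. The exact error reported for an unconnected send varies by protocol family, so the code catches the general OSError rather than a specific error code.

```python
import socket

# A stream socket that has never been connected to a peer.
unconnected = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
try:
    unconnected.send(b"hello")
    send_error = None
except OSError as exc:
    send_error = exc          # the kernel refuses: there is no peer
finally:
    unconnected.close()
print("unconnected send failed:", send_error is not None)

# socketpair() hands back two sockets that are already connected.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
a.sendall(b"hello")
a.close()                     # end of stream for the reader

# Read until the peer's close is seen, since one recv may return less.
reply = b"".join(iter(lambda: b.recv(16), b""))
b.close()
print(reply)
```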

Understanding the SOCK_DGRAM Socket Type

There are some situations in which it is not absolutely required that data must arrive at the remote end in sequence. Additionally, it might not even be required that the data delivery be reliable. The following lists the characteristics of a SOCK_DGRAM socket type:

  • Packets are delivered, possibly out of order, at the receiving end.
  • Packets might be lost. No attempt is made at recovering from this type of error. Nor is it necessarily known at the receiving end that a packet was lost.
  • Datagram packets have practical size limits. Exceeding these limits will make them undeliverable through certain routers and nodes.
  • Packets are sent to remote processes in an unconnected manner. This permits a program to address its message to a different remote process, with each message written to the same socket.

NOTE

Reliability is not a concern when noncritical logging information is transmitted. This information is transmitted on a "best efforts" basis. When a noncritical log packet is lost, it is considered an acceptable loss.

Unlike a connected stream socket, a datagram socket simply passes data as individual packets. Remember that for protocols such as IP, individual packets can be routed different ways. This can cause packets to arrive at the destination in a different sequence from the one in which they were sent. The socket type SOCK_DGRAM implies that receiving these messages out of order is acceptable to the application.

Sending a datagram packet is unreliable. If a packet is transmitted and not received correctly by an intervening router or the receiving host, then the packet is simply lost. No record of its existence is kept, and no attempt to recover from the transmission error is made.

Packets can also be lost if they are unsuitably large. A router in the path between the sending host and the receiving host will drop a packet if the packet is too large or if the router lacks the buffer space to pass it. Again, there is no error recovery implied in a SOCK_DGRAM socket when this happens.

The last characteristic that is of interest to you is the fact that the SOCK_DGRAM type does not imply a connection. Each time you send a message with your socket, it can be destined for another recipient. This property of the SOCK_DGRAM type makes it attractive and efficient.
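The unconnected nature of a datagram socket can be sketched as follows. This is a hedged illustration, not code from the original text: it uses Python's socket module and local (PF_LOCAL) datagram sockets bound to hypothetical path names in a temporary directory, so that one sending socket can address two different recipients. Local datagrams on a single Linux host are not subject to the loss and reordering described above, so the sketch shows only the addressing behavior.

```python
import os
import socket
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical socket path names, chosen for this example only.
    path_one = os.path.join(tmp, "one.sock")
    path_two = os.path.join(tmp, "two.sock")

    recv_one = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    recv_two = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    recv_one.bind(path_one)
    recv_two.bind(path_two)
    recv_one.settimeout(2.0)   # avoid blocking forever if nothing arrives
    recv_two.settimeout(2.0)

    # One sending socket, no connection: each message names its recipient.
    sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sender.sendto(b"for one", path_one)
    sender.sendto(b"for two", path_two)

    msg_one = recv_one.recv(256)
    msg_two = recv_two.recv(256)

    for sock in (sender, recv_one, recv_two):
        sock.close()

print(msg_one, msg_two)
```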

A connection-oriented protocol, on one hand, requires that a connection establishment procedure be carried out. This requires a certain number of packets to be sent and received in order to establish the connection. The SOCK_DGRAM type, on the other hand, is efficient because no connection is established.

Before choosing to use SOCK_DGRAM, however, you must carefully weigh the following:

  • Need for reliability
  • Need for sequenced data
  • Data size requirements

Understanding the SOCK_SEQPACKET Socket Type

Although the SOCK_SEQPACKET type will not be used in this book, you should at least become familiar with it. This socket type is important for the protocols that use it, such as X.25 and AX.25. It is very similar to SOCK_STREAM but has one subtle distinction: the SOCK_STREAM socket does not preserve message boundaries, whereas the SOCK_SEQPACKET socket does. When X.25 is used, for example, and SOCK_SEQPACKET is chosen, each packet is received in the same unit size in which it was originally written.

For example, imagine the sending end performing the following two writes:

  1. Write message one, of 25 bytes.
  2. Write message two, of 30 bytes.

Although the receiving process might indicate that it can accept up to 256 bytes in one read(2) call, the following receive events will occur:

  1. A message will be received with a length of 25 bytes. This corresponds to the length of the first message that was written by the sending process.
  2. A second message will be received with a length of 30 bytes. This corresponds to the length of the second write of the sending process.

Although the receiving buffer was able to hold the total combined message length of 55 bytes, only the first message of 25 bytes is received by the first read(2) call on the socket. This tells the application that this message was precisely 25 bytes in length. The next call to read(2) will fetch the next message of 30 bytes, regardless of whether there is more data that could be returned.
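The same two writes can be tried against a SOCK_SEQPACKET socket in a short sketch. It rests on an assumption that is not part of the original text: instead of X.25, it uses a local (PF_LOCAL) socket pair, which Linux also allows to be created with the SOCK_SEQPACKET type. Python's socket module and the variable names are likewise illustrative choices.

```python
import socket

# A connected pair of local sockets of type SOCK_SEQPACKET (Linux).
sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)

sender.send(b"A" * 25)   # message one: 25 bytes
sender.send(b"B" * 30)   # message two: 30 bytes

# Each receive call returns one whole message, even though the buffer
# size of 256 would be large enough to hold both.
first_msg = receiver.recv(256)
second_msg = receiver.recv(256)

sender.close()
receiver.close()

print(len(first_msg), len(second_msg))   # 25 30
```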

With this behavior, you can see that SOCK_SEQPACKET preserves the original message boundaries. The following provides a summary of characteristics for this socket type:

  • Message boundaries are preserved. This feature distinguishes the SOCK_SEQPACKET type from the SOCK_STREAM type.
  • The data bytes received are guaranteed to be in precisely the same order in which they were written.
  • All data written is guaranteed to be delivered to the remote end without error. If it cannot be delivered after reasonable attempts at automatic recovery, an error is reported to the sending and receiving processes.
  • The data is transported over a pair of connected sockets.

NOTE

Not all socket types can be used with all protocols. For example, SOCK_STREAM is supported for the PF_INET protocol family, but SOCK_SEQPACKET is not. Conversely for PF_X25, the socket type SOCK_SEQPACKET is supported, but SOCK_STREAM is not.
