@minhbq-99
Last active August 15, 2021 14:13
GSOC 2021

Google Summer of Code 2021

Add support for checkpoint/restore UDP socket's queue

Introduction

In this project, I implemented checkpoint/restore support for a UDP socket's send queue and receive queue in CRIU. I also wrote a small Linux kernel patch series for dumping packets from a UDP socket's send queue.

Link to the pull request: checkpoint-restore/criu#1571
Link to my kernel patch series: [PATCH 0/2] [PATCH 1/2] [PATCH 2/2]

Implementation

  • To store the dumped packets, a new protobuf image was added. The image starts with a UdpQueueEntry that holds the queues' information. Each packet is then stored as a UdpPacketEntry, which holds the address information and the packet's length, followed by the packet's data.
  • Checkpoint/restore send queue: in the kernel patch, I implemented a new UDP_REPAIR socket option that turns the socket's repair mode on or off. While repair mode is on, recvmsg returns a packet from the send queue together with its destination address. However, this patch cannot dump other per-packet information, such as TTL or TOS, that can be set through the cmsg provided when the packet was sent. I also created a kernel selftest for this feature. With this patch, we can checkpoint the data and destination address of each packet in the send queue. When restoring, we resend each packet through the checkpointed socket.
  • Checkpoint/restore receive queue: when checkpointing, the packets in the receive queue are peeked and stored in the protobuf image. When restoring, a temporary local socket is created to send each packet to the restored socket. To make the source IP address of each packet match the dumped one, I used IP_PKTINFO (IPV6_PKTINFO for AF_INET6) together with IP_TRANSPARENT, which stops the kernel from checking whether the source address belongs to a local interface. The temporary socket is bound to the same port as the original packet's source port. To reduce port collisions, I use IP_FREEBIND to bind the socket to a non-local (fake) address. However, if the port is already in use on all interfaces, the bind fails and the packet is dropped.
  • Two ZDTM tests were added to cover this feature for AF_INET and AF_INET6 UDP sockets.
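
The receive-queue restore path described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the CRIU code: the image layout is a hypothetical `struct`-based stand-in for the protobuf image (the real UdpQueueEntry and UdpPacketEntry fields are not reproduced here), only IPv4 is handled, and the source address is assumed to be local so that a plain `bind()` is enough. The real implementation needs IP_PKTINFO, IP_TRANSPARENT and IP_FREEBIND to handle non-local source addresses. The names `write_image`, `read_image` and `restore_recv_queue` are made up for this sketch.

```python
import socket
import struct

# Hypothetical stand-in for the protobuf image: a packet count, then for each
# packet a header (IPv4 source address, source port, payload length) followed
# by the payload bytes. The real image uses UdpQueueEntry / UdpPacketEntry.
QUEUE_HDR = struct.Struct("!I")
PACKET_HDR = struct.Struct("!4sHI")


def write_image(packets):
    """Serialize [((ip, port), payload), ...] into one bytes object."""
    out = [QUEUE_HDR.pack(len(packets))]
    for (ip, port), payload in packets:
        out.append(PACKET_HDR.pack(socket.inet_aton(ip), port, len(payload)))
        out.append(payload)
    return b"".join(out)


def read_image(image):
    """Inverse of write_image."""
    (count,) = QUEUE_HDR.unpack_from(image, 0)
    pos = QUEUE_HDR.size
    packets = []
    for _ in range(count):
        raw_ip, port, length = PACKET_HDR.unpack_from(image, pos)
        pos += PACKET_HDR.size
        packets.append(((socket.inet_ntoa(raw_ip), port), image[pos:pos + length]))
        pos += length
    return packets


def restore_recv_queue(target, image):
    """Refill target's receive queue using one temporary socket per packet.

    Each temporary socket is bound to the dumped source address and port, so
    the target sees the original sender. A packet whose source port cannot be
    bound is dropped. Returns the number of dropped packets.
    """
    dropped = 0
    dest = target.getsockname()
    for src, payload in read_image(image):
        tmp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            tmp.bind(src)
            tmp.sendto(payload, dest)
        except OSError:
            dropped += 1
        finally:
            tmp.close()
    return dropped
```

After `restore_recv_queue` returns, a `recvfrom` on the target socket should report each payload with the source address stored in the image, in the order the packets were written.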

Future work

  • The current patch has not been accepted by the Linux kernel maintainers because they do not want an extra check in UDP's recvmsg fast path. I plan to solve this in a future patch by adding a new option to UDP's getsockopt instead.

Some interesting things

  • The default peek offset of a socket is -1. It behaves like 0, except that it does not advance after a recvmsg with MSG_PEEK, so the next MSG_PEEK recvmsg returns exactly the same packet.
  • If you send a packet to address A on a corked UDP socket or with MSG_MORE, and then send another packet to address B, the second packet's data is concatenated with the first packet and the result is sent to address A.
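
Both observations can be reproduced on loopback with a short script. This is a minimal sketch that assumes a reasonably recent Linux kernel with SO_PEEK_OFF support for UDP sockets. The helper names are made up for the example, and the numeric fallbacks for `SO_PEEK_OFF` (42) and `MSG_MORE` (0x8000) are the values from Linux's generic headers, used only in case Python's `socket` module does not export them. They may differ on some architectures.

```python
import socket

# Fallback values are assumptions taken from Linux's generic headers.
SO_PEEK_OFF = getattr(socket, "SO_PEEK_OFF", 42)
MSG_MORE = getattr(socket, "MSG_MORE", 0x8000)


def bound_udp_socket():
    """Create a UDP socket bound to a free loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2)
    return sock


def peek_twice(enable_peek_off):
    """Queue two datagrams and peek twice, optionally with peek offset 0."""
    rx, tx = bound_udp_socket(), bound_udp_socket()
    try:
        if enable_peek_off:
            # Switch from the default (-1) to an offset that advances on peek.
            rx.setsockopt(socket.SOL_SOCKET, SO_PEEK_OFF, 0)
        tx.sendto(b"one", rx.getsockname())
        tx.sendto(b"two", rx.getsockname())
        first = rx.recv(65535, socket.MSG_PEEK)
        second = rx.recv(65535, socket.MSG_PEEK)
        return first, second
    finally:
        rx.close()
        tx.close()


def cork_two_destinations():
    """Send to A with MSG_MORE, then to B, and report what each one receives."""
    a, b, tx = bound_udp_socket(), bound_udp_socket(), bound_udp_socket()
    try:
        tx.sendto(b"hello ", MSG_MORE, a.getsockname())
        tx.sendto(b"world", b.getsockname())  # no MSG_MORE: flushes the cork
        at_a = a.recv(65535)
        b.settimeout(0.2)
        try:
            at_b = b.recv(65535)
        except socket.timeout:
            at_b = None  # nothing arrived at B
        return at_a, at_b
    finally:
        a.close()
        b.close()
        tx.close()
```

With the default offset, `peek_twice(False)` should return the first datagram twice, while `peek_twice(True)` should walk through the queue. `cork_two_destinations()` should show the combined payload arriving at A and nothing at B.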

Conclusion

I want to thank my mentors Alexander Mikhalitsyn and Andrei Vagin for being very supportive throughout the project. This project helped me learn a lot about Linux programming (kernel and userspace) and computer networking. I would love to participate in more feature development for CRIU in the future.
