@mcastelino
Last active December 11, 2023 02:16
Using tc redirect to connect a virtual machine to a container network

Connecting a veth device to tap

  • veth device from CNI/CNM plugin: eth0
  • tap device that connects to the VM: tap0

Redirecting traffic between the two devices

tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol all u32 match u8 0 0 action mirred egress redirect dev tap0
tc qdisc add dev tap0 ingress
tc filter add dev tap0 parent ffff: protocol all u32 match u8 0 0 action mirred egress redirect dev eth0

tc qdisc add dev eth0 ingress

  • Add a queuing discipline
  • on dev eth0
  • attach the ingress qdisc; here the handle defaults to ffff:
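To make the ffff: handle less opaque, here is a small sketch of how tc handles are packed into 32 bits as major:minor. The helper names makeHandle and formatHandle are illustrative only; the constant value is TC_H_INGRESS from the kernel's pkt_sched.h, which is the special parent used when the ingress qdisc is added.

```go
package main

import "fmt"

// tcHIngress is the value of TC_H_INGRESS in the kernel's pkt_sched.h,
// the special parent used when adding the ingress qdisc.
const tcHIngress uint32 = 0xFFFFFFF1

// makeHandle packs a major:minor pair into a 32-bit tc handle:
// the major number occupies the upper 16 bits, the minor the lower 16.
func makeHandle(major, minor uint16) uint32 {
	return uint32(major)<<16 | uint32(minor)
}

// formatHandle renders a handle the way it is usually written,
// omitting the minor part when it is zero (for example "ffff:").
func formatHandle(h uint32) string {
	if h&0xffff == 0 {
		return fmt.Sprintf("%x:", h>>16)
	}
	return fmt.Sprintf("%x:%x", h>>16, h&0xffff)
}

func main() {
	h := makeHandle(0xffff, 0)
	fmt.Printf("ingress qdisc handle: %#x (%s)\n", h, formatHandle(h))
	fmt.Printf("ingress parent:       %#x (%s)\n", tcHIngress, formatHandle(tcHIngress))
}
```

The filter commands refer to the qdisc by its handle (ffff:), while the qdisc itself is attached at the special ingress parent (ffff:fff1).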

tc filter add dev eth0 parent ffff: protocol all u32 match u8 0 0 action mirred egress redirect dev tap0

  • Add a filter
  • to device eth0
  • parent ffff:, the handle of the ingress qdisc created above (there is no need for tc class add in the ingress case, as it does not support classful queuing disciplines)
  • protocol all
  • classifier u32
  • parameters to the classifier, match u8 0 0: AND the first byte of the packet with the mask 0 and compare the result with the value 0; the result is always 0, so the match is always true
  • action mirred egress redirect dev tap0: redirect the packet to the egress of dev tap0
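The "always true" match can be checked with a few lines of Go. This is a simplified model of a u32 `match u8 VALUE MASK` selector, assuming it compares a single byte; the real classifier operates on 32-bit words inside the kernel, and the function name matchU8 is made up for this illustration.

```go
package main

import "fmt"

// matchU8 models "match u8 VALUE MASK": the packet byte is ANDed with the
// mask and the result is compared with the value.
func matchU8(b, value, mask byte) bool {
	return b&mask == value
}

func main() {
	// With value 0 and mask 0, every possible byte matches.
	all := true
	for i := 0; i < 256; i++ {
		if !matchU8(byte(i), 0, 0) {
			all = false
		}
	}
	fmt.Println("match u8 0 0 matches every byte:", all)

	// For contrast, a selective match on the upper nibble.
	fmt.Println("0x45 matches 0x40/0xf0:", matchU8(0x45, 0x40, 0xf0))
	fmt.Println("0x60 matches 0x40/0xf0:", matchU8(0x60, 0x40, 0xf0))
}
```

Because the mask clears every bit before the comparison, the filter acts as a catch-all and every frame arriving on the device reaches the mirred action.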
@mcastelino
Author

mcastelino commented Oct 16, 2018

https://www.tldp.org/HOWTO/html_single/Traffic-Control-HOWTO/

A source of terminology confusion is the usage of the terms root qdisc and ingress qdisc. These are not really queuing disciplines, but rather locations onto which traffic control structures can be attached for egress (outbound traffic) and ingress (inbound traffic).

Each interface contains both. The primary and more common is the egress qdisc, known as the root qdisc. It can contain any of the queuing disciplines (qdiscs) with potential classes and class structures. The overwhelming majority of documentation applies to the root qdisc and its children. Traffic transmitted on an interface traverses the egress or root qdisc.

For traffic accepted on an interface, the ingress qdisc is traversed. With its limited utility, it allows no child class to be created, and only exists as an object onto which a filter can be attached. For practical purposes, the ingress qdisc is merely a convenient object onto which to attach a policer to limit the amount of traffic accepted on a network interface.

@amshinde

amshinde commented Oct 16, 2018

Go program that implements the above logic:

package main

import (
	"fmt"
	"os"
	"strconv"

	"github.com/vishvananda/netlink"
	"golang.org/x/sys/unix"
)

func main() {
	args := os.Args[1:]

	if len(args) != 2 {
		fmt.Println("Incorrect number of args")
		os.Exit(1)
	}

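	// Both arguments are numeric interface indices (as shown by `ip link`), not names.
	index1, err := strconv.Atoi(args[0])
	if err != nil {
		fmt.Printf("Invalid interface index %q : %s\n", args[0], err)
		os.Exit(1)
	}
	index2, err := strconv.Atoi(args[1])
	if err != nil {
		fmt.Printf("Invalid interface index %q : %s\n", args[1], err)
		os.Exit(1)
	}

	fmt.Printf("network index1 : %d\n", index1)
	fmt.Printf("network index2 : %d\n", index2)

	// Equivalent of `tc qdisc add dev <index1> ingress`.
	qdisc1 := &netlink.Ingress{
		QdiscAttrs: netlink.QdiscAttrs{
			LinkIndex: index1,
			Parent:    netlink.HANDLE_INGRESS,
		},
	}

	err = netlink.QdiscAdd(qdisc1)
	if err != nil {
		fmt.Printf("Failed to add qdisc for index %d : %s\n", index1, err)
		os.Exit(1)
	}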

	qdisc2 := &netlink.Ingress{
		QdiscAttrs: netlink.QdiscAttrs{
			LinkIndex: index2,
			Parent:    netlink.HANDLE_INGRESS,
		},
	}

	err = netlink.QdiscAdd(qdisc2)
	if err != nil {
		fmt.Printf("Failed to add qdisc for index %d : %s\n", index2, err)
		os.Exit(1)
	}

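	// Filter on index1's ingress hook (parent ffff:) that redirects frames to
	// the egress of index2. No selector keys are set, so it is meant to match everything.
	filter1 := &netlink.U32{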
		FilterAttrs: netlink.FilterAttrs{
			LinkIndex: index1,
			Parent:    netlink.MakeHandle(0xffff, 0),
			Protocol:  unix.ETH_P_ALL,
		},
		Actions: []netlink.Action{
			&netlink.MirredAction{
				ActionAttrs: netlink.ActionAttrs{
					Action: netlink.TC_ACT_STOLEN,
				},
				MirredAction: netlink.TCA_EGRESS_REDIR,
				Ifindex:      index2,
			},
		},
	}

	if err := netlink.FilterAdd(filter1); err != nil {
		fmt.Printf("Failed to add filter for index %d : %s\n", index1, err)
		os.Exit(1)
	}

	filter2 := &netlink.U32{
		FilterAttrs: netlink.FilterAttrs{
			LinkIndex: index2,
			Parent:    netlink.MakeHandle(0xffff, 0),
			Protocol:  unix.ETH_P_ALL,
		},
		Actions: []netlink.Action{
			&netlink.MirredAction{
				ActionAttrs: netlink.ActionAttrs{
					Action: netlink.TC_ACT_STOLEN,
				},
				MirredAction: netlink.TCA_EGRESS_REDIR,
				Ifindex:      index1,
			},
		},
	}

	if err := netlink.FilterAdd(filter2); err != nil {
		fmt.Printf("Failed to add filter for index %d : %s\n", index2, err)
		os.Exit(1)
	}
}
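
The program above takes numeric interface indices rather than names. As a hedged sketch, the standard library's net.InterfaceByName can resolve a name such as eth0 or tap0 to its index before calling it. The helper name linkIndex is hypothetical and is not part of the program above.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// linkIndex (hypothetical helper) resolves an interface name to the numeric
// index expected by the program above.
func linkIndex(name string) (int, error) {
	iface, err := net.InterfaceByName(name)
	if err != nil {
		return 0, fmt.Errorf("lookup %q: %w", name, err)
	}
	return iface.Index, nil
}

func main() {
	// Print "name index" for every interface name given on the command line.
	for _, name := range os.Args[1:] {
		idx, err := linkIndex(name)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("%s %d\n", name, idx)
	}
}
```

The lookup happens in the network namespace of the calling process, so it would need to run in the same namespace as the interfaces being connected.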

@hellt

hellt commented Feb 20, 2021

Hi, thank you for that nice trick.
Do you know if this tc-based redirect will allow transparent forwarding of layer 2 frames that are filtered by a Linux bridge by default (like STP BPDUs and LACP frames)?

@mcastelino
Author

mcastelino commented Feb 22, 2021

@hellt yes. All traffic should pass through. We use this in Kata Containers with all types of CNI interfaces without issues. The performance drop is negligible.

@hellt

hellt commented Feb 27, 2021

Thanks @mcastelino
That helped a lot in my similar case, where I needed to connect QEMU tap interfaces to container interfaces.

@xzhao025

Thanks for sharing this! I was struggling with LACP over a bridge; this helps!

@tamalsaha

tamalsaha commented Sep 6, 2022

There is a CNI plugin for this, https://github.com/awslabs/tc-redirect-tap, used with Firecracker.

@mari0d

mari0d commented Oct 7, 2022

@tamalsaha

lol indeed.
