The eBPF (Extended Berkeley Packet Filter) language is a low-level assembly-like language that is specifically designed for writing programs that can be loaded into the Linux kernel. These programs are typically used for networking, security, and observability tasks.
eBPF has its own domain-specific language (DSL). Its key characteristics are:
- Low-Level: The language is closer to assembly than to high-level languages like C.
- Limited Instructions: eBPF has a limited set of instructions to ensure that programs are safe to run in the kernel space. This includes a lack of certain types of loops to prevent infinite loops in the kernel.
- Type-Safe: eBPF is designed to be type-safe to prevent common programming errors that could crash or compromise the system.
- JIT Compilation: eBPF programs are Just-In-Time (JIT) compiled into native machine code for performance.
- Safety Checks: Before being loaded into the kernel, eBPF programs are verified for safety to ensure they don't perform illegal operations.
Note: Because eBPF programs run in a restricted environment with a limited set of instructions and are verified for safety before being loaded into the kernel, they offer a way to extend kernel functionality without compromising system stability or security.
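To make the loop restriction more concrete, here is a minimal sketch of the style the verifier is designed to accept: a loop with a compile-time upper bound and an explicit bounds check before every memory access. It is written as plain userspace C so it can be compiled and run anywhere. The function name count_leading_zeros_bounded and the MAX_SCAN limit are hypothetical, chosen only for illustration, and this is not a loadable eBPF program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time bound: the loop can never run more than
 * MAX_SCAN iterations, so termination is obvious from the code itself. */
#define MAX_SCAN 64

/* Counts leading zero bytes in [data, data_end), inspecting at most
 * MAX_SCAN bytes. */
int count_leading_zeros_bounded(const unsigned char *data,
                                const unsigned char *data_end)
{
    int count = 0;

    assert(data <= data_end);

    for (int i = 0; i < MAX_SCAN; i++) {
        /* Bounds check before the access: never read past data_end. */
        if (data + i >= data_end)
            break;
        if (data[i] != 0)
            break;
        count++;
    }
    return count;
}
```

An unbounded `while (1)` scan over the same buffer would express the same idea, but nothing in it proves that the loop ends or that every read stays inside the buffer, which is the kind of code the safety checks exist to reject.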
Although one can write eBPF programs directly in its assembly-like DSL, it's more common and preferred to write them in a restricted subset of C, which is then compiled into eBPF bytecode using a specialized compiler (like LLVM with eBPF support).
Following is a simple example in C that could be compiled to eBPF bytecode:
DO NOT run this example on a remote machine.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h> // defines the SEC() macro (shipped with libbpf)

SEC("xdp")
int dangerous_hello_world(struct xdp_md *ctx) {
    return XDP_DROP; // Drop all packets
}
Code breakdown:
This program uses the XDP (eXpress Data Path) hook to drop all incoming packets. The program would be compiled to eBPF bytecode using a compiler with eBPF support.
Note: XDP is a high-performance, programmable network data path in the Linux kernel.
- #include <linux/bpf.h>: This line includes the kernel's eBPF header, which provides the necessary data structures and constants, such as struct xdp_md and XDP_DROP.
- #include <bpf/bpf_helpers.h>: This line includes the libbpf helper header, which defines the SEC macro.
- SEC("xdp"): This is an eBPF-specific macro that specifies the section name where this eBPF program will be placed. The section name is used when loading the program into the kernel; libbpf-based loaders recognize "xdp" as the section name for XDP programs.
- int dangerous_hello_world(struct xdp_md *ctx): This is the main function of the eBPF program. It takes a single argument, ctx, which is a pointer to the context describing the packet data and metadata.
- return XDP_DROP;: This line specifies the action to be taken on the packet. In this case, XDP_DROP means that the packet will be dropped, i.e., it won't be forwarded to its destination.
Here, "packets" refers to network packets. When a packet arrives at a network interface (like an Ethernet port), the XDP framework can process it. This eBPF program is designed to drop all incoming packets, meaning they will not be processed further or forwarded to their intended destination.
In summary, this is a very basic eBPF program that drops all incoming network packets when loaded into the Linux kernel with the XDP framework.
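To show how the context argument and the return value work together, the following sketch makes the decision depend on the packet contents instead of dropping everything. It is a userspace mock, not a loadable XDP program: struct mock_xdp_ctx, the MOCK_XDP_* constants, and drop_arp_only are hypothetical names introduced here for illustration. The numeric verdict values are assumed to mirror enum xdp_action in <linux/bpf.h>, and the real struct xdp_md stores data and data_end as 32-bit fields rather than pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Verdicts, assumed to mirror enum xdp_action in <linux/bpf.h>. */
enum mock_xdp_action {
    MOCK_XDP_ABORTED = 0,
    MOCK_XDP_DROP    = 1,
    MOCK_XDP_PASS    = 2
};

/* Hypothetical userspace stand-in for struct xdp_md. */
struct mock_xdp_ctx {
    const uint8_t *data;     /* first byte of the frame */
    const uint8_t *data_end; /* one past the last byte  */
};

#define ETH_HDR_LEN   14     /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETHERTYPE_ARP 0x0806

/* Drops ARP frames and truncated frames, passes everything else. */
int drop_arp_only(const struct mock_xdp_ctx *ctx)
{
    assert(ctx != NULL);

    const uint8_t *data = ctx->data;

    /* Bounds check first; kernel-side code usually writes this as
     * "data + ETH_HDR_LEN > data_end". */
    if ((size_t)(ctx->data_end - data) < ETH_HDR_LEN)
        return MOCK_XDP_DROP;

    /* EtherType is stored big-endian at offset 12. */
    uint16_t ethertype = (uint16_t)((data[12] << 8) | data[13]);

    return ethertype == ETHERTYPE_ARP ? MOCK_XDP_DROP : MOCK_XDP_PASS;
}
```

The shape is the same as the program above: read what the context points at, then return a verdict. Only the condition is new.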
The section names in eBPF programs, specified using the SEC macro, are not arbitrary; they indicate the type of program you're writing and where it should be attached in the kernel. These section names are generally standardized, and they correspond to specific hooks or tracepoints where the eBPF program will be executed.
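Under the hood, a macro like SEC is essentially a thin wrapper around a compiler attribute that places the annotated symbol in a named ELF section. The sketch below reproduces that idea in ordinary C with a hypothetical MY_SEC macro, assuming GCC or Clang on an ELF target such as Linux; my_prog is likewise an illustrative name and the function is not an eBPF program.

```c
/* Hypothetical stand-in for the SEC() macro.
 * Assumes GCC or Clang targeting ELF (for example, Linux). */
#define MY_SEC(name) __attribute__((section(name), used))

/* The compiler emits this function into an ELF section named "xdp"
 * instead of the default .text section. */
MY_SEC("xdp")
int my_prog(void)
{
    return 1; /* same numeric value as XDP_DROP */
}
```

Compiling this file to an object and listing its sections (for instance with `readelf -S`) should show a section called xdp. A loader reads that name from the object file to decide how to treat the code inside it.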
Here are some standard section names commonly used:
- "xdp": For XDP (eXpress Data Path) programs that operate on network packets.
- "classifier": For tc (Traffic Control) programs that classify or modify network packets.
- "tracepoint/[subsystem]/[event]": For attaching to kernel tracepoints. The [subsystem] and [event] are specific to what you want to trace.
- "fentry/[function]" and "fexit/[function]": For attaching to the entry and exit points of kernel functions, respectively.
- "sockops": For socket-level operations.
- "cgroup/skb": For programs that operate on network packets and are attached to cgroups.
- "cgroup/sock": For programs that operate on sockets and are attached to cgroups.
While the section names are generally standardized, you can sometimes use custom section names if you're using a loading utility that allows for that. However, this is less common and usually not recommended unless you have a specific need for it.
The section to be used depends on:
- Type of Program: What you want the eBPF program to do (e.g., packet filtering, tracing, etc.).
- Attachment Point: Where in the kernel you want to attach the eBPF program (e.g., XDP for network packets, tracepoints for tracing, etc.).
For example, if you're writing an XDP program to filter network packets, you'd typically use the "xdp" section. If you're writing a program to attach to a tracepoint for the sched_switch event, you'd use "tracepoint/sched/sched_switch".
The section name helps the eBPF loader to understand where to place and how to attach your eBPF program in the kernel. Therefore, it's crucial to use the correct section name for your specific use-case.
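As a rough illustration of how a loader might turn a section name into a program kind, the sketch below matches names either exactly or by prefix. It assumes libbpf-style names (for example "xdp" for XDP programs). The enum prog_kind, the rules table, and classify_section are hypothetical and are not taken from any real loader's implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical program kinds, for illustration only. */
enum prog_kind {
    KIND_UNKNOWN, KIND_XDP, KIND_TC, KIND_TRACEPOINT, KIND_FENTRY,
    KIND_FEXIT, KIND_SOCKOPS, KIND_CGROUP_SKB, KIND_CGROUP_SOCK
};

struct sec_rule {
    const char *name;    /* section name or prefix                  */
    int is_prefix;       /* 1: match as prefix, 0: match exactly    */
    enum prog_kind kind;
};

static const struct sec_rule rules[] = {
    { "xdp",         0, KIND_XDP },
    { "classifier",  0, KIND_TC },
    { "tracepoint/", 1, KIND_TRACEPOINT },
    { "fentry/",     1, KIND_FENTRY },
    { "fexit/",      1, KIND_FEXIT },
    { "sockops",     0, KIND_SOCKOPS },
    { "cgroup/skb",  0, KIND_CGROUP_SKB },
    { "cgroup/sock", 0, KIND_CGROUP_SOCK },
};

/* Returns the kind for a section name, or KIND_UNKNOWN if no rule matches. */
enum prog_kind classify_section(const char *sec)
{
    assert(sec != NULL);

    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        const struct sec_rule *r = &rules[i];
        int match = r->is_prefix
            ? strncmp(sec, r->name, strlen(r->name)) == 0
            : strcmp(sec, r->name) == 0;
        if (match)
            return r->kind;
    }
    return KIND_UNKNOWN;
}
```

With this table, "tracepoint/sched/sched_switch" resolves to KIND_TRACEPOINT because of the prefix rule, while a name that matches no rule resolves to KIND_UNKNOWN. This is why a mistyped section name typically leads to a program the loader cannot place.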
Very cool docs, learned some new things :)
It's rare to find detailed info about eBPF. Have you ever played around with SEC("socket") or SEC("kprobe/tcp_data_queue")?
I'm having trouble with SEC("socket"), as it does not capture the whole node's network traffic, only packets traveling from or to the pod of the eBPF program. And with SEC("kprobe/tcp_data_queue"), I wasn't able to print suspicious network packets, only ports and IPs.