Inspired by Electrode, HoloFast is an Accord acceleration layer for HoloStore’s consensus library. It pairs a small fixed-frame wire protocol with eBPF programs running in the Linux kernel. The kernel takes one outgoing Accord message and fans it out to multiple replicas, then watches the replies and notifies Rust only when enough replicas have responded to form a quorum. The important Accord decisions stay in Rust; eBPF handles only repetitive packet work such as fan-out, fan-in, duplicate filtering, and steering.
Electrode saw up to 128.4% higher throughput and 41.7% lower latency by moving repetitive fan-out and quorum-waiting work into eBPF; HoloFast applies that idea to HoloStore’s Accord path.
Electrode is the closest prior system to HoloFast. It is an NSDI 2023 project that accelerates distributed protocols by moving narrow, repetitive networking operations into eBPF while keeping the actual distributed-protocol logic in user space. Its key observation is that the normal Linux networking stack is attractive because it preserves compatibility, security/isolation, and load-aware CPU behavior, but consensus protocols pay heavily for repeated user/kernel crossings and kernel networking-stack traversal when they broadcast phase messages and collect quorum replies.1,2
The most important lesson for HoloFast is not "put consensus in the kernel." The lesson is:
Keep correctness-critical protocol logic in user space.
Move only the mechanical packet work into eBPF.
Use fallback whenever packet state becomes too complex.
Electrode applied that model to classic Multi-Paxos using three eBPF optimizations: message broadcasting, fast acknowledging, and wait-on-quorums.3 These are directly relevant to HoloFast because HoloStore's Accord hot path has the same broad mechanical shape: send a phase message to a replica group, receive several replies, count a quorum, and advance the state machine.
Electrode reported substantial gains for Multi-Paxos. Under 3, 5, and 7 replicas, it improved maximum throughput by 34.9%, 104.8%, and 128.4%, respectively. The improvement grew with replica count because more replicas mean more leader-side broadcast messages, ACK messages, user/kernel crossings, and kernel-stack traversal to remove.4
For latency, Electrode reported 12.5%, 20.0%, and 25.6% lower median latency under 3, 5, and 7 replicas, respectively, and 11.8%, 24.7%, and 41.7% lower 99th-percentile tail latency. The paper attributes much of the latency reduction to fast acknowledging, which avoids two user/kernel crossings, two kernel-stack traversals, and one user-space wakeup on the follower side for each Multi-Paxos request.4
Electrode's optimization breakdown is especially useful for HoloFast. With 5 replicas, the paper reports that eBPF-based message broadcasting improved maximum throughput by 31.7%, fast acknowledging reduced median latency by 4.3%–12.7% before saturation, and wait-on-quorums improved maximum throughput by 57.7%. This division of labor is exactly how HoloFast should be evaluated: broadcast, quorum aggregation, and any future acknowledgment optimization must be benchmarked independently instead of being treated as one giant feature.5
The findings transfer because HoloStore/Accord has the same expensive network pattern:
coordinator -> broadcast phase request to replicas
replicas -> send phase responses
coordinator -> count enough responses for a quorum or fast-path condition
coordinator -> advance to the next phase or commit
The findings do not transfer perfectly because Accord is not classic Multi-Paxos. In HoloStore, a PreAcceptResponse carries dependency and sequence information, and Accord's fast path depends on whether dependencies converge. That means HoloFast should copy Electrode's message broadcasting and wait-on-quorums first, but it should not blindly copy Electrode's fast acknowledging for Accord PreAccept. Replica-side dependency computation belongs in Rust user space, not eBPF.
The safe mapping is:
| Electrode finding | HoloFast implementation consequence |
|---|---|
| Broadcasting is expensive and scales with replica count | Use TC egress cloning/rewrite so Rust sends one logical phase packet and eBPF fans it out. |
| Quorum fan-in creates avoidable wakeups | Use XDP ingress aggregation so duplicate/non-quorum replies do not always wake Rust. |
| Fast ACK improves follower-side latency in Multi-Paxos | Defer for Accord PreAccept; only consider bounded duplicate/stale ACK cases later. |
| Benefits grow with replica count | Benchmark 3, 5, and 7 replicas separately. |
| eBPF is modular | Ship HoloFast in stages: UDP fixed frame, TC broadcast, XDP quorum, conservative PreAccept candidate aggregation. |
| Complex cases need fallback | Pass large dependencies, recovery, oversized packets, map overflow, and parse failures to Rust/gRPC fallback. |
Electrode strongly supports the HoloFast design choice to replace gRPC only on the hot consensus path. eBPF cannot elegantly parse HTTP/2/gRPC/protobuf streams at XDP/TC hook points, so HoloFast needs a compact fixed-frame protocol whose header exposes exactly the fields eBPF needs: phase, kind, group ID, membership epoch, range generation, transaction ID, ballot, source replica, status, sequence, and dependency digest.
The core implementation principle should be:
Electrode offloaded packet mechanics.
HoloFast should offload packet mechanics.
Accord correctness stays in Rust.
So HoloFast should use eBPF for:
TC egress broadcast
XDP ingress response classification
quorum bitmaps
kernel-to-user quorum events
duplicate/stale reply filtering
optional packet steering
And HoloFast should keep these in Rust:
dependency calculation
PreAccept fast-path validation
Accept/Commit correctness
recovery
WAL durability
range split/merge correctness
state-machine execution
Electrode also compared its kernel-assisted approach with a kernel-bypassing baseline and found that pure kernel bypass can be faster, but at the cost of a more specialized operating model. That matters for HoloFast because cloud portability and operational simplicity are major design goals. The right first implementation is therefore not DPDK or a custom kernel driver; it is a fixed-frame HoloFast UDP transport plus eBPF/XDP/TC acceleration, with AF_XDP or provider-specific bypass left as optional future backends.6
This gives HoloFast the most useful part of Electrode's result: large reductions in avoidable kernel/user overhead while preserving the normal Linux deployment model and keeping the dangerous correctness-sensitive parts of Accord out of the kernel.
HoloStore is already a strong candidate for an Electrode-style optimization because its hot path has the same mechanical shape that the Electrode paper targets: a coordinator sends phase messages to replicas, waits for enough replies, and then advances the consensus state. HoloStore's README describes it as a strongly consistent key/value store built on Accord, using pre-accept -> accept -> commit, dependency tracking, fast-path behavior when dependencies converge, per-partition Accord groups, batched consensus RPCs, a WAL, and range generation IDs.7 HoloStore's current transport is a gRPC transport layer for Accord quorum rounds and read paths, with per-peer batching workers, bounded in-flight concurrency, and queue/latency counters.8 Its proto file explicitly defines hot consensus RPCs such as PreAcceptBatchV2, AcceptBatchV2, CommitBatchV2, and RecoverBatchV2, using packed unary batch envelopes to reduce protobuf wrapper overhead.9
The proposed project, HoloFast, does not turn HoloStore into an eBPF database. Instead, it makes HoloStore an eBPF-friendly Accord database:
Keep in Rust user space:
Accord safety and liveness logic
dependency calculation
fast-path vs slow-path decisions
recovery
WAL and durability
range split/merge correctness
state-machine execution
large command transfer
Move or assist with eBPF:
fixed-header packet parsing
TC egress broadcast
XDP ingress quorum aggregation
duplicate/stale packet suppression
kernel-to-user quorum events
optional CPU/NIC queue steering
metrics for fallback and packet path behavior
The design follows the Electrode paper's division of labor. Electrode found that distributed protocols under the normal Linux networking stack can spend substantial time on user/kernel crossings and kernel stack traversal; it offloaded message broadcasting, fast acknowledging, and wait-on-quorums to eBPF, reporting up to 128.4% higher throughput and 41.7% lower latency for Multi-Paxos.1 Electrode's Figure 1 and Section 4 identify three offloads: message broadcasting, fast acknowledging, and wait-on-quorums.3
For HoloStore/Accord, the safe initial mapping is:
| Electrode optimization | HoloFast mapping | Initial status |
|---|---|---|
| Message broadcasting | TC egress cloning/rewrite for PreAccept, Accept, and Commit requests | Yes |
| Wait-on-quorums | XDP ingress response aggregation with bitmap counting and quorum events | Yes |
| Fast acknowledging | Mostly not for PreAccept; possible only for carefully bounded duplicate/late responses or later specialized cases | Defer |
The important Accord-specific difference is that a PreAcceptResponse is not just a simple ACK. HoloStore's current proto shows PreAcceptResponse carrying ok, seq, deps, and promised; AcceptResponse carries ok and promised; CommitResponse carries ok.10 That means eBPF should not pretend to make Accord decisions. It should accelerate mechanical network work and emit candidate quorum events. Rust user space remains the final authority.
HoloStore currently makes gRPC calls for hot consensus phases. Even with packed v2 unary batches, the hot path still has layers that are not ideal for packet-level offload:
Accord phase event
-> Rust async task
-> batching queue
-> gRPC method
-> HTTP/2 framing
-> protobuf payload
-> socket send
-> kernel networking stack
-> peer kernel networking stack
-> gRPC/protobuf decode
-> Rust consensus handler
-> response path repeats in reverse
This is maintainable and portable, but it hides consensus structure from eBPF. At XDP/TC level, eBPF sees Ethernet/IP/UDP/TCP packets, not PreAcceptBatchV2 or AcceptBatchV2 method calls. That makes eBPF parsing fragile or impossible if gRPC remains the hot path.
Electrode's motivation maps directly to this. The paper argues that standard Linux networking has benefits, but protocol performance suffers from repeated user/kernel crossings and kernel stack traversal; in a five-replica leader-based Multi-Paxos path, the leader has to handle many sends/receives per request.2
HoloFast should optimize these mechanical costs:
- Fan-out: one logical Accord phase message must reach several replicas.
- Fan-in: the coordinator receives several replies but often needs only a quorum.
- Late/duplicate traffic: retransmissions, already-counted replies, wrong epoch, and stale range-generation packets should not always wake the Rust runtime.
- CPU locality: packets for the same Accord group should land on the same queue/core when possible.
- Tail latency: the Rust runtime should wake when the next useful consensus transition is available, not for every individual packet.
HoloFast should not attempt to move the following into eBPF:
dependency-set computation
Accord fast-path correctness decisions
ballot/recovery correctness
WAL append or fsync
range split/merge correctness
large command storage
linearizable read barriers
state-machine execution
membership changes
This is non-negotiable. eBPF's programming model is intentionally constrained. Electrode notes the difficulty caused by the verifier and the lack of dynamic memory allocation, and it keeps complex behavior in user space.11 The Linux verifier also enforces program safety, including restrictions around memory access and program behavior.12
No eBPF result is trusted as final consensus. eBPF may tell Rust:
"A candidate quorum appears to exist for txn X, phase Y, ballot B, epoch E."
Rust must still validate:
cluster epoch
range generation
membership/electorate
ballot
transaction ID
phase
quorum size
replica uniqueness
response contents
pre-accept dependency data
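The checks above can be sketched as a single user-space function. This is only an illustration: `CandidateEvent`, `ProposalView`, and `Verdict` are hypothetical types invented here, not existing HoloStore definitions, and the sketch omits transaction-ID lookup and response-content checks.

```rust
/// Hypothetical summary of a kernel-emitted candidate quorum event.
#[derive(Clone, Copy, Debug)]
pub struct CandidateEvent {
    pub membership_epoch: u64,
    pub range_generation: u64,
    pub ballot: (u64, u64), // (counter, node)
    pub phase: u8,
    pub ok_bitmap: u64, // bit i set => replica with member index i replied OK
}

/// Hypothetical view of the in-flight proposal that Rust already tracks.
#[derive(Clone, Copy, Debug)]
pub struct ProposalView {
    pub membership_epoch: u64,
    pub range_generation: u64,
    pub ballot: (u64, u64),
    pub phase: u8,
    pub member_mask: u64, // bits of replicas that belong to the electorate
    pub quorum_size: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Accept,
    Reject(&'static str),
}

/// Re-check everything eBPF claimed before advancing the proposal.
pub fn validate_candidate(ev: &CandidateEvent, p: &ProposalView) -> Verdict {
    if ev.membership_epoch != p.membership_epoch {
        return Verdict::Reject("membership epoch");
    }
    if ev.range_generation != p.range_generation {
        return Verdict::Reject("range generation");
    }
    if ev.ballot != p.ballot {
        return Verdict::Reject("ballot");
    }
    if ev.phase != p.phase {
        return Verdict::Reject("phase");
    }
    // Any bit outside the electorate means an unknown replica was counted.
    if ev.ok_bitmap & !p.member_mask != 0 {
        return Verdict::Reject("electorate");
    }
    // One bit per replica makes uniqueness implicit; popcount is the vote count.
    if ev.ok_bitmap.count_ones() < p.quorum_size {
        return Verdict::Reject("quorum size");
    }
    Verdict::Accept
}
```

A rejected event would normally route the proposal to the fallback path rather than fail it.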
The primary win should come from doing fewer send/recv wakeups per Accord phase. Electrode's message-broadcasting offload uses bpf_clone_redirect() to clone and modify packets in kernel, so the application sends once and the kernel fans out.13 Electrode's wait-on-quorums offload maintains bitsets and forwards only quorum-relevant packets or events to user space.14
Every eBPF optimization must have a fallback:
parse failure -> XDP_PASS / TC pass
unknown version -> pass to user-space socket
map overflow -> pass and mark fallback counter
epoch mismatch -> pass or drop only if provably stale
large payload -> cold path
dependency overflow -> cold path
unexpected response -> cold path
verifier/load failure -> use HoloFast without eBPF or existing gRPC
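One way to keep these rules auditable is to express the per-packet cases as a single mapping from reason to action. The sketch below assumes hypothetical `FallbackReason` and `PacketAction` enums; verifier/load failure is left out because it is decided once at startup rather than per packet.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FallbackReason {
    ParseFailure,
    UnknownVersion,
    MapOverflow,
    EpochMismatch { provably_stale: bool },
    LargePayload,
    DependencyOverflow,
    UnexpectedResponse,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PacketAction {
    Pass,         // hand the packet to the normal socket path
    PassAndCount, // pass and bump a fallback counter
    Drop,         // only for traffic that is provably stale
    ColdPath,     // handle through the gRPC/control lane
}

/// Map each fallback reason to the action listed in the table above.
pub fn action_for(reason: FallbackReason) -> PacketAction {
    use FallbackReason::*;
    match reason {
        ParseFailure | UnknownVersion => PacketAction::Pass,
        MapOverflow => PacketAction::PassAndCount,
        EpochMismatch { provably_stale: true } => PacketAction::Drop,
        EpochMismatch { provably_stale: false } => PacketAction::Pass,
        LargePayload | DependencyOverflow | UnexpectedResponse => PacketAction::ColdPath,
    }
}
```

Because the match is exhaustive, adding a new reason forces an explicit decision about its fallback.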
The project should be built as an experiment with clear stages:
gRPC baseline
custom HoloFast UDP without eBPF
+ TC broadcast
+ XDP Accept/Commit quorum aggregation
+ bounded PreAccept candidate aggregation
+ optional CPU/AF_XDP steering
Each stage must be benchmarked separately. Electrode's paper breaks down the independent contribution of message broadcasting, fast acknowledging, and wait-on-quorums; HoloFast should copy that methodology.5
graph LR
Client[Redis/client frontend] --> Accord[Accord engine]
Accord --> Transport[Transport trait]
Transport --> Fast[HoloFast transport]
Transport --> Grpc[gRPC fallback/control]
Fast --> UDP[Fixed-frame UDP socket]
UDP --> TC[TC egress broadcast]
TC --> Net[Network]
Net --> XDP[XDP ingress parser]
XDP --> Quorum[XDP quorum aggregator]
Quorum --> Ring[BPF ringbuf quorum events]
Ring --> Runtime[HoloFast event loop]
Runtime --> Accord
Grpc --> Control[Membership, recovery, snapshots, large fetches]
HoloStore should split the transport into two lanes.
Keep gRPC for:
membership management
range split/merge/rebalance
snapshots
FetchCommand / large command bytes
recovery when fast path lacks data
admin/control APIs
compatibility with old nodes
fallback when eBPF is disabled
This is aligned with HoloStore's current proto, which includes both consensus hot methods and many control/range methods in the same gRPC service.9
Add a custom UDP-like fixed-frame lane for:
PreAccept request/response
Accept request/response
Commit request/response
small command inline payloads
quorum events
heartbeat/health pings if useful
The hot lane should be designed for eBPF parsing. No HTTP/2, no protobuf varints on fields eBPF needs, and no unbounded parsing.
The header must be:
fixed-size
8-byte aligned
network-byte-order or explicitly little-endian, but consistent
bounded and verifier-friendly
sufficient for eBPF to route/count/drop safely
versioned
protected by a cheap checksum or header CRC
The header should contain only fields eBPF needs. Variable data should follow after the header and should be parsed only by Rust unless it fits in a small bounded dependency section.
#[repr(C, packed)]
pub struct HoloFastHeader {
pub magic: u32, // 'HOLF'
pub version: u8, // protocol version
pub header_len: u8, // bytes
pub phase: u8, // PREACCEPT, ACCEPT, COMMIT, RECOVER, etc.
pub kind: u8, // REQUEST, RESPONSE, QUORUM_EVENT, NACK
pub flags: u16, // BROADCAST, INLINE_CMD, HAS_DEPS, FALLBACK, etc.
pub payload_len: u16, // bytes after header
pub header_crc32: u32, // optional, can be zero in early prototype
pub cluster_id: u64, // stable cluster fingerprint
pub membership_epoch: u64, // cluster/range membership version
pub range_generation: u64, // HoloStore range ownership epoch
pub group_id: u64, // Accord group/shard
pub from_node: u64,
pub to_node: u64, // 0 or broadcast marker for logical broadcast
pub coordinator_node: u64,
pub txn_origin_node: u64,
pub txn_counter: u64,
pub ballot_counter: u64,
pub ballot_node: u64,
pub seq: u64, // Accord sequence/timestamp component used by HoloStore
pub proposed_seq: u64, // optional phase-specific value; zero if unused
pub command_digest_hi: u64,
pub command_digest_lo: u64,
pub deps_digest_hi: u64,
pub deps_digest_lo: u64,
pub deps_count: u16,
pub status: u8, // ok/promised/reject class
pub quorum_class: u8, // simple, fast, recovery, commit-ack, etc.
pub reserved: u32,
}

HoloStore's design notes say range generations disambiguate writes across split/merge cutovers and prevent late commands from an old range/group from overwriting newer child-range values.15 The HoloFast header should carry range_generation so eBPF and Rust can reject stale traffic, or fall back, early.
Accord fast-path behavior depends on dependency convergence. HoloStore's README says Accord fast path is possible when dependencies converge without conflicts.7 eBPF should not compute dependencies, but it can compare fixed-size digests. If a quorum of PreAcceptResponses agrees on (ok, seq, deps_digest, promised), eBPF can emit a candidate event. Rust must still verify the actual dependency set.
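For digests to be comparable, every replica has to normalize its dependency set the same way before hashing. The sketch below sorts and deduplicates the set and derives two 64-bit halves with FNV-1a. The `TxnIdWire` tuple layout and the choice of FNV-1a are assumptions made for illustration; FNV-1a is not collision-resistant, and a real deployment would likely want a stronger 128-bit hash.

```rust
/// Hypothetical wire form of a transaction ID: (origin_node, counter).
pub type TxnIdWire = (u64, u64);

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a over `bytes`, starting from `seed`.
fn fnv1a(seed: u64, bytes: &[u8]) -> u64 {
    let mut h = seed;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(FNV_PRIME);
    }
    h
}

/// Order-independent digest of a dependency set, returned as (hi, lo).
pub fn deps_digest(deps: &[TxnIdWire]) -> (u64, u64) {
    // Normalize: the same set must hash identically on every replica.
    let mut sorted = deps.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut bytes = Vec::with_capacity(sorted.len() * 16);
    for (node, counter) in &sorted {
        bytes.extend_from_slice(&node.to_be_bytes());
        bytes.extend_from_slice(&counter.to_be_bytes());
    }
    // Two different seeds give two halves; the second seed is arbitrary.
    let hi = fnv1a(FNV_OFFSET, &bytes);
    let lo = fnv1a(FNV_OFFSET ^ 0x9e37_79b9_7f4a_7c15, &bytes);
    (hi, lo)
}
```

eBPF would only compare the resulting header fields for equality; computing the digest stays in Rust.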
The payload should be phase-specific and length-bounded.
PREACCEPT_REQUEST payload:
optional inline command if <= HOLOFAST_INLINE_CMD_MAX
otherwise command_digest only, command bytes fetched through cold path
PREACCEPT_RESPONSE payload:
optional bounded normalized deps list
if deps list too large, set FALLBACK_REQUIRED and pass to user space
ACCEPT_REQUEST payload:
optional deps list or digest + cold fetch reference
optional command bytes if inline
ACCEPT_RESPONSE payload:
usually empty; header status/promised is enough
COMMIT_REQUEST payload:
optional command bytes or digest
committed seq/deps metadata
COMMIT_RESPONSE payload:
usually empty
The hot eBPF path should target single-packet messages. Electrode explicitly targets UDP protocols with application-level retransmission and notes that this works well for Paxos-style messages that are small enough to fit in one packet in datacenter environments.16
Recommended constants:
HOLOFAST_MTU_BUDGET = 1200 bytes initial safe target
HOLOFAST_INLINE_CMD_MAX = 512 bytes initial
HOLOFAST_INLINE_DEPS_MAX = 8 or 16 TxnIds initial
HOLOFAST_MAX_REPLICAS = 9 or 15 initial, compile-time bounded
HOLOFAST_MAX_INFLIGHT = configurable, e.g. 64k quorum states
If payload exceeds the hot-path budget, use:
HoloFast header + digest
FetchCommand or recovery over gRPC/control lane
HoloStore already has command_digest and has_command fields in its current AcceptRequest, CommitRequest, and RecoverResponse, so digest-based command fetch is already conceptually present.10
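The inline-versus-digest decision can be sketched as a pure function of sizes. The 160-byte header length is the sum of the field widths in the header struct above; `CommandPlacement` and `place_command` are hypothetical names, and the rule that the whole frame must also fit the MTU budget is an assumption of this sketch.

```rust
pub const HOLOFAST_MTU_BUDGET: usize = 1200;
pub const HOLOFAST_INLINE_CMD_MAX: usize = 512;
/// Sum of the field widths of the fixed header (assumed to have no padding).
pub const HOLOFAST_HEADER_LEN: usize = 160;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CommandPlacement {
    /// Command bytes travel inside the hot-path packet.
    Inline,
    /// Only the digest travels; bytes are fetched over the gRPC/control lane.
    DigestOnly,
}

/// Decide whether a command can ride inline next to `deps_bytes` of dependency data.
pub fn place_command(cmd_len: usize, deps_bytes: usize) -> CommandPlacement {
    let fits_inline_cap = cmd_len <= HOLOFAST_INLINE_CMD_MAX;
    let fits_packet = HOLOFAST_HEADER_LEN + deps_bytes + cmd_len <= HOLOFAST_MTU_BUDGET;
    if fits_inline_cap && fits_packet {
        CommandPlacement::Inline
    } else {
        CommandPlacement::DigestOnly
    }
}
```

Keeping this decision on the sender means eBPF never has to reason about fragmentation.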
Hook: TC egress on the HoloStore network interface.
Purpose: turn one logical broadcast packet into N per-peer packets.
Input:
A HoloFast packet with:
flags contains BROADCAST
to_node = BROADCAST_MARKER
phase in {PREACCEPT, ACCEPT, COMMIT}
kind = REQUEST
Map reads:
GroupMembershipMap[(cluster_id, membership_epoch, group_id)] -> GroupMembership
NodeAddrMap[node_id] -> ethernet/IP/UDP destination
FeatureSwitchMap -> enabled/disabled per phase
Behavior:
1. Parse Ethernet/IP/UDP/HoloFast header.
2. Verify magic, version, length, and cluster_id.
3. Look up group membership by group_id and membership_epoch.
4. For every peer in the bounded membership list:
- clone packet
- rewrite destination MAC/IP/UDP port
- set to_node = peer_id
- update checksums
- bpf_clone_redirect()
5. Let the original packet either go to one peer or drop it after clones, depending on implementation.
6. Increment counters.
7. If any lookup fails, pass packet unchanged to user-space fallback path.
Electrode's message broadcasting design is the direct inspiration: it replaces repeated user-space sends with in-kernel packet clones and rewrites, reducing user/kernel crossings and stack traversal for one-to-many phase messages.13
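Before writing the TC program, the HoloFast-level part of step 4 can be modeled in user space, which also gives the no-eBPF transport a reference behavior to test against. The sketch assumes the header is serialized in network byte order, that to_node sits at byte offset 56 (the sum of the preceding field widths), and that the broadcast marker is 0. It does not model the MAC/IP/UDP rewrite, checksum updates, or bpf_clone_redirect().

```rust
/// Byte offset of `to_node` inside the fixed header (assumed layout).
pub const TO_NODE_OFFSET: usize = 56;
/// Assumed marker value for a logical broadcast.
pub const BROADCAST_MARKER: u64 = 0;

/// Produce one copy of `frame` per peer with `to_node` rewritten.
/// Returns `None` when the frame is too short or is not a broadcast,
/// which corresponds to passing the packet through unchanged.
pub fn fan_out(frame: &[u8], peers: &[u64]) -> Option<Vec<Vec<u8>>> {
    let field = frame.get(TO_NODE_OFFSET..TO_NODE_OFFSET + 8)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(field);
    if u64::from_be_bytes(raw) != BROADCAST_MARKER {
        return None;
    }
    let clones = peers
        .iter()
        .map(|peer| {
            let mut copy = frame.to_vec();
            copy[TO_NODE_OFFSET..TO_NODE_OFFSET + 8].copy_from_slice(&peer.to_be_bytes());
            copy
        })
        .collect();
    Some(clones)
}
```

In the kernel version the peer list would come from the bounded membership map rather than a slice.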
Hook: XDP ingress.
Purpose: very quickly classify HoloFast packets.
Behavior:
1. Parse L2/L3/L4 headers with strict bounds checks.
2. Ignore non-HoloFast traffic with XDP_PASS.
3. Validate magic, version, header_len, payload_len.
4. Check feature switch and protocol version.
5. Dispatch by (kind, phase):
- RESPONSE + ACCEPT -> xdp_accept_quorum
- RESPONSE + COMMIT -> xdp_commit_quorum
- RESPONSE + PREACCEPT -> xdp_preaccept_candidate_quorum
- REQUEST paths -> pass to Rust handler unless future safe handler exists
6. Unknown or unsupported packets -> XDP_PASS.
XDP is appropriate because it runs on ingress packets early, before the expensive socket-buffer path in many cases; XDP programs can pass, drop, redirect, or manipulate packets.17
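The dispatch table in step 5 is small enough to state as one match. The following is a user-space sketch of the routing decision only: the `Kind`, `Phase`, and `Route` enums are hypothetical, version 1 is an assumed value, and the magic constant is the ASCII encoding of "HOLF".

```rust
pub const HOLOFAST_MAGIC: u32 = 0x484F_4C46; // ASCII "HOLF"
pub const SUPPORTED_VERSION: u8 = 1; // assumed for this sketch

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Request,
    Response,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Phase {
    PreAccept,
    Accept,
    Commit,
    Recover,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Route {
    Pass, // equivalent to XDP_PASS: let the normal stack deliver it
    AcceptQuorum,
    CommitQuorum,
    PreAcceptCandidate,
}

/// Decide which handler a validated header should be sent to.
pub fn route(magic: u32, version: u8, kind: Kind, phase: Phase) -> Route {
    if magic != HOLOFAST_MAGIC || version != SUPPORTED_VERSION {
        return Route::Pass;
    }
    match (kind, phase) {
        (Kind::Response, Phase::Accept) => Route::AcceptQuorum,
        (Kind::Response, Phase::Commit) => Route::CommitQuorum,
        (Kind::Response, Phase::PreAccept) => Route::PreAcceptCandidate,
        // Requests and recovery traffic always go to the Rust handler.
        _ => Route::Pass,
    }
}
```

Defaulting every unmatched combination to `Pass` keeps unknown traffic on the safe path.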
Hook: tail-called from dispatcher.
Purpose: count AcceptResponses and notify Rust when a quorum is reached.
Key:
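#[repr(C)]
pub struct QuorumKey {
cluster_id: u64,
membership_epoch: u64,
range_generation: u64,
group_id: u64,
txn_origin_node: u64,
txn_counter: u64,
ballot_counter: u64,
ballot_node: u64,
phase: u8,
_pad: [u8; 7], // explicit, always zero: BPF hash maps compare key bytes, so implicit padding must not vary
}

State: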
#[repr(C)]
pub struct QuorumState {
seen_bitmap: u64,
ok_bitmap: u64,
reject_bitmap: u64,
first_seen_ns: u64,
quorum_threshold: u8,
emitted: u8,
status: u8,
promised_counter: u64,
promised_node: u64,
}

Behavior:
1. Verify from_node is a member of group_id at membership_epoch.
2. Compute bit = member_index(from_node).
3. If bit already seen, drop or pass depending on debug mode.
4. Set seen bit.
5. If status=OK, set ok bit.
6. If status=REJECT/PROMISED_HIGHER, set reject bit and either pass immediately or emit rejection event.
7. If popcount(ok_bitmap) >= quorum_threshold (the simple-quorum size stored in QuorumState) and emitted=0:
- set emitted=1
- write QuorumEvent to BPF ringbuf
- optionally pass this packet to user space
8. If not quorum-reaching, drop packet in optimized mode or pass in debug mode.
Electrode's wait-on-quorums design uses bitsets rather than a simple counter to avoid double-counting duplicate ACKs, and forwards only quorum-reaching or overflow-relevant packets/events to user space.14
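The bitmap logic can be prototyped and unit-tested in plain Rust before it is ported to XDP. This sketch uses a trimmed-down state struct and hypothetical `Status` and `Outcome` enums; it covers duplicate suppression, reject tracking, and one-shot quorum emission, but not map lookup or ring-buffer writes.

```rust
#[derive(Default, Debug)]
pub struct QuorumState {
    pub seen_bitmap: u64,
    pub ok_bitmap: u64,
    pub reject_bitmap: u64,
    pub quorum_threshold: u8,
    pub emitted: bool,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Ok,
    Reject,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Outcome {
    OutOfRange,    // member index does not fit the 64-bit bitmap: pass to Rust
    Duplicate,     // replica already counted
    Counted,       // recorded, no event needed
    Rejected,      // recorded a rejection: Rust should be told
    QuorumReached, // first time the OK votes reach the threshold
}

/// Record one response from the replica at `member_index`.
pub fn record(st: &mut QuorumState, member_index: u8, status: Status) -> Outcome {
    if member_index >= 64 {
        return Outcome::OutOfRange;
    }
    let bit = 1u64 << member_index;
    if st.seen_bitmap & bit != 0 {
        return Outcome::Duplicate;
    }
    st.seen_bitmap |= bit;
    match status {
        Status::Reject => {
            st.reject_bitmap |= bit;
            Outcome::Rejected
        }
        Status::Ok => {
            st.ok_bitmap |= bit;
            let votes = st.ok_bitmap.count_ones();
            if !st.emitted && votes >= u32::from(st.quorum_threshold) {
                st.emitted = true; // emit the quorum event exactly once
                Outcome::QuorumReached
            } else {
                Outcome::Counted
            }
        }
    }
}
```

With a threshold of 2 in a 3-replica group, the second distinct OK returns `QuorumReached` and the third returns `Counted`.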
This is nearly identical to xdp_accept_quorum, but the payload is simpler because CommitResponse is only ok in the current HoloStore proto.10
Possible policies:
strict: wait for commit quorum event before completing coordinator future
relaxed: commit broadcast is fire-and-forget after WAL/consensus decision, only metrics are collected
hybrid: wait for quorum under debug/testing, fire-and-forget in production if Accord/HoloStore permits
This policy must be decided by HoloStore's existing correctness requirements. If commit replies are only liveness/diagnostic acknowledgments, they are an ideal eBPF aggregation target.
This is the most valuable but riskiest offload. It must be conservative.
PreAccept responses include dependency information, and Accord fast-path correctness depends on dependency/timestamp agreement. HoloStore's proto says PreAcceptResponse contains ok, seq, deps, and promised.10 The Accord whitepaper's consensus algorithm has PreAcceptOK responses containing timestamp/dependency data and uses fast-path criteria before skipping Accept.18
Only aggregate PreAccept responses when all of these are true:
ok == true
promised is empty/zero or equals expected ballot policy
deps_count <= HOLOFAST_INLINE_DEPS_MAX
deps payload is present in normalized fixed-bounded form
seq/proposed timestamp matches the candidate group
deps_digest matches candidate group
phase/ballot/epoch/range_generation all match
If any condition fails:
mark fallback_required for this txn
pass all future PreAccept responses for this txn to user space
let Rust perform normal Accord logic
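A user-space model of this rule makes the "sticky" fallback explicit: once any response fails a condition, every later response for the transaction is passed through. The types below (`PreAcceptReply`, `CandidateState`, `Decision`) are hypothetical, and the sketch checks only the ok, promised, deps_count, seq, and digest conditions; epoch, ballot, and range-generation matching are assumed to be handled by the map key.

```rust
pub const HOLOFAST_INLINE_DEPS_MAX: u16 = 8;

#[derive(Clone, Copy, Debug)]
pub struct PreAcceptReply {
    pub ok: bool,
    pub promised_higher: bool,
    pub seq: u64,
    pub deps_digest: (u64, u64),
    pub deps_count: u16,
}

#[derive(Default, Debug)]
pub struct CandidateState {
    /// (seq, deps_digest) of the first aggregated reply.
    pub agreed: Option<(u64, (u64, u64))>,
    pub ok_bitmap: u64,
    pub fallback_required: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Aggregate,
    PassToRust,
}

/// Fold one PreAccept response into the candidate, or trip the fallback.
pub fn observe(st: &mut CandidateState, member_index: u8, r: &PreAcceptReply) -> Decision {
    if st.fallback_required {
        return Decision::PassToRust;
    }
    let eligible = member_index < 64
        && r.ok
        && !r.promised_higher
        && r.deps_count <= HOLOFAST_INLINE_DEPS_MAX;
    let matches = match st.agreed {
        None => true,
        Some(agreed) => agreed == (r.seq, r.deps_digest),
    };
    if !(eligible && matches) {
        st.fallback_required = true; // sticky for the rest of this transaction
        return Decision::PassToRust;
    }
    st.agreed = Some((r.seq, r.deps_digest));
    st.ok_bitmap |= 1u64 << member_index;
    Decision::Aggregate
}
```

A real program would also need a way for Rust to learn about replies aggregated before the fallback tripped, which this sketch does not model.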
#[repr(C)]
pub struct PreAcceptCandidateEvent {
pub key: QuorumKey,
pub seen_bitmap: u64,
pub ok_bitmap: u64,
pub seq: u64,
pub deps_digest_hi: u64,
pub deps_digest_lo: u64,
pub deps_count: u16,
pub deps_inline: [TxnIdWire; HOLOFAST_INLINE_DEPS_MAX],
pub event_flags: u32, // candidate_only, fallback_required, overflow, etc.
}

Rust validation:
1. Look up the proposal future by txn_id.
2. Validate the event's epoch/generation/ballot/group/phase.
3. Validate that the bitmap satisfies the required Accord fast quorum/electorate.
4. Recompute deps_digest from inline deps.
5. Confirm that inline deps are complete for this fast-path case.
6. Only then advance the proposal as if those responses were received.
This gives eBPF a useful fast path for the common low-contention case where deps are empty or tiny. It avoids making eBPF responsible for computing or interpreting dependency sets.
After the basic offloads work, add packet steering:
group_id -> CPU queue / worker
coordinator_node -> queue
phase -> queue class
XDP can redirect packets using map-backed mechanisms such as CPUMAP and XSKMAP; XSKMAP can redirect frames to AF_XDP sockets without traversing the full network stack.19 This is a later optimization, not the first milestone.
FeatureSwitchMap
key: feature_id
value: enabled/version/config bits
NodeAddrMap
key: node_id
value: mac, ip, udp_port, queue_hint
GroupMembershipMap
key: cluster_id + membership_epoch + group_id
value: member_count, node_ids[], quorum_thresholds, fast_electorate bitmap
QuorumStateMap
key: QuorumKey
value: QuorumState
type: LRU hash or bounded hash
PreAcceptStateMap
key: QuorumKey
value: PreAcceptCandidateState
type: bounded hash
Ringbuf
producer: eBPF
consumer: Rust HoloFast event loop
CountersMap
key: counter_id
value: atomic counter
FallbackMap / DebugMap
key: reason
value: count
The BPF ring buffer is a good fit for kernel-to-user events; Linux's BPF ring buffer docs describe it as a mechanism for BPF programs to communicate event records to user space.20
HoloFast must not leak quorum states. Use three cleanup mechanisms:
1. User-space deletion after proposal completes.
2. TTL sweep in Rust: periodically delete old QuorumKey entries.
3. LRU map fallback: if state is evicted, eBPF passes packets to user space.
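The TTL sweep in item 2 is the only cleanup step that needs its own logic. A minimal sketch follows, using a plain `HashMap` as a stand-in for the BPF map: the `u64` key stands in for QuorumKey and the value is the entry's first_seen_ns. A real implementation would iterate the kernel map through whichever loader library the project adopts.

```rust
use std::collections::HashMap;

/// Remove entries older than `ttl_ns` and return how many were deleted.
/// `states` maps a stand-in key to the entry's `first_seen_ns`.
pub fn sweep_expired(states: &mut HashMap<u64, u64>, now_ns: u64, ttl_ns: u64) -> usize {
    let before = states.len();
    // saturating_sub guards against entries stamped slightly in the future.
    states.retain(|_, first_seen_ns| now_ns.saturating_sub(*first_seen_ns) < ttl_ns);
    before - states.len()
}
```

The returned count is a natural value to export as a metric, since a high sweep rate suggests proposals are not being cleaned up on completion.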
Overflow must be safe:
QuorumStateMap full -> pass packet, increment overflow counter
Ringbuf full -> pass packet, increment ringbuf_drop counter
deps too large -> pass packet, mark preaccept_fallback
membership missing -> pass packet
unknown replica -> pass packet or drop if strict anti-spoofing is enabled
Electrode also forwards messages to user space when fixed-size in-kernel structures cannot handle the case.21
Proposed layout:
crates/
holo_fast/
Cargo.toml
src/
lib.rs
wire.rs # shared wire structs and constants
encode.rs
decode.rs
transport.rs # FastTransport implementing Accord Transport
event_loop.rs # ringbuf consumer and proposal wakeups
membership.rs # BPF map sync
metrics.rs
fallback.rs
ebpf/
Cargo.toml # if using Aya
src/
main.rs # XDP/TC programs
maps.rs
parse.rs
quorum.rs
crates/holo_store/src/
transport.rs # keep GrpcTransport; add selection wrapper
fast_transport_adapter.rs # optional thin adapter
Alternative if using libbpf/CO-RE:
crates/holo_fast/bpf/
holo_fast.bpf.c
holo_fast.h
build.rs
The existing Accord engine already calls a transport abstraction for pre_accept, accept, commit, and recover based on the current GrpcTransport comments.8 Add a FastTransport implementation with the same semantic API.
Conceptual API:
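    #[async_trait]
    impl Transport for FastTransport {
        async fn pre_accept(
            &self,
            peer: NodeId,
            req: PreAcceptRequest,
        ) -> anyhow::Result<PreAcceptResponse> {
            // Conceptual only: encode a fixed frame, send it, await the reply.
            todo!("send PREACCEPT request to peer and await response")
        }
        async fn accept(
            &self,
            peer: NodeId,
            req: AcceptRequest,
        ) -> anyhow::Result<AcceptResponse> {
            todo!("send ACCEPT request to peer and await response")
        }
        async fn commit(
            &self,
            peer: NodeId,
            req: CommitRequest,
        ) -> anyhow::Result<CommitResponse> {
            todo!("send COMMIT request to peer and await response")
        }
        async fn recover(
            &self,
            peer: NodeId,
            req: RecoverRequest,
        ) -> anyhow::Result<RecoverResponse> {
            // Recovery stays on the gRPC/control lane initially.
            todo!("delegate to the gRPC transport")
        }
    }

The gain is larger, however, if HoloStore adds a group-aware broadcast API instead of calling each peer separately: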
#[async_trait]
pub trait BroadcastTransport: Transport {
async fn pre_accept_group(
&self,
group: GroupId,
req: PreAcceptRequest,
quorum: QuorumSpec,
) -> anyhow::Result<PreAcceptQuorumResult>;
async fn accept_group(
&self,
group: GroupId,
req: AcceptRequest,
quorum: QuorumSpec,
) -> anyhow::Result<AcceptQuorumResult>;
async fn commit_group(
&self,
group: GroupId,
req: CommitRequest,
policy: CommitAckPolicy,
) -> anyhow::Result<CommitResult>;
}

This is the cleaner interface. It exposes Accord's true operation, broadcast and wait for quorum, instead of hiding it behind per-peer RPCs.
Add a runtime selection enum:
pub enum ConsensusTransportKind {
Grpc,
HoloFastNoBpf,
HoloFastBpfBroadcast,
HoloFastBpfQuorum,
HoloFastBpfFull,
}

Configuration:
HOLO_CONSENSUS_TRANSPORT=grpc|holofast|holofast-bpf
HOLO_FAST_ENABLE_TC_BROADCAST=true|false
HOLO_FAST_ENABLE_XDP_QUORUM=true|false
HOLO_FAST_ENABLE_PREACCEPT_CANDIDATE=true|false
HOLO_FAST_DEBUG_PASS_ALL=true|false
HOLO_FAST_INLINE_CMD_MAX=512
HOLO_FAST_INLINE_DEPS_MAX=8
HOLO_FAST_MAX_INFLIGHT=65536
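The three transport strings and the feature flags have to resolve to one of the five enum variants. The mapping below is an assumption, not something the configuration above specifies: it treats the eBPF stages as cumulative (broadcast, then quorum, then PreAccept candidates) and rejects combinations that skip a stage. The function takes already-parsed values so it stays independent of how the environment is read.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ConsensusTransportKind {
    Grpc,
    HoloFastNoBpf,
    HoloFastBpfBroadcast,
    HoloFastBpfQuorum,
    HoloFastBpfFull,
}

/// Resolve HOLO_CONSENSUS_TRANSPORT plus feature flags into one transport kind.
pub fn select_transport(
    transport: &str,
    tc_broadcast: bool,
    xdp_quorum: bool,
    preaccept_candidate: bool,
) -> Result<ConsensusTransportKind, String> {
    use ConsensusTransportKind::*;
    match transport {
        "grpc" => Ok(Grpc),
        "holofast" => Ok(HoloFastNoBpf),
        "holofast-bpf" => match (tc_broadcast, xdp_quorum, preaccept_candidate) {
            (false, false, false) => Ok(HoloFastNoBpf),
            (true, false, false) => Ok(HoloFastBpfBroadcast),
            (true, true, false) => Ok(HoloFastBpfQuorum),
            (true, true, true) => Ok(HoloFastBpfFull),
            // Assumed policy: later stages require the earlier ones.
            _ => Err("feature flags must be enabled cumulatively".to_string()),
        },
        other => Err(format!("unknown transport: {}", other)),
    }
}
```

If the stages turn out to be independently useful, the inner match is the only part that would need to change.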
Rust must consume ringbuf events and wake proposal futures.
pub enum HoloFastEvent {
AcceptQuorum(AcceptQuorumEvent),
CommitQuorum(CommitQuorumEvent),
PreAcceptCandidate(PreAcceptCandidateEvent),
Fallback(FallbackEvent),
}

Event loop pseudocode:
    loop {
        let event = ringbuf.poll(timeout)?;
        match event {
            HoloFastEvent::AcceptQuorum(e) => {
                if validate_accept_event(&e, membership, proposals) {
                    proposals.complete_accept_quorum(e.key, e.ok_bitmap, e.promised);
                } else {
                    metrics.invalid_bpf_event += 1;
                    proposals.force_fallback(e.key);
                }
            }
            HoloFastEvent::CommitQuorum(e) => { ... }
            HoloFastEvent::PreAcceptCandidate(e) => {
                if validate_preaccept_candidate(&e, membership, proposals) {
                    proposals.complete_preaccept_candidate(e);
                } else {
                    proposals.force_fallback(e.key);
                }
            }
            HoloFastEvent::Fallback(e) => {
                proposals.force_fallback(e.key);
            }
        }
    }

When HoloStore commits a membership/range change:
1. Rust updates normal HoloStore membership state.
2. Rust builds a new GroupMembershipMap entry with new membership_epoch.
3. Rust loads node address records into NodeAddrMap.
4. Rust flips FeatureSwitchMap for the new epoch.
5. Old epoch remains for a grace period.
6. After outstanding proposals expire, Rust removes old epoch map entries.
Never let eBPF invent membership. It only reads a Rust-owned map.
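The grace-period behavior in steps 5 and 6 can be modeled with an ordered table keyed by epoch. This is a sketch of the Rust-side bookkeeping only: `EpochTable` is a hypothetical type, member lists are plain node-ID vectors, and mirroring the table into GroupMembershipMap is left out.

```rust
use std::collections::BTreeMap;

#[derive(Default)]
pub struct EpochTable {
    entries: BTreeMap<u64, Vec<u64>>, // membership_epoch -> member node IDs
    current: u64,
}

impl EpochTable {
    /// Install a membership snapshot; the highest epoch seen becomes current.
    pub fn install(&mut self, epoch: u64, members: Vec<u64>) {
        self.entries.insert(epoch, members);
        self.current = self.current.max(epoch);
    }

    /// Membership check against a specific epoch, old or current.
    pub fn is_member(&self, epoch: u64, node: u64) -> bool {
        self.entries.get(&epoch).map_or(false, |m| m.contains(&node))
    }

    /// Drop every epoch strictly older than `epoch`, never the current one.
    /// Returns the number of epochs removed.
    pub fn retire_before(&mut self, epoch: u64) -> usize {
        let cutoff = epoch.min(self.current);
        // split_off keeps keys >= cutoff in the returned map.
        let keep = self.entries.split_off(&cutoff);
        let removed = self.entries.len();
        self.entries = keep;
        removed
    }
}
```

Calling `retire_before` only after outstanding proposals for the old epoch have expired gives late responses a valid membership to be checked against.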
FastTransport should retain a normal socket receive path. If eBPF passes a packet, Rust decodes and handles it as a normal HoloFast packet. If HoloFast fails entirely, the transport falls back to gRPC.
Fallback triggers:
large command
large dependency list
missing BPF map entry
quorum map overflow
ringbuf overflow
packet parse error
unsupported version
preaccept mismatch
recovery path
membership transition
network loss/reordering beyond hot-path assumptions
Accord/HoloStore semantics:
coordinator sends PreAccept to participating replicas/electorate
replicas compute conflict/dependency/timestamp information
coordinator may fast-path if responses converge
otherwise coordinator moves to Accept
HoloFast behavior:
TC broadcast:
yes, strong fit
XDP quorum aggregation:
yes, but candidate-only and conservative
Fast ACK:
no initial support; unsafe because response requires actual dependency computation
Fast-path case:
1. Rust serializes one PreAccept request with BROADCAST flag.
2. TC clones it to all fast-electorate members.
3. Replicas process in Rust and send PreAcceptResponse with fixed header.
4. Coordinator XDP sees responses.
5. If responses match compact fast-path conditions, XDP emits PreAcceptCandidateEvent.
6. Rust validates and advances to Commit or Accept as Accord requires.
7. If mismatch/overflow, Rust receives normal responses and executes existing logic.
Accord/HoloStore semantics:
coordinator asks replicas to accept the chosen seq/deps/ballot
replicas respond ok or promised-higher
coordinator waits for simple quorum
HoloFast behavior:
TC broadcast:
yes
XDP quorum aggregation:
yes, strong fit
Fast ACK:
no initial support; user-space replica must still update protocol state
This phase is easier than PreAccept because AcceptResponse is compact.
Accord/HoloStore semantics:
coordinator disseminates decided commit
replicas append/record/apply according to HoloStore's durability/execution design
HoloStore's design notes say the WAL is the authoritative durability source and commits are appended before apply; commit-log appends are batched to amortize syscalls and fsync.15
HoloFast behavior:
TC broadcast:
yes, strong fit
XDP quorum aggregation:
yes, if commit ACKs are required
Fast ACK:
no if ACK implies durable commit; maybe no ACK needed depending on current semantics
Keep recovery on gRPC/control lane initially.
Reason:
variable state
large command transfer
ballot edge cases
must handle old/incomplete transactions
higher correctness risk
Electrode similarly keeps complex, uncommon cases in user space rather than forcing them into eBPF.11
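One way to express this split is a single lane-selection function that every send goes through. The sketch below is a hypothetical illustration: `Phase`, `Lane`, `select_lane`, and the `MAX_FAST_FRAME` bound (picked to stay under a typical 1500-byte MTU) are assumptions, not existing HoloStore APIs.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Phase {
    PreAccept,
    Accept,
    Commit,
    Recover,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Lane {
    HoloFast,
    Grpc,
}

/// Hypothetical upper bound for a single fast-lane datagram, in bytes.
pub const MAX_FAST_FRAME: usize = 1200;

/// Recovery, oversized frames, and a disabled fast lane all go to gRPC.
pub fn select_lane(phase: Phase, frame_len: usize, fast_lane_up: bool) -> Lane {
    if !fast_lane_up || frame_len > MAX_FAST_FRAME {
        return Lane::Grpc;
    }
    match phase {
        Phase::Recover => Lane::Grpc,
        Phase::PreAccept | Phase::Accept | Phase::Commit => Lane::HoloFast,
    }
}
```

Keeping the decision in one place makes it straightforward to count fallbacks by reason later.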
Before changing transport, collect evidence.
Metrics:
transport queue wait
per-phase p50/p95/p99/p999 latency
send/recv syscall rate
context switches
softirq CPU
kernel vs user CPU
protobuf/gRPC encode/decode CPU
batch size distribution
in-flight requests per lane
WAL latency
dependency conflict rate
fast-path success rate
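For the per-phase latency percentiles above, a nearest-rank computation over raw samples is enough for an offline baseline. This is a minimal sketch under stated assumptions: the function name is hypothetical, samples are assumed to be latencies in one consistent unit, and percentiles are given in per-mille (999 for p999) to avoid floating-point rounding.

```rust
/// Nearest-rank percentile. `per_mille` is 500 for p50, 950 for p95,
/// 990 for p99, 999 for p999. Sorts the slice in place.
pub fn percentile_per_mille(samples: &mut [u64], per_mille: u64) -> Option<u64> {
    if samples.is_empty() || per_mille == 0 || per_mille > 1000 {
        return None;
    }
    samples.sort_unstable();
    let n = samples.len() as u64;
    // rank = ceil(per_mille * n / 1000), which always lands in 1..=n
    let rank = (per_mille * n + 999) / 1000;
    Some(samples[(rank - 1) as usize])
}
```

A production metrics path would more likely use a histogram than sorted raw samples; this form is only meant for comparing benchmark runs.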
Tools:
existing HOLOSTATS / HOLOMETRICS
eBPF tracing for syscalls and sched wakeups
perf/flamegraph
packet counters
runtime task latency counters
Exit criteria:
clear evidence that consensus network transport overhead is material
or clear evidence that WAL/storage/dependency conflicts dominate and transport work should wait
Build the fixed-frame UDP transport first. This answers the question: how much does the gRPC/protobuf/HTTP/2 stack cost by itself?
Tasks:
1. Add holo_fast::wire structs.
2. Add encode/decode with exhaustive bounds checks.
3. Add UDP socket send/recv path.
4. Implement per-peer sends matching existing Transport trait.
5. Keep all batching in Rust.
6. Run linearizability tests.
7. Benchmark vs gRPC.
Exit criteria:
HoloFastNoBpf is correct under local 3-node tests
HoloFastNoBpf can run existing benchmarks
fallback to gRPC works
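Tasks 1 and 2 can be illustrated with a small fixed header and a decoder that checks every length before reading. The 32-byte layout below is purely an assumption for illustration (the actual `holo_fast::wire` layout is not specified here); only the magic `HOLF` and version `1` come from this document. All integers are big-endian.

```rust
pub const MAGIC: [u8; 4] = *b"HOLF";
pub const VERSION: u8 = 1;
pub const HEADER_LEN: usize = 32; // hypothetical fixed header size

/// Hypothetical header fields; offsets are documented in `encode`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Header {
    pub phase: u8,
    pub flags: u16,
    pub payload_len: u16,
    pub from_node: u32,
    pub epoch: u64,
    pub range_generation: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Short,
    BadMagic,
    UnsupportedVersion(u8),
    BadHeaderLen(u16),
    BadPayloadLen(u16),
}

fn be_u16(b: &[u8]) -> u16 { u16::from_be_bytes([b[0], b[1]]) }
fn be_u32(b: &[u8]) -> u32 { u32::from_be_bytes([b[0], b[1], b[2], b[3]]) }
fn be_u64(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[..8]);
    u64::from_be_bytes(a)
}

/// Returns None when the payload does not match `payload_len`.
pub fn encode(h: &Header, payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() != h.payload_len as usize {
        return None;
    }
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&MAGIC);                          // bytes 0..4
    out.push(VERSION);                                      // byte 4
    out.push(h.phase);                                      // byte 5
    out.extend_from_slice(&h.flags.to_be_bytes());          // bytes 6..8
    out.extend_from_slice(&(HEADER_LEN as u16).to_be_bytes()); // bytes 8..10
    out.extend_from_slice(&h.payload_len.to_be_bytes());    // bytes 10..12
    out.extend_from_slice(&h.from_node.to_be_bytes());      // bytes 12..16
    out.extend_from_slice(&h.epoch.to_be_bytes());          // bytes 16..24
    out.extend_from_slice(&h.range_generation.to_be_bytes()); // bytes 24..32
    out.extend_from_slice(payload);
    Some(out)
}

/// Validates length, magic, version, header_len, and payload_len, in that order.
pub fn decode(buf: &[u8]) -> Result<(Header, &[u8]), DecodeError> {
    if buf.len() < HEADER_LEN {
        return Err(DecodeError::Short);
    }
    if &buf[..4] != &MAGIC[..] {
        return Err(DecodeError::BadMagic);
    }
    if buf[4] != VERSION {
        return Err(DecodeError::UnsupportedVersion(buf[4]));
    }
    let header_len = be_u16(&buf[8..10]);
    if header_len as usize != HEADER_LEN {
        return Err(DecodeError::BadHeaderLen(header_len));
    }
    let payload_len = be_u16(&buf[10..12]);
    if buf.len() - HEADER_LEN != payload_len as usize {
        return Err(DecodeError::BadPayloadLen(payload_len));
    }
    let h = Header {
        phase: buf[5],
        flags: be_u16(&buf[6..8]),
        payload_len,
        from_node: be_u32(&buf[12..16]),
        epoch: be_u64(&buf[16..24]),
        range_generation: be_u64(&buf[24..32]),
    };
    Ok((h, &buf[HEADER_LEN..]))
}
```

The same check order is what an XDP parser would need to follow, since the verifier rejects any read that is not preceded by a bounds check.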
Tasks (TC broadcast):
1. Add group-aware broadcast API.
2. Add BPF maps for node addresses and group membership.
3. Load TC egress program.
4. Send one BROADCAST packet per phase.
5. Verify every replica receives exactly one packet.
6. Test with 3/5/7 replicas.
Exit criteria:
send syscall count drops for broadcast phases
correctness unchanged
packet loss/retransmission handled by user space
Tasks (XDP Accept/Commit quorum aggregation):
1. Add QuorumKey and QuorumState maps.
2. Add ringbuf events.
3. Aggregate AcceptResponse and CommitResponse.
4. Drop duplicate/non-quorum ACKs in optimized mode.
5. Add debug mode that passes all packets while still producing events.
6. Compare results between debug and optimized modes.
Exit criteria:
receive wakeups drop for Accept/Commit replies
quorum events match user-space counted quorum in debug mode
linearizability passes with aggregation enabled
Tasks (PreAccept candidate aggregation):
1. Add normalized bounded deps encoding.
2. Add deps_digest computation shared by Rust and BPF-visible header.
3. Aggregate only matching compact PreAccept responses.
4. Fall back on mismatch, overflow, reject, higher promised ballot, or missing deps.
5. Verify against user-space shadow counter in debug mode.
Exit criteria:
zero divergence between BPF candidate and Rust shadow validation
low-contention fast-path workloads show fewer wakeups
high-contention workloads fall back safely
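Tasks 1 and 2 depend on every replica producing the same bytes for the same dependency set. A minimal sketch of that normalization and digest follows. The names, the bound of 8 inline deps, the use of `u64` ids, and the choice of FNV-1a as the hash are all assumptions made for illustration; any deterministic hash shared by both sides would serve.

```rust
/// Hypothetical maximum number of dependencies carried inline.
pub const MAX_INLINE_DEPS: usize = 8;

/// Sorts and deduplicates dependency ids so equal sets encode identically.
/// Returns None when the set exceeds the inline bound, which means "use fallback".
pub fn normalize_deps(deps: &[u64]) -> Option<Vec<u64>> {
    let mut v = deps.to_vec();
    v.sort_unstable();
    v.dedup();
    if v.len() > MAX_INLINE_DEPS {
        None
    } else {
        Some(v)
    }
}

/// 64-bit FNV-1a over the little-endian bytes of each id.
/// The hash choice is an assumption, not something the design fixes.
pub fn deps_digest(normalized: &[u64]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for id in normalized {
        for b in id.to_le_bytes().iter() {
            h ^= *b as u64;
            h = h.wrapping_mul(0x100000001b3); // FNV-1a 64-bit prime
        }
    }
    h
}
```

Because the digest is computed over the normalized form, two replicas that report the same dependencies in a different order still produce matching digests, which is what lets the kernel compare responses without parsing the dependency list.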
Tasks (packet steering and optional AF_XDP):
1. Add group_id -> queue/core policy.
2. Evaluate CPUMAP or XSKMAP routing.
3. Optionally add AF_XDP sockets for hot HoloFast packets.
4. Compare against XDP+normal socket mode.
AF_XDP can provide partial or full kernel bypass for selected packets, but it increases complexity and changes the operational model.22
Pursue further offloads only after the first five phases are stable.
Possible future offloads:
duplicate Accept/Commit ACK for already-seen retransmissions
stale epoch/range-generation drop
fast NACK for an obviously stale ballot, provided Rust has synced promised-ballot state to BPF
Do not initially implement PreAccept fast acknowledging. Accord dependency computation belongs in Rust.
Rust should be able to ignore every eBPF event and still make progress through fallback. This protects correctness if:
BPF program unloads
map is full
ringbuf event is dropped
kernel lacks required helper
packet format changes
membership changes mid-flight
Add HOLO_FAST_DEBUG_PASS_ALL=true:
XDP produces quorum events
XDP does not drop any HoloFast replies
Rust receives all replies normally
Rust compares its normal quorum result with the BPF event
a mismatch panics in tests and disables the feature in production
Run this mode in CI/integration tests before optimized dropping is allowed.
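The comparison step can be kept very small. In the sketch below, "match" is assumed to mean that the BPF event never claims a vote Rust did not see itself and that it contains at least a quorum of votes. `shadow_check` and `ShadowGuard` are hypothetical names, and the panic-versus-disable switch stands in for whatever test/production distinction the build uses.

```rust
/// True when the BPF-reported vote set is consistent with Rust's own view:
/// no vote from a replica Rust never heard from, and at least `quorum` votes.
pub fn shadow_check(bpf_bitset: u64, rust_bitset: u64, quorum: u32) -> bool {
    let phantom_votes = bpf_bitset & !rust_bitset;
    phantom_votes == 0 && bpf_bitset.count_ones() >= quorum
}

pub struct ShadowGuard {
    pub aggregation_enabled: bool,
    pub divergences: u64,
    panic_on_divergence: bool, // true in tests, false in production
}

impl ShadowGuard {
    pub fn new(panic_on_divergence: bool) -> Self {
        ShadowGuard { aggregation_enabled: true, divergences: 0, panic_on_divergence }
    }

    /// Compares one BPF quorum event against the Rust-side bitset.
    pub fn observe(&mut self, bpf_bitset: u64, rust_bitset: u64, quorum: u32) {
        if shadow_check(bpf_bitset, rust_bitset, quorum) {
            return;
        }
        self.divergences += 1;
        if self.panic_on_divergence {
            panic!(
                "BPF/Rust quorum divergence: bpf={:#b} rust={:#b}",
                bpf_bitset, rust_bitset
            );
        }
        // Production: keep serving, but stop trusting BPF events.
        self.aggregation_enabled = false;
    }
}
```

The subset test tolerates Rust having seen more replies than BPF had at the moment the event was emitted, which is expected when events and packets arrive on different paths.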
Test corpus:
valid packets for every phase
short Ethernet/IP/UDP frames
bad header_len
bad payload_len
unknown version
wrong magic
unsupported phase
wrong epoch
wrong range_generation
spoofed from_node
duplicate replies
out-of-order replies
large deps overflow
map overflow
ringbuf overflow
Keep using HoloStore's existing Porcupine linearizability check. The README already includes make check-linearizability and a configurable script for workload testing.7
Add specific tests:
3-node no loss
5-node no loss
7-node no loss
single node failure
leader/coordinator failure mid-phase
membership epoch change while proposals in flight
range split during outstanding packets
packet duplication
packet reordering
packet loss
large command fallback
large dependency fallback
mixed gRPC and HoloFast nodes if compatibility is required
For every eBPF drop decision, answer:
Can dropping this packet hide information Rust needs for safety?
Can dropping this packet prevent liveness without timeout fallback?
Can a duplicate packet be mistaken for a unique replica vote?
Can a stale epoch packet be counted for a new epoch?
Can a stale range-generation packet affect a new range?
Can a malicious or buggy node spoof from_node?
Can map eviction create a false quorum?
Can ringbuf loss create a false completion?
A false quorum must be impossible. A lost event is acceptable as long as fallback or timeout handles it.
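Several of these questions (stale epoch, stale range generation, duplicates, map eviction) can be answered structurally by how the quorum table is keyed and bounded. The following Rust model is a sketch under assumptions: the `QuorumKey` fields, the `Verdict` names, and the "never evict, pass through when full" policy are illustrative choices, and a real BPF map would be a fixed-size hash map rather than `HashMap`.

```rust
use std::collections::HashMap;

/// Hypothetical key: a vote only counts for the exact epoch and range generation.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct QuorumKey {
    pub group_id: u32,
    pub epoch: u64,
    pub range_generation: u64,
    pub txn_id: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Counted,
    Duplicate,
    QuorumReached, // returned once, on the vote that completes the quorum
    PassToRust,    // table full or unknown node: let user space decide
}

pub struct QuorumTable {
    capacity: usize,
    quorum: u32,
    entries: HashMap<QuorumKey, u64>, // key -> bitset of replicas that voted
}

impl QuorumTable {
    pub fn new(capacity: usize, quorum: u32) -> Self {
        QuorumTable { capacity, quorum, entries: HashMap::new() }
    }

    pub fn on_reply(&mut self, key: QuorumKey, from_node: u32) -> Verdict {
        if from_node >= 64 {
            return Verdict::PassToRust;
        }
        // Never evict to make room: eviction could reset or merge vote state.
        if !self.entries.contains_key(&key) && self.entries.len() >= self.capacity {
            return Verdict::PassToRust;
        }
        let bits = self.entries.entry(key).or_insert(0);
        let bit = 1u64 << from_node;
        if *bits & bit != 0 {
            return Verdict::Duplicate;
        }
        *bits |= bit;
        if bits.count_ones() == self.quorum {
            Verdict::QuorumReached
        } else {
            Verdict::Counted
        }
    }
}
```

Spoofed `from_node` values are not addressed by this structure; that still needs the address-to-member validation listed in the risk table.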
Benchmark at least:
A. gRPC current baseline
B. HoloFast UDP, no eBPF
C. HoloFast + TC broadcast
D. HoloFast + TC broadcast + XDP Accept/Commit quorum
E. HoloFast + bounded PreAccept candidate aggregation
F. Optional AF_XDP / packet steering
Replica counts:
3 replicas
5 replicas
7 replicas
Electrode's improvements grew with replica count because broadcast and quorum fan-in costs grow with the number of replicas.4
Workloads:
low-contention SET
hot-key SET with high dependency conflicts
mixed GET/SET
small values
large values requiring command fetch fallback
Redis benchmark with high pipeline depth
single client latency-sensitive workload
many-client throughput workload
failure/retransmission workload
range split/merge workload if implemented
Metrics:
throughput ops/s
p50/p95/p99/p999 latency
kernel CPU
user CPU
softirq CPU
send syscalls/s
recv syscalls/s
context switches/s
packets passed/dropped/redirected by XDP
TC clone count
ringbuf events/s
ringbuf drops
quorum map occupancy
fallback ratio by reason
fast-path success ratio
WAL latency distribution
Do not assume Electrode's exact gains will transfer. HoloStore uses Accord, not Multi-Paxos, and its dependency computation and WAL behavior add costs that transport changes cannot remove. The working hypotheses are:
TC broadcast should improve throughput as replica count increases.
XDP Accept/Commit aggregation should reduce coordinator wakeups.
PreAccept candidate aggregation should help low-contention, small-deps workloads.
Transport wins will be smaller when WAL fsync, storage, or dependency conflicts dominate.
| Option | Pros | Cons |
|---|---|---|
| Keep gRPC everywhere | Simple, portable, mature, easier debugging | Poor fit for packet-level eBPF, HTTP/2/protobuf overhead remains, hard to clone/aggregate phase packets |
| Split hot/cold transport | eBPF can parse fixed headers, easier broadcast/quorum offload, keeps gRPC for complex paths | More code, compatibility matrix, new packet format, new operational/debug burden |
Recommendation: split transports. Keep gRPC for control/recovery and use HoloFast only for hot consensus phases.
| Option | Pros | Cons |
|---|---|---|
| UDP/fixed datagrams | eBPF-friendly, single-packet parse, easy XDP/TC logic, matches Electrode model | Must implement retransmission, duplicate handling, path MTU discipline, loss/reordering behavior |
| TCP/custom framing | Reliable stream, familiar operational behavior | Harder for XDP to parse message boundaries, packet fragmentation/coalescing complicates eBPF, less Electrode-like |
Recommendation: use fixed-frame UDP datagrams for the hot lane, with application-level retransmission. Keep recovery and large data on gRPC.
| Option | Pros | Cons |
|---|---|---|
| Rust-only | Simpler, easier correctness reasoning | Every reply wakes user space; cost grows with replicas |
| eBPF aggregation | Fewer wakeups, fewer kernel/user crossings, better fit for high replica counts | BPF map complexity, verifier constraints, fallback paths required |
Recommendation: start with Accept/Commit aggregation, then add conservative PreAccept candidate aggregation.
| Option | Pros | Cons |
|---|---|---|
| Pass all replies | Easiest correctness debugging | Smaller performance gain |
| Drop duplicate/non-quorum replies | Maximum wakeup reduction | Must prove Rust does not need the dropped packets |
Recommendation: implement debug shadow mode first. In production, drop only for phases where header data is sufficient, such as duplicate ACKs and compact Accept/Commit responses. For PreAccept, drop only after the event contains all required bounded data or use fallback.
| Option | Pros | Cons |
|---|---|---|
| Digest only | Very easy for eBPF to compare | Rust may not have actual dependencies if packets were dropped |
| Bounded inline deps | Rust can validate compact fast-path event | Needs fixed max, fallback when deps are large |
| Full variable deps | More complete | Bad fit for eBPF verifier and bounded parsing |
Recommendation: use digest + bounded inline normalized deps. Fall back when deps exceed the bound.
| Option | Pros | Cons |
|---|---|---|
| Aya | Rust-native, nice integration with HoloStore, shared Rust structs easier | Production CO-RE/kernel compatibility may require careful testing |
| libbpf-rs + C eBPF | Mature CO-RE workflow, closer to many production BPF examples | Mixed Rust/C build, more FFI/build complexity |
Recommendation: prototype with Aya if Rust-native elegance matters most; use libbpf-rs/CO-RE if deployment stability across kernels becomes the priority.
| Option | Pros | Cons |
|---|---|---|
| XDP/TC + normal sockets | Kernel-native, no busy polling, simpler ops, closest to Electrode | Not as fast as kernel bypass |
| AF_XDP | More performance potential, partial kernel bypass | More complex socket/queue management |
| DPDK | Highest ceiling | Busy polling, dedicated cores, custom network stack burden |
Electrode compares its kernel-native approach with kernel bypass and notes the tradeoff: eBPF retains kernel-networking benefits but does not beat pure kernel bypass in raw performance.6
Recommendation: implement XDP/TC first. Consider AF_XDP only after measurements show normal sockets remain the bottleneck.
| Option | Pros | Cons |
|---|---|---|
| Copy Electrode fast ACK directly | Potential latency win | Unsafe for Accord PreAccept because dependencies must be computed by the replica |
| Do not fast ACK | Correctness is simpler | Leaves some latency win unrealized |
| Fast ACK only special cases | Some win for duplicates/stale messages | Requires careful state sync from Rust to BPF |
Recommendation: do not implement general fast ACK at first. Add only duplicate/stale special cases after the rest is validated.
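Loading eBPF programs usually requires elevated privileges or specific capabilities (on recent kernels typically CAP_BPF plus CAP_NET_ADMIN for XDP and TC programs, or CAP_SYS_ADMIN on older ones). HoloStore should support: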
--transport grpc
--transport holofast
--transport holofast-bpf
If eBPF load fails:
log exact reason
continue with HoloFastNoBpf or gRPC depending on config
export metric holo_fast_bpf_load_failed=1
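The mode selection and load-failure behavior above can be sketched as two small functions. This is a hypothetical illustration: `TransportMode`, `parse_transport`, and `effective_mode` are invented names, the `String` error stands in for whatever error type the BPF loader returns, and the second tuple element models the value of the `holo_fast_bpf_load_failed` gauge.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TransportMode {
    Grpc,
    HoloFast,    // fixed-frame UDP, no eBPF
    HoloFastBpf, // fixed-frame UDP plus eBPF offloads
}

/// Parses the value of the `--transport` flag.
pub fn parse_transport(arg: &str) -> Option<TransportMode> {
    match arg {
        "grpc" => Some(TransportMode::Grpc),
        "holofast" => Some(TransportMode::HoloFast),
        "holofast-bpf" => Some(TransportMode::HoloFastBpf),
        _ => None,
    }
}

/// Returns the mode actually used and the value for the load-failed gauge.
/// `allow_udp_fallback` models the config choice between HoloFastNoBpf and gRPC.
pub fn effective_mode(
    requested: TransportMode,
    bpf_load: Result<(), String>,
    allow_udp_fallback: bool,
) -> (TransportMode, u8) {
    if requested != TransportMode::HoloFastBpf {
        return (requested, 0); // no BPF load was needed
    }
    match bpf_load {
        Ok(()) => (TransportMode::HoloFastBpf, 0),
        Err(reason) => {
            // Log the exact reason, then degrade instead of failing startup.
            eprintln!("holofast: eBPF load failed: {}", reason);
            let mode = if allow_udp_fallback {
                TransportMode::HoloFast
            } else {
                TransportMode::Grpc
            };
            (mode, 1)
        }
    }
}
```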
Use strict wire versioning:
magic = HOLF
version = 1
feature bitmap
min/max supported version in node handshake
Packets carrying an unknown version must be passed to user space untouched, or the node pair must fall back to gRPC.
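The min/max exchange in the node handshake reduces to an interval intersection. A minimal sketch, assuming each node advertises a closed version range and a feature bitmap; `VersionRange`, `negotiate`, and `common_features` are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VersionRange {
    pub min: u8,
    pub max: u8,
}

/// Picks the highest wire version both nodes support.
/// None means there is no overlap and the pair should stay on gRPC.
pub fn negotiate(local: VersionRange, peer: VersionRange) -> Option<u8> {
    let lo = local.min.max(peer.min);
    let hi = local.max.min(peer.max);
    if lo <= hi {
        Some(hi)
    } else {
        None
    }
}

/// Only features advertised by both sides may be used.
pub fn common_features(local: u32, peer: u32) -> u32 {
    local & peer
}
```

For example, a node supporting versions 1 to 3 talking to a node supporting 2 to 5 would settle on version 3, while ranges 1 to 1 and 2 to 3 have no overlap and fall back.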
Add metrics:
holo_fast_packets_rx_total{phase,kind}
holo_fast_packets_tx_total{phase,kind}
holo_fast_bpf_broadcast_clones_total
holo_fast_bpf_quorum_events_total{phase}
holo_fast_bpf_dropped_replies_total{phase,reason}
holo_fast_bpf_fallback_total{reason}
holo_fast_bpf_map_overflow_total{map}
holo_fast_preaccept_candidate_total
holo_fast_preaccept_candidate_rejected_total{reason}
holo_fast_transport_mode
Feature flags should allow rollback without rebuild:
Disable PreAccept candidate aggregation.
Disable XDP quorum aggregation.
Disable TC broadcast.
Disable all eBPF but keep HoloFast UDP.
Disable HoloFast and use gRPC.
| Risk | Mitigation |
|---|---|
| False quorum event | Rust validation; shadow mode; bitset per replica; membership epoch in key |
| Dropping data Rust needs | Drop only when event contains sufficient data; PreAccept conservative fallback |
| Membership/range race | Include membership epoch and range generation in every packet and map key |
| Map overflow | Pass to user space; bounded inflight; metrics and alerts |
| Ringbuf overflow | Pass packet; event loss never completes proposal by itself |
| Packet spoofing | Validate node address/member mapping; optionally authenticate packets later |
| Kernel incompatibility | Version-gate helpers; fallback to no-BPF/gRPC |
| Debuggability | Debug pass-all mode; packet trace logs; per-reason counters |
| Performance regression | Stage-by-stage benchmarks; feature flags |
| Complexity creep | Keep recovery/WAL/deps computation out of eBPF |
Deliverables:
wire structs
encode/decode
fuzz tests
packet parser tests
feature flags
no runtime behavior change
Deliverables:
FastTransport implementing existing Transport trait
UDP send/recv
timeouts/retransmission
gRPC fallback
benchmark mode
linearizability pass
Deliverables:
BroadcastTransport trait
Accord coordinator uses group broadcast when available
fallback adapter sends per peer over old Transport
no eBPF yet
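The broadcast trait and its per-peer fallback adapter might look like the sketch below. The `Transport` trait here is a simplified stand-in, since the signature of HoloStore's existing trait is not shown in this document; `PerPeerFallback` and `Recording` are hypothetical helpers, the latter existing only so the adapter can be exercised without a network.

```rust
pub type NodeId = u32;

/// Simplified stand-in for the existing per-peer transport trait (assumed shape).
pub trait Transport {
    fn send(&mut self, to: NodeId, frame: &[u8]);
}

/// Group-aware API the Accord coordinator would call when available.
pub trait BroadcastTransport {
    /// Sends `frame` to every member of `group`; returns the number of
    /// send operations issued by this layer.
    fn broadcast(&mut self, group: &[NodeId], frame: &[u8]) -> usize;
}

/// Fallback adapter: implements broadcast as one send per peer.
pub struct PerPeerFallback<T: Transport> {
    pub inner: T,
}

impl<T: Transport> BroadcastTransport for PerPeerFallback<T> {
    fn broadcast(&mut self, group: &[NodeId], frame: &[u8]) -> usize {
        for &peer in group {
            self.inner.send(peer, frame);
        }
        group.len()
    }
}

/// Test double that records what was sent.
#[derive(Default)]
pub struct Recording {
    pub sent: Vec<(NodeId, Vec<u8>)>,
}

impl Transport for Recording {
    fn send(&mut self, to: NodeId, frame: &[u8]) {
        self.sent.push((to, frame.to_vec()));
    }
}
```

A later eBPF-backed implementation of `BroadcastTransport` would return 1 for the same call, which gives a direct way to assert the expected drop in send operations.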
Deliverables:
BPF loader
membership maps
TC program
clone counters
debug packet capture test
Deliverables:
XDP dispatcher
quorum maps
ringbuf event loop
shadow validation mode
optimized drop mode behind flag
Deliverables:
deps digest
bounded normalized deps encoding
candidate quorum event
fallback reasons
shadow validation
The project is successful if:
Correctness:
all existing linearizability tests pass
failure/retransmission tests pass
shadow mode shows zero BPF/Rust quorum divergence
Performance:
send syscall count drops for broadcast phases
recv wakeups drop for Accept/Commit quorum phases
p99 latency improves on low-contention small-message workloads
throughput improves as replica count grows
Operability:
feature flags can disable each optimization
fallback is automatic and visible in metrics
gRPC control/recovery lane remains available
The project should not be judged a success if measurements show that the only material bottleneck is WAL/fsync, storage, or dependency conflict rate. In that case the custom transport may still be useful, but eBPF work should not be prioritized until the bottleneck moves back to networking.
Footnotes
1. Yang Zhou, Zezhou Wang, Sowmya Dharanipragada, and Minlan Yu, "Electrode: Accelerating Distributed Protocols with eBPF," NSDI 2023. The abstract states that Electrode executes optimizations in the kernel before the networking stack and reports up to 128.4% throughput improvement and 41.7% latency improvement for classic Multi-Paxos. Paper: https://www.usenix.org/system/files/nsdi23-zhou.pdf
2. Electrode paper, Section 1 and Table 1, discussing overhead from user/kernel crossings and kernel networking stack traversal in Paxos deployments. https://www.usenix.org/system/files/nsdi23-zhou.pdf
3. Electrode paper, Figure 1 and Section 4, describing offloads for message broadcasting, fast acknowledging, and wait-on-quorums. https://www.usenix.org/system/files/nsdi23-zhou.pdf
4. Electrode paper, Section 7.1, reporting larger throughput improvements as replica count increases. https://www.usenix.org/system/files/nsdi23-zhou.pdf
5. Electrode paper, Section 7.2 and Figure 6, breaking down message broadcasting, fast acknowledging, and wait-on-quorums contributions. https://www.usenix.org/system/files/nsdi23-zhou.pdf
6. Electrode paper, Section 7.5, comparing Electrode with kernel bypass and discussing the performance/operational tradeoff. https://www.usenix.org/system/files/nsdi23-zhou.pdf
7. HoloStore README, describing HoloStore as a strongly consistent key/value store built on Accord with pre-accept, accept, commit, dependency tracking, fast path, per-partition Accord groups, WAL, range generation IDs, Redis-compatible benchmarking, and linearizability checks. https://github.com/kellabyte/holostore
8. HoloStore `transport.rs`, whose top-level comments describe the current gRPC transport layer for Accord quorum rounds/read paths, per-peer batching workers, bounded in-flight RPC concurrency, and telemetry. https://raw.githubusercontent.com/kellabyte/holostore/main/crates/holo_store/src/transport.rs
9. HoloStore `holo.proto`, which defines the gRPC service and packed unary v2 batch methods for hot consensus/read paths. https://github.com/kellabyte/holostore/blob/main/crates/holo_store/proto/holo.proto
10. HoloStore `holo.proto`, definitions for `PreAcceptRequest`, `PreAcceptResponse`, `AcceptRequest`, `AcceptResponse`, `CommitRequest`, `CommitResponse`, and `RecoverResponse`. https://github.com/kellabyte/holostore/blob/main/crates/holo_store/proto/holo.proto
11. Electrode paper, Sections 3, 4, and 8, discussing eBPF verifier constraints, bounded/static memory behavior, and the division between simple fast-path operations in kernel and complex protocol behavior in user space. https://www.usenix.org/system/files/nsdi23-zhou.pdf
12. Linux kernel documentation, eBPF verifier, describing safety checks for eBPF programs. https://docs.kernel.org/bpf/verifier.html
13. Electrode paper, Section 4.1, message broadcasting in TC using `bpf_clone_redirect()` to clone and rewrite packets in kernel. https://www.usenix.org/system/files/nsdi23-zhou.pdf
14. Electrode paper, Section 4.3, wait-on-quorums using eBPF-maintained bitsets and forwarding quorum-reaching packets/events. https://www.usenix.org/system/files/nsdi23-zhou.pdf
15. HoloStore design notes, describing the separation of consensus, WAL/durability, execution, batching, and range generations. https://raw.githubusercontent.com/kellabyte/holostore/main/crates/holo_store/docs/DESIGN.md
16. Electrode paper, Section 1, noting the prototype targets UDP protocols, uses application-level retransmission, and fits small Paxos messages/datacenter conditions. https://www.usenix.org/system/files/nsdi23-zhou.pdf
17. docs.ebpf.io, `BPF_PROG_TYPE_XDP`, describing XDP programs attached to network devices and packet actions such as pass, drop, redirect, and manipulation. https://docs.ebpf.io/linux/program-type/BPF_PROG_TYPE_XDP/
18. Accord CEP-15 draft whitepaper, Sections 3.1 and 3.2, describing PreAccept, Accept, Commit, Execute, Apply, fast path, timestamps, and dependency responses. https://cwiki.apache.org/confluence/download/attachments/188744725/Accord.pdf
19. Linux kernel documentation, `BPF_MAP_TYPE_XSKMAP`, describing XDP redirection of raw frames to AF_XDP sockets. https://docs.kernel.org/bpf/map_xskmap.html
20. Linux kernel documentation, BPF ring buffer, describing the ring buffer design/API for BPF-to-user-space event communication. https://www.kernel.org/doc/html/next/bpf/ringbuf.html
21. Electrode paper, Sections 4.2 and 4.3, describing fixed-size in-kernel structures and fallback/forwarding to user space when the eBPF path cannot handle a case. https://www.usenix.org/system/files/nsdi23-zhou.pdf
22. docs.ebpf.io, AF_XDP, describing AF_XDP sockets and partial/full kernel bypass with XDP. https://docs.ebpf.io/linux/concepts/af_xdp/