I am documenting how I installed zeek 3.0 on my Linux machine, which has 36 cores (72 with hyperthreading), using pfring to distribute the load. The OS is Ubuntu 18.04.3.
My monitoring interfaces are enp134s0f0 and enp216s0f0. The driver is i40e, which pfring supports according to https://www.ntop.org/guides/pf_ring/zc.html.
zeek 3 supports OpenSSL's 1.1 API. You will likely still need to install the OpenSSL development package:
apt install libssl1.0-dev
I used the development version from the master branch. Do not forget to use the --recursive flag when cloning the repository.
I run ./zeek -i enp134s0f0 to see if anything is captured (yes). Don't forget to bring the interface up with ifconfig first.
Checksum offloading on the NIC is a very common problem, so I use ethtool to deactivate all of it. The feature list before the change (output of ethtool -k):
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: on
receive-hashing: on
highdma: on
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]
Deactivating offloading is trial-and-error. The ethtool-internal names of each property can usually be inferred by taking the first letters of each word, e.g. gro for generic-receive-offload. There are exceptions, e.g.:
rx rx-checksumming
tx tx-checksumming
sg scatter-gather
Deactivating the higher-level properties seems to deactivate the nested ones as well. Some commands seem to switch off other segmentation offloading features at the same time, too.
Caveats for my NIC:
- ufo cannot be changed (but was set to off automatically)
- lro cannot be changed (but was set to off automatically)
This command will do the trick:
for NIC in enp216s0f0 enp134s0f0; do
  for ANNOYING in rx tx sg tso gso gro; do
    ethtool -K "$NIC" "$ANNOYING" off
  done
done
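Since this is trial-and-error, it helps to check what is still switched on afterwards. The following is only a sketch: the function name features_still_on is my own, and it assumes the usual "name: state" layout of ethtool -k output, skipping everything marked [fixed] because those features cannot be changed anyway.

```shell
# Reads `ethtool -k <nic>` output on stdin and prints the names of all
# features that are still on and not marked [fixed].
# features_still_on is a hypothetical helper name, not part of ethtool.
features_still_on() {
  awk -F': ' '$2 ~ /^on/ && $2 !~ /\[fixed\]/ {
    gsub(/^[ \t]+/, "", $1)   # nested features are indented
    print $1
  }'
}

# Usage (not executed here):
#   ethtool -k enp134s0f0 | features_still_on
```

If the checksumming and segmentation entries no longer show up in that list, the loop has done its job.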
Now test with ./zeek -i enp134s0f0. Check conn.log to see whether we get full connections, i.e. the flags S...fF.
Trying to build pfring 7.4 as described in its README fails (good job). You need to build as described here: https://www.ntop.org/guides/pf_ring/get_started/git_installation.html
Hence, install pfring 7.4.0 from package (the github repo sees a lot of changes even to tagged versions, so you need to choose a release). Install as described here:
cd PF_RING/kernel
make
sudo make install
It needs to be a system-wide install; my attempt to use ./configure --prefix did not result in a build in a custom dir.
Try to insert the module:
insmod ./pf_ring.ko
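To see whether the module really got inserted, the lsmod listing can be checked. A minimal sketch, assuming the standard lsmod layout with one header line and the module name in the first column; module_listed is a name I made up for illustration.

```shell
# Reads `lsmod` output on stdin; exit status 0 if the given module is listed.
# module_listed is a hypothetical helper name.
module_listed() {
  awk -v mod="$1" 'NR > 1 && $1 == mod { found = 1 } END { exit !found }'
}

# Usage (not executed here):
#   lsmod | module_listed pf_ring && echo "pf_ring is loaded"
```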
I have not yet tried the ZC drivers. pfring needs a custom libpcap to replace the system one:
cd PF_RING/userland/lib
./configure && make
sudo make install
cd ../libpcap
./configure && make
sudo make install
Then configure bro as described here: https://www.ntop.org/guides/pf_ring/thirdparty/bro.html
Use the following flags: ./configure --prefix=$HOME/bin/zeek --with-pcap=/usr/local/lib
I also tried --with-openssl=$HOME/bin/openssl_1.2.0, but that failed: zeek complains about an OpenSSL version <= 0.9.7. Apparently, however, there was a 1.0.2? on my system (probably from an earlier system-wide install).
Checking with ldd (and checking the timestamp of the file) shows that zeek uses the new libpcap:
ldd bro | grep pcap
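The same check can be made a bit more explicit by extracting just the resolved path. This is a sketch under the assumption that ldd prints lines of the form "libpcap.so.N => /path (address)"; pcap_path is a hypothetical helper name. With the install steps above I would expect the path to be under /usr/local/lib.

```shell
# Reads `ldd <binary>` output on stdin and prints the path that the
# libpcap shared object resolves to.
# pcap_path is a hypothetical helper name.
pcap_path() {
  awk '$1 ~ /^libpcap\.so/ && $2 == "=>" { print $3 }'
}

# Usage (not executed here):
#   ldd bro | pcap_path
```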
Capture loss seems OK (very low):
ralph@ngara:~/bin/zeek/logs$ cat current/capture_loss.log | ../bin/bro-cut percent_lost
0.000573
0.000703
0.001729
0.000367
0.001342
0.001775
0.000962
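Instead of eyeballing the column, the values can be summarised. A small sketch, assuming one percent_lost value per line on stdin (as bro-cut prints them); loss_summary is my own name for it.

```shell
# Reads one percent_lost value per line on stdin and prints the number of
# samples, the mean and the maximum. Blank lines are ignored.
# loss_summary is a hypothetical helper name.
loss_summary() {
  awk 'NF { n++; sum += $1; if ($1 + 0 > max) max = $1 + 0 }
       END { if (n) printf "n=%d mean=%.6f max=%.6f\n", n, sum / n, max }'
}

# Usage (not executed here):
#   cat current/capture_loss.log | ../bin/bro-cut percent_lost | loss_summary
```

For the seven values above this prints n=7 mean=0.001064 max=0.001775.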
But conn.log has this:
cat current/conn.log | ../bin/bro-cut history | sort | uniq -c | sort -rn | less
8784648 S
1353357 D
753330 Dd
752877 ^d
743630 ^hadf
718049 SAD
382076 SADF
346580 ShADadFf
200313 ShADadfF
141600 -
135312 FA
133080 ^hdaf
126887 DAF
120249 R
104406 Sr
89951 ^f
88091 ShADdaFf
76630 ^had
75983 ShADadFfR
That's too few ShADadFf?
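To put a number on that question, the share of one history string among all connections can be computed from the "count history" pairs that uniq -c prints. A sketch only: history_share is a hypothetical name, and the percentage refers to whatever lines are fed in, so it needs the full listing rather than just the top of it.

```shell
# Reads "count history" pairs (as printed by `uniq -c`) on stdin and prints
# the percentage of connections whose history equals the first argument.
# history_share is a hypothetical helper name.
history_share() {
  awk -v want="$1" '{ total += $1; if ($2 == want) hit += $1 }
       END { if (total) printf "%.2f\n", 100 * hit / total }'
}

# Usage (not executed here):
#   cat current/conn.log | ../bin/bro-cut history | sort | uniq -c | history_share ShADadFf
```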
weird.log has this:
243258 possible_split_routing
238643 data_before_established
101396 inappropriate_FIN
These seem to point at a configuration problem in the switch.
- Check with ICT if routing is correct or not.
- Deactivate hashing? ethtool -K enp134s0f0 rxhash off