@ralphholz
Last active January 16, 2020 21:23

Installing zeek (bro) on Linux in single-host cluster mode with pfring

I am documenting how I installed zeek (bro) on my Linux machine, which has 36 cores (72 with hyperthreading), using pfring to distribute the load.

NICs and drivers

My monitoring interfaces are enp134s0f0 and enp216s0f0. The driver is i40e, which pfring supports according to https://www.ntop.org/guides/pf_ring/zc.html.

Install openssl

zeek does not yet support OpenSSL's 1.1 API, so we need an older openssl than the one shipped with Ubuntu 18.04.1:

make -j 32
make test
make install

Compile and install zeek as described in README

I used the development version of the master branch. Do not forget to use the --recursive flag when cloning the repository.

Test zeek

I run ./bro -i enp134s0f0 to check whether anything is captured (it is). Don't forget to bring the interface up first (ifconfig enp134s0f0 up).

Problem 1: checksumming is offloaded

This is a very common problem. I use ethtool to deactivate all checksum offloading on the NIC. Before the change, ethtool -k on the interface reports:

rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: on
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: on
receive-hashing: on
highdma: on
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]

Deactivating offloading is trial and error. The short names that ethtool -K accepts can usually be inferred by taking the first letter of each word, e.g. gro for generic-receive-offload. There are exceptions, e.g.:

rx  rx-checksumming
tx  tx-checksumming
sg  scatter-gather

Deactivating the higher-level properties seems to deactivate the nested ones as well. Some commands seem to switch off other segmentation offloading features at the same time, too.

Caveats for my NIC:

  • ufo cannot be changed (but was set to off automatically)
  • lro cannot be changed (but was set to off automatically)

This command will do the trick:

# Switch off checksum and segmentation offloading on both monitoring NICs (run as root)
for NIC in enp216s0f0 enp134s0f0; do
    for ANNOYING in rx tx sg tso gso gro; do ethtool -K "$NIC" "$ANNOYING" off; done
done
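
To confirm that the loop took effect, the ethtool -k output can be filtered for top-level offloads that are still on. This is only a minimal sketch: the helper name offloads_still_on is hypothetical, and the feature list is taken from the ethtool output shown earlier, so it may need adjusting for other NICs.

```shell
# Hypothetical helper: reads `ethtool -k <nic>` output on stdin and prints
# every top-level offload from the list below that is still "on".
# Indented sub-features (e.g. tx-checksum-ipv4) are deliberately ignored.
offloads_still_on() {
    features="rx-checksumming tx-checksumming scatter-gather tcp-segmentation-offload udp-fragmentation-offload generic-segmentation-offload generic-receive-offload large-receive-offload"
    awk -F': ' -v list="$features" '
        BEGIN { n = split(list, names, " "); for (i = 1; i <= n; i++) wanted[names[i]] = 1 }
        ($1 in wanted) && $2 ~ /^on/ { print $1 }
    '
}

# Usage (prints nothing when everything in the list is off):
#   ethtool -k enp134s0f0 | offloads_still_on
```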

Now test with ./bro -i enp134s0f0.

Then check conn.log for complete connections, i.e. history values of the form S...fF.

Trying pfring

Building 7.4 as described in its README fails (good job). You need to build as described here instead: https://www.ntop.org/guides/pf_ring/get_started/git_installation.html

Hence, install pfring 7.4.0 from a release package (the GitHub repo sees a lot of changes, even to tagged versions, so you need to pick a release). Build and install the kernel module:

cd PF_RING/kernel
make
sudo make install

It needs to be a system-wide install; my attempt to use ./configure --prefix did not result in a build in a custom dir.

Try to insert the module:

insmod ./pf_ring.ko

I have not tried the ZC drivers yet. pfring needs its own libpcap, which replaces the system one:

cd PF_RING/userland/lib
./configure && make
sudo make install
cd ../libpcap
./configure && make
sudo make install

Then configure bro as described here: https://www.ntop.org/guides/pf_ring/thirdparty/bro.html
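
For a single-host cluster, the configuration in that guide comes down to a node.cfg in which each worker uses lb_method=pf_ring and lb_procs. The following is a sketch rather than my actual configuration: the helper name make_node_cfg is hypothetical, the number of processes per interface is a placeholder to tune against the available cores, and the output path in the usage comment assumes the --prefix used in the next section.

```shell
# Hypothetical helper: prints a single-host node.cfg with one PF_RING
# load-balanced worker entry per monitoring interface.
# Usage: make_node_cfg <lb_procs> <interface>...
make_node_cfg() {
    procs=$1
    shift
    printf '[manager]\ntype=manager\nhost=localhost\n\n'
    printf '[proxy-1]\ntype=proxy\nhost=localhost\n'
    i=1
    for nic in "$@"; do
        printf '\n[worker-%d]\ntype=worker\nhost=localhost\n' "$i"
        printf 'interface=%s\nlb_method=pf_ring\nlb_procs=%s\n' "$nic" "$procs"
        i=$((i + 1))
    done
}

# Usage (10 processes per interface is an arbitrary example value):
#   make_node_cfg 10 enp134s0f0 enp216s0f0 > "$HOME/bin/zeek/etc/node.cfg"
```

Each worker entry is expanded into lb_procs processes, so the total across both interfaces should stay below the number of physical cores.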

Recompile zeek

Use the following flags: ./configure --prefix=$HOME/bin/zeek --with-pcap=/usr/local/lib

I also tried --with-openssl=$HOME/bin/openssl_1.2.0, but that failed: zeek complains about an OpenSSL version <= 0.9.7. Apparently, however, there was a 1.0.2? on my system (probably from an earlier system-wide install).

A check with ldd (plus the timestamp of the library file) confirms that zeek uses the new libpcap:

ldd bro | grep pcap
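
The same check can be scripted so that it fails loudly if the binary still links against the system libpcap. A small sketch, assuming the pfring libpcap was installed under /usr/local/lib as above; the helper name pcap_from_prefix is hypothetical.

```shell
# Hypothetical helper: reads `ldd <binary>` output on stdin, prints the path
# each libpcap entry resolves to, and succeeds only if at least one entry
# exists and all of them live below the given prefix (default /usr/local/lib).
pcap_from_prefix() {
    prefix=${1:-/usr/local/lib}
    awk -v p="$prefix/" '
        /libpcap/ { found = 1; print $3; if (index($3, p) != 1) bad = 1 }
        END { exit (found && !bad) ? 0 : 1 }
    '
}

# Usage:
#   ldd bro | pcap_from_prefix || echo "bro is not using the pfring libpcap"
```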

Packet loss in capture_loss.log / conn.log

Seems OK (very low):

ralph@ngara:~/bin/zeek/logs$ cat current/capture_loss.log | ../bin/bro-cut percent_lost
0.000573
0.000703
0.001729
0.000367
0.001342
0.001775
0.000962
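
Rather than eyeballing the column, the values can be reduced to a maximum and a mean. A minimal sketch; the helper name loss_summary is hypothetical and it expects one percent_lost value per line, as bro-cut prints them.

```shell
# Hypothetical helper: reads one percent_lost value per line on stdin and
# prints the number of samples, the maximum, and the mean.
loss_summary() {
    awk '
        NF == 0 { next }                      # skip empty lines
        { n++; sum += $1; if (n == 1 || $1 > max) max = $1 }
        END {
            if (n == 0) { print "n=0"; exit }
            printf "n=%d max=%.6f mean=%.6f\n", n, max, sum / n
        }
    '
}

# Usage:
#   cat current/capture_loss.log | ../bin/bro-cut percent_lost | loss_summary
```

For the seven values above this prints n=7 max=0.001775 mean=0.001064.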

But conn.log has this:

cat current/conn.log | ../bin/bro-cut history | sort | uniq -c | sort -rn | less
8784648 S
1353357 D
753330 Dd
752877 ^d
743630 ^hadf
718049 SAD
382076 SADF
346580 ShADadFf
200313 ShADadfF
141600 -
135312 FA
133080 ^hdaf
126887 DAF
120249 R
104406 Sr
 89951 ^f
 88091 ShADdaFf
 76630 ^had
 75983 ShADadFfR

Isn't that too few ShADadFf, compared to the almost 8.8 million connections that only show S?
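
One way to put a number on this is to compute the share of connections whose history matches a complete handshake and teardown. This is a sketch only: the helper name history_share is hypothetical, and the regular expression in the usage comment is just one possible definition of a "complete" connection.

```shell
# Hypothetical helper: reads one history string per line on stdin and prints
# "<matching> <total> <percent>" for the extended regex given as $1.
history_share() {
    awk -v re="$1" '
        $0 ~ re { hit++ }
        { total++ }
        END {
            if (total == 0) { print "0 0 0.00"; exit }
            printf "%d %d %.2f\n", hit, total, 100 * hit / total
        }
    '
}

# Usage: share of connections with SYN, SYN-ACK, data both ways and both FINs
#   cat current/conn.log | ../bin/bro-cut history | history_share '^ShADad(Ff|fF)'
```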

weird.log

 243258 possible_split_routing
 238643 data_before_established
 101396 inappropriate_FIN

These seem to point at a configuration problem in the switch.

  • Check with ICT if routing is correct or not.
  • Deactivate receive hashing? ethtool -K enp134s0f0 rxhash off