@nickwallen
Last active May 10, 2017 22:14

Test1

Create a new Topic

[root@y136 ~]# kafka-topics.sh --create --topic pcap-nick-v2 --partitions 12 --replication-factor 1 --zookeeper y113:2181
Created topic "pcap-nick-v2".

Setup ACLs for the new Topic

[root@y136 ~]# kafka-acls.sh \
>     --authorizer kafka.security.auth.SimpleAclAuthorizer \
>     -authorizer-properties zookeeper.connect=y113:2181 \
>     --add \
>     --allow-principal User:metron \
>     --topic pcap-nick-v2 \
>     --group metron
Adding ACLs for resource `Topic:pcap-nick-v2`:
 	User:metron has Allow permission for operations: All from hosts: *

Adding ACLs for resource `Group:metron`:
 	User:metron has Allow permission for operations: All from hosts: *

Current ACLs for resource `Topic:pcap-nick-v2`:
 	User:metron has Allow permission for operations: All from hosts: *

Current ACLs for resource `Group:metron`:
 	User:metron has Allow permission for operations: All from hosts: *
  
  
[root@y136 ~]# kafka-acls.sh \
>     --authorizer kafka.security.auth.SimpleAclAuthorizer \
>     -authorizer-properties zookeeper.connect=y113:2181 \
>     --add \
>     --allow-principal User:fastcapa \
>     --topic pcap-nick-v2 \
>     --group fastcapa
Adding ACLs for resource `Topic:pcap-nick-v2`:
 	User:fastcapa has Allow permission for operations: All from hosts: *

Adding ACLs for resource `Group:fastcapa`:
 	User:fastcapa has Allow permission for operations: All from hosts: *

Current ACLs for resource `Topic:pcap-nick-v2`:
 	User:fastcapa has Allow permission for operations: All from hosts: *
	User:metron has Allow permission for operations: All from hosts: *

Current ACLs for resource `Group:fastcapa`:
 	User:fastcapa has Allow permission for operations: All from hosts: *

Start Fastcapa

[root@y138 metron-sensors]# fastcapa -l 8-15 --huge-dir /mnt/huge_1GB -- -t pcap-nick-v2 -c /etc/fastcapa.ycluster -b 128 -x 262144
EAL: Detected 32 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:01:00.1 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 net_enic
EAL: PCI device 0000:0a:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 net_enic
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:81:00.1 on NUMA socket 1
EAL:   probe driver: 8086:10fb net_ixgbe
[ -t KAFKA_TOPIC ] defined as pcap-nick-v2
[ -c KAFKA_CONFIG ] defined as /etc/fastcapa.ycluster
[ -b BURST_SIZE ] defined as 128
[ -x TX_RING_SIZE ] defined as 262144
[ -p PORT_MASK ] undefined; defaulting to 0x01
[ -r NB_RX_DESC ] undefined; defaulting to 1024
[ -q NB_RX_QUEUE ] undefined; defaulting to 2
USER1: Initializing port 0
USER1: Device setup successfully; port=0, mac=90 e2 ba d9 3c f9
USER1: Launching receive worker; worker=0, core=9, queue=0
USER1: Receive worker started; core=9, socket=1, queue=0 attempts=0
USER1: Launching receive worker; worker=1, core=10, queue=1
USER1: Launching transmit worker; worker=0, core=11 ring=0
USER1: Receive worker started; core=10, socket=1, queue=1 attempts=0
USER1: Transmit worker started; core=11, socket=1
USER1: Launching transmit worker; worker=1, core=12 ring=1
USER1: Transmit worker started; core=12, socket=1
USER1: Launching transmit worker; worker=2, core=13 ring=0
USER1: Transmit worker started; core=13, socket=1
USER1: Launching transmit worker; worker=3, core=14 ring=1
USER1: Transmit worker started; core=14, socket=1
USER1: Launching transmit worker; worker=4, core=15 ring=0
USER1: Transmit worker started; core=15, socket=1
USER1: Starting to monitor workers; core=8, socket=1


      ----- in -----  --- queued --- ----- out ----- ---- drops ----
[nic]               0               -               -               -
[rx]                0               -               0               0
[tx]                0               -               0               0
[kaf]               0               0               0               0

Validate Offsets in Kafka

[root@y137 ~]# kafka-run-class.sh \
>     kafka.tools.GetOffsetShell \
>     --broker-list y135:6667 \
>     --topic pcap-nick-v2 \
>     --security-protocol PLAINTEXTSASL \
>     --time -1 | \
>     grep pcap | \
>     awk -F: '{p+=$3} END {print p}'
[2017-05-10 18:56:43,754] WARN TGT renewal thread has been interrupted and will exit. (org.apache.kafka.common.security.kerberos.KerberosLogin)
0

Start Packet Replay

[root@y137 ~]# time tcpreplay -i enp129s0f1 --loop=0 --stats=15 --preload-pcap --mbps 1100 example.pcap
File Cache is enabled
Actual: 3166583 packets (2062501418 bytes) sent in 15.00 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211105.37 pps
Actual: 6333143 packets (4125002477 bytes) sent in 30.00 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211104.63 pps

Wait for 15 minutes

Stop Fastcapa. (In hindsight, tcpreplay should have been stopped first.)

      ----- in -----  --- queued --- ----- out ----- ---- drops ----
[nic]       188713365               -               -               -
[rx]        188713413               -       188713413               0
[tx]        188713419               -       188713418               1
[kaf]       188713427          635436       188503813               0
^CUSER1: Exiting on signal '2'
USER1: Finished monitoring workers; core=8, socket=1
USER1: Transmit worker finished; core=14, socket=1
USER1: Transmit worker finished; core=11, socket=1
USER1: Transmit worker finished; core=15, socket=1
USER1: Receive worker finished; core=10, socket=1, queue=1
USER1: Transmit worker finished; core=13, socket=1
USER1: Transmit worker finished; core=12, socket=1
USER1: Receive worker finished; core=9, socket=1, queue=0
USER1: Closing all Kafka connections
USER1: '150235' message(s) queued on fastcapa-y138-enp129s0f1#producer-1
USER1: '152215' message(s) queued on fastcapa-y138-enp129s0f1#producer-2
USER1: '135425' message(s) queued on fastcapa-y138-enp129s0f1#producer-3
USER1: '135012' message(s) queued on fastcapa-y138-enp129s0f1#producer-4
USER1: '140736' message(s) queued on fastcapa-y138-enp129s0f1#producer-5
USER1: Waiting for '150235' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '19162' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '15088' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '11804' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '9699' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '7938' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '6170' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '4687' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '3491' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '2531' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '1667' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '838' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '209' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '209' message(s) on fastcapa-y138-enp129s0f1#producer-1
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '152227' message(s) on fastcapa-y138-enp129s0f1#producer-2
USER1: Waiting for '137' message(s) on fastcapa-y138-enp129s0f1#producer-2
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-2
USER1: Waiting for '135437' message(s) on fastcapa-y138-enp129s0f1#producer-3
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-3
USER1: Waiting for '135026' message(s) on fastcapa-y138-enp129s0f1#producer-4
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-4
USER1: Waiting for '140748' message(s) on fastcapa-y138-enp129s0f1#producer-5
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-5

Then I quickly stopped tcpreplay.

Actual: 186828128 packets (121687547326 bytes) sent in 885.00 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211105.14 pps
^C User interrupt...
sendpacket_abort
Actual: 189510624 packets (123434738852 bytes) sent in 897.07 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211105.15 pps
Flows: 68 flows, 0.07 fps, 189511069 flow packets, 0 non-flow
Statistics for network device: enp129s0f1
	Successful packets:        189510624
	Failed packets:            0
	Truncated packets:         0
	Retried packets (ENOBUFS): 0
	Retried packets (EAGAIN):  0

real	14m59.210s
user	10m27.316s
sys	4m30.148s

Then count the number of records in Kafka based on the offset.

[root@y137 ~]# kafka-run-class.sh kafka.tools.GetOffsetShell \
	--broker-list y135:6667 \
	--topic pcap-nick-v2 \
	--security-protocol PLAINTEXTSASL \
	--time -1 | grep pcap | awk -F: '{p+=$3} END {print p}'
[2017-05-10 19:23:34,400] WARN TGT renewal thread has been interrupted and will exit. (org.apache.kafka.common.security.kerberos.KerberosLogin)
71329474
[root@y137 ~]# kafka-run-class.sh kafka.tools.GetOffsetShell \
	--broker-list y135:6667 \
	--topic pcap-nick-v2 \
	--security-protocol PLAINTEXTSASL \
	--time -1 | grep pcap
pcap-nick-v2:8:5921357
pcap-nick-v2:2:5897177
pcap-nick-v2:11:5951553
pcap-nick-v2:5:5937898
pcap-nick-v2:4:5971451
pcap-nick-v2:7:5912931
pcap-nick-v2:1:5906233
pcap-nick-v2:10:5978660
pcap-nick-v2:9:6001514
pcap-nick-v2:3:5929275
pcap-nick-v2:6:6011165
pcap-nick-v2:0:5910260
[2017-05-10 19:23:42,759] WARN TGT renewal thread has been interrupted and will exit. (org.apache.kafka.common.security.kerberos.KerberosLogin)
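
The `grep pcap | awk -F: '{p+=$3} END {print p}'` pipeline adds up the third colon-separated field (the latest offset) of every partition line. As a cross-check, the same sum can be sketched in Python over the per-partition output shown above. The function name `total_offset` is made up for this illustration.

```python
# Per-partition output of GetOffsetShell, copied from the transcript above.
# Each line has the form topic:partition:offset.
offset_lines = """\
pcap-nick-v2:8:5921357
pcap-nick-v2:2:5897177
pcap-nick-v2:11:5951553
pcap-nick-v2:5:5937898
pcap-nick-v2:4:5971451
pcap-nick-v2:7:5912931
pcap-nick-v2:1:5906233
pcap-nick-v2:10:5978660
pcap-nick-v2:9:6001514
pcap-nick-v2:3:5929275
pcap-nick-v2:6:6011165
pcap-nick-v2:0:5910260
[2017-05-10 19:23:42,759] WARN TGT renewal thread has been interrupted and will exit.
""".splitlines()


def total_offset(lines, needle="pcap"):
    """Sum the offset field of every line that mentions the topic."""
    total = 0
    for line in lines:
        if needle not in line:
            continue  # plays the role of `grep pcap`: skips the Kerberos warning
        # Split from the right so only the last two colons are treated as separators.
        _topic, _partition, offset = line.strip().rsplit(":", 2)
        total += int(offset)
    return total


print(total_offset(offset_lines))  # 71329474
```

The result matches the total printed by the first command.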

Results

Tcpreplay Says... 189,510,624
Fastcapa Says... 189,348,863
Kafka Says... 71,329,474
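
To put the gap in perspective, here is a small sketch that works out how many packets went missing at each hop, using only the three counts above. The helper name `shortfall` is made up for this illustration.

```python
# Packet counts reported in the Test1 results above.
tcpreplay_sent = 189_510_624
fastcapa_count = 189_348_863
kafka_count = 71_329_474


def shortfall(expected, actual):
    """Return (missing packets, missing fraction of expected)."""
    missing = expected - actual
    return missing, missing / expected


missing_fastcapa, frac_fastcapa = shortfall(tcpreplay_sent, fastcapa_count)
missing_kafka, frac_kafka = shortfall(tcpreplay_sent, kafka_count)

print(f"tcpreplay -> Fastcapa: {missing_fastcapa:,} missing ({frac_fastcapa:.2%})")
print(f"tcpreplay -> Kafka:    {missing_kafka:,} missing ({frac_kafka:.2%})")
```

The Fastcapa gap is small and is consistent with Fastcapa having been stopped before tcpreplay. The Kafka gap is more than half of the traffic, so the loss sits between Fastcapa and the broker.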

Root Cause:

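The max.message.bytes on the broker was set to 1 MB, while the corresponding limit in the Fastcapa configuration, /etc/fastcapa.ycluster, was set much higher. Fastcapa could therefore produce messages that exceeded the broker's limit. I set the Fastcapa value back to its default of 1 MB, which seems to fix the issue.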

The problem went unreported because Fastcapa's message delivery callback assumed that every callback indicated success. That is not the case: the callback has to check the err field, which indicates a permanent delivery error. Once this check was added, Fastcapa started to report that the messages were too large.
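
The shape of that fix can be sketched as follows. This is not Fastcapa's code (Fastcapa is written in C). It is a minimal Python illustration that assumes a delivery callback receiving `(err, msg)`, the convention used by Python Kafka clients such as confluent-kafka. The class name `DeliveryStats` and the error text are made up for this example.

```python
from collections import Counter


class DeliveryStats:
    """Counts delivery outcomes instead of assuming every callback means success."""

    def __init__(self):
        self.delivered = 0
        self.failed = 0
        self.errors = Counter()

    def on_delivery(self, err, msg):
        # The bug described above amounts to skipping this check and
        # counting every callback as a successful delivery.
        if err is not None:
            self.failed += 1
            self.errors[str(err)] += 1
            return
        self.delivered += 1


stats = DeliveryStats()
stats.on_delivery(None, b"packet-1")                 # successful delivery
stats.on_delivery("message too large", b"packet-2")  # hypothetical error text
print(stats.delivered, stats.failed, dict(stats.errors))
```

With a counter like this, the failed deliveries would show up as drops rather than being folded into the delivered count.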

Test2

Offset before the test.

[root@y137 ~]# kafka-run-class.sh \
>     kafka.tools.GetOffsetShell \
>     --broker-list y135:6667 \
>     --topic pcap-nick-v2 \
>     --security-protocol PLAINTEXTSASL \
>     --time -1 | \
>     grep pcap | \
>     awk -F: '{p+=$3} END {print p}'
[2017-05-10 20:35:14,072] WARN TGT renewal thread has been interrupted and will exit. (org.apache.kafka.common.security.kerberos.KerberosLogin)
91840723

Start Fastcapa

[root@y138 fastcapa]# fastcapa -l 8-15 --huge-dir /mnt/huge_1GB -- -t pcap-nick-v2 -c /etc/fastcapa.ycluster -b 128 -x 262144 -s fastcapa-kafka.log
EAL: Detected 32 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:01:00.1 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 net_enic
EAL: PCI device 0000:0a:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 net_enic
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:81:00.1 on NUMA socket 1
EAL:   probe driver: 8086:10fb net_ixgbe
[ -t KAFKA_TOPIC ] defined as pcap-nick-v2
[ -c KAFKA_CONFIG ] defined as /etc/fastcapa.ycluster
[ -b BURST_SIZE ] defined as 128
[ -x TX_RING_SIZE ] defined as 262144
[ -s KAFKA_STATS ] defined as fastcapa-kafka.log
[ -p PORT_MASK ] undefined; defaulting to 0x01
[ -r NB_RX_DESC ] undefined; defaulting to 1024
[ -q NB_RX_QUEUE ] undefined; defaulting to 2
USER1: Initializing port 0
USER1: Device setup successfully; port=0, mac=90 e2 ba d9 3c f9
USER1: Appending Kafka client stats to 'fastcapa-kafka.log'
USER1: Launching receive worker; worker=0, core=9, queue=0
USER1: Receive worker started; core=9, socket=1, queue=0 attempts=0
USER1: Launching receive worker; worker=1, core=10, queue=1
USER1: Receive worker started; core=10, socket=1, queue=1 attempts=0
USER1: Launching transmit worker; worker=0, core=11 ring=0
USER1: Transmit worker started; core=11, socket=1
USER1: Launching transmit worker; worker=1, core=12 ring=1
USER1: Transmit worker started; core=12, socket=1
USER1: Launching transmit worker; worker=2, core=13 ring=0
USER1: Transmit worker started; core=13, socket=1
USER1: Launching transmit worker; worker=3, core=14 ring=1
USER1: Transmit worker started; core=14, socket=1
USER1: Launching transmit worker; worker=4, core=15 ring=0
USER1: Transmit worker started; core=15, socket=1
USER1: Starting to monitor workers; core=8, socket=1
...

Start Tcpreplay

[root@y137 ~]# time tcpreplay -i enp129s0f1 --loop=0 --stats=15 --preload-pcap --mbps 1100 example.pcap
File Cache is enabled
Actual: 3166583 packets (2062501418 bytes) sent in 15.00 seconds.
Rated: 137500000.0 Bps, 1100.00 Mbps, 211105.39 pps
...

Wait a few minutes.

Stop Tcpreplay.

Actual: 56998382 packets (37125015637 bytes) sent in 270.00 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211105.02 pps
^C User interrupt...
sendpacket_abort
Actual: 59084143 packets (38483510428 bytes) sent in 279.08 seconds.
Rated: 137499900.0 Bps, 1099.99 Mbps, 211105.12 pps
Flows: 68 flows, 0.24 fps, 59084801 flow packets, 0 non-flow
Statistics for network device: enp129s0f1
	Successful packets:        59084143
	Failed packets:            0
	Truncated packets:         0
	Retried packets (ENOBUFS): 0
	Retried packets (EAGAIN):  0

real	4m41.411s
user	3m6.372s
sys	1m33.487s

Stop Fastcapa

      ----- in -----  --- queued --- ----- out ----- ---- drops ----
[nic]        59084143               -               -               -
[rx]         59084143               -        59084143               0
[tx]         59084143               -        59084143               0
[kaf]        59084143          501933        59034602               0


      ----- in -----  --- queued --- ----- out ----- ---- drops ----
[nic]        59084143               -               -               -
[rx]         59084143               -        59084143               0
[tx]         59084143               -        59084143               0
[kaf]        59084143               0        59084143               0
^CUSER1: Exiting on signal '2'
USER1: Finished monitoring workers; core=8, socket=1
USER1: Transmit worker finished; core=12, socket=1
USER1: Transmit worker finished; core=14, socket=1
USER1: Transmit worker finished; core=11, socket=1
USER1: Receive worker finished; core=10, socket=1, queue=1
USER1: Transmit worker finished; core=15, socket=1
USER1: Transmit worker finished; core=13, socket=1
USER1: Receive worker finished; core=9, socket=1, queue=0
USER1: Closing all Kafka connections
USER1: '0' message(s) queued on fastcapa-y138-enp129s0f1#producer-1
USER1: '1' message(s) queued on fastcapa-y138-enp129s0f1#producer-2
USER1: '0' message(s) queued on fastcapa-y138-enp129s0f1#producer-3
USER1: '0' message(s) queued on fastcapa-y138-enp129s0f1#producer-4
USER1: '0' message(s) queued on fastcapa-y138-enp129s0f1#producer-5
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-1
USER1: Waiting for '1' message(s) on fastcapa-y138-enp129s0f1#producer-2
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-2
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-3
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-4
USER1: All messages cleared on fastcapa-y138-enp129s0f1#producer-5

Check the latest offset in Kafka

[root@y137 ~]# kafka-run-class.sh \
>     kafka.tools.GetOffsetShell \
>     --broker-list y135:6667 \
>     --topic pcap-nick-v2 \
>     --security-protocol PLAINTEXTSASL \
>     --time -1 | \
>     grep pcap | \
>     awk -F: '{p+=$3} END {print p}'
[2017-05-10 20:47:47,718] WARN TGT renewal thread has been interrupted and will exit. (org.apache.kafka.common.security.kerberos.KerberosLogin)
150924866

Results

Pkt Count
Tcpreplay Says... 59,084,143
Fastcapa Says... 59,084,143
Kafka Says... 59,084,143
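
The Kafka figure is not printed directly by any command. It is the difference between the summed offsets taken after and before the test. A short sketch of that arithmetic, using only numbers from the transcripts above:

```python
# Summed latest offsets across all partitions, from the two GetOffsetShell runs.
offset_before = 91_840_723
offset_after = 150_924_866

# Counts reported by the other two tools in Test2.
tcpreplay_sent = 59_084_143
fastcapa_out = 59_084_143

# Messages added to the topic during the test.
kafka_received = offset_after - offset_before

print(kafka_received)                                    # 59084143
print(kafka_received == tcpreplay_sent == fastcapa_out)  # True
```

All three counts agree, so no packets were lost between tcpreplay, Fastcapa, and Kafka in this run.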