@davidsnyder
Last active May 2, 2016 14:16
Kafka Troubleshooting
You can use the console consumer script to verify that data sent to the HTTP listener (http_producer) on port 80 is reaching the Kafka broker node.
sudo /usr/local/share/kafka/bin/kafka-console-consumer.sh --zookeeper [zk-private-ip]:2181 --topic [topic] --from-beginning
If you curl records at the Kafka HTTP listener, you should see them come out of the consumer on the Kafka broker node:
(On the Kafka listener node)
curl -XGET localhost:80 -d '{"name":"Bill","position":"Sales"}'
Note that we didn't specify a topic to write to. The topic that the http_producer writes to is specified in the cluster definition. In this case, look for the listener facet of the Kafka cluster declaration in p1.rb.
app :http_producer do
  type 'http_producer'
  topic 'raw'
  run_state :stop
end
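As a quick smoke test, you can send a batch of records to the listener and watch them arrive in the consumer. This is a sketch: `localhost:80` assumes you run it on the listener node itself, and the `user-$i` payloads are made up for illustration.

```shell
# Send 10 test records to the http_producer listener.
# Per the cluster definition above, they should all land on topic 'raw'.
for i in $(seq 1 10); do
  curl -s -XGET localhost:80 -d "{\"name\":\"user-$i\",\"position\":\"Sales\"}"
done
```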
You can also see the topic by viewing the options passed to the http_producer process that is running:
ps auxf | grep kafka.producer.http.topic
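To pull just the topic setting out of the producer's JVM arguments, you can tighten the grep. A sketch; the property name comes from the command above, and the exact form it takes on the command line may vary between builds.

```shell
# Extract only the kafka.producer.http.topic property from the running
# producer's command line, e.g. "kafka.producer.http.topic=raw".
ps auxww | grep -o 'kafka\.producer\.http\.topic=[^ ]*' | head -1
```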
(On the Kafka broker running the consumer client pointed at topic 'raw', you should see the record appear:)
{"headers":{"Cookie":"COOKKEY=c6791876e9f43bd6bf5413fb1f0979fc;Expires=Wed, 27-Nov-13 22:50:40 GMT;Path=COOKPATH;Domain=COOKDOM","Host":"localhost","Content-Length":"34","User-Agent":"curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3","Content-Type":"application/x-www-form-urlencoded","Accept":"*/*"},"body":"{\"name\":\"Bill\",\"position\":\"Sales\"}","ip_address":"/127.0.0.1:57499","method":"GET","tsm":1385506240909,"uri":"/"}
Data written to the broker is stored on disk in /data/kfk:
du --all --bytes /data/kfk/kafka/journal/
94 /data/kfk/kafka/journal/test-0/00000000000000000000.kafka
133 /data/kfk/kafka/journal/test-0
1887315 /data/kfk/kafka/journal/raw-0/00000000000000000000.kafka
1887354 /data/kfk/kafka/journal/raw-0
1887518 /data/kfk/kafka/journal/
You can see there's a folder for each of the topics 'test' and 'raw'. As you curl records to the listener node, you should see the size of the journal increase.
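One way to confirm this end to end is to compare the journal size before and after sending a record. A sketch: the path and topic 'raw' follow the layout shown above, `localhost:80` assumes the listener is reachable from the same host as the journal, and the flush delay is a guess.

```shell
# Record the raw-0 journal size, send one record, and see how much it grew.
before=$(du --bytes --summarize /data/kfk/kafka/journal/raw-0 | cut -f1)
curl -s -XGET localhost:80 -d '{"name":"Bill","position":"Sales"}'
sleep 2   # allow the broker a moment to flush the message to disk
after=$(du --bytes --summarize /data/kfk/kafka/journal/raw-0 | cut -f1)
echo "journal grew by $((after - before)) bytes"
```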
ps auxf | less
You will see a number of processes running under supervision by runsv:
root 21844 0.0 0.0 164 4 ? Ss Nov25 0:00 \_ runsv kafka_http_producer-0
root 21845 0.0 0.0 184 48 ? S Nov25 0:00 \_ svlogd -tt /var/log/kafka-contrib/http_producer-0
root 22053 0.2 1.2 3125460 91664 ? Sl Nov25 3:03 \_ /usr/bin/java -server -Dkafka.producer.http.properties=....
Stopped processes will have only a runsv process and a child svlogd log-rotation process running. In this case the producer has been started, so you can also see the command line used to launch the service (process 22053).
The scripts used to start and stop processes are stored in /etc/sv.
In the case of kafka_http_producer-0, the run script that generated process 22053 above is found at /etc/sv/kafka_http_producer-0/run
You can start, stop, or check the status of this process by running `sudo service kafka_http_producer-0 [start|stop|status]`
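runit's own `sv` tool reads the same /etc/sv service directory and reports the supervised pid, which you can script against. A sketch; it assumes `sv` accepts the full service-directory path and that the status line follows runit's usual "run: ...: (pid N) ..." format.

```shell
# Ask runit for the producer's status and pull out the supervised pid.
# A stopped service prints "down: ..." with no pid; head -1 skips the
# log service's pid, which sv reports on the same line.
status_line=$(sudo sv status /etc/sv/kafka_http_producer-0)
pid=$(echo "$status_line" | grep -o 'pid [0-9]*' | head -1 | cut -d' ' -f2)
echo "http_producer pid: ${pid:-not running}"
```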