
How to set up a Go / Golang WebServer with Kafka Log Aggregation to S3

This is a work in progress.

Install Kafka

Download the latest Kafka, which you can find at: http://kafka.apache.org/downloads.html

cd ~
curl -O http://apache.cs.utah.edu/incubator/kafka/kafka-0.7.2-incubating/kafka-0.7.2-incubating-src.tgz
tar -xvzf kafka-0.7.2-incubating-src.tgz
rm kafka-0.7.2-incubating-src.tgz
cd kafka-0.7.2-incubating-src

Kafka's setup requires javac, which only comes in the java-devel package and isn't installed by default, so run:

sudo yum install java-devel
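
To sanity-check the install, javac -version should now print the compiler version:

javac -version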

Let's build it and package it up for release:

cd ~/kafka-0.7.2-incubating-src
./sbt
> update
> package
> release-zip

Ctrl-C out of SBT, then copy the resulting release zip so we can work with it:

cp ~/kafka-0.7.2-incubating-src/core/dist/kafka-0.7.2.zip ~/
unzip ~/kafka-0.7.2.zip -d ~/kafka-0.7.2
rm ~/kafka-0.7.2.zip
sudo mkdir /usr/local/apache-kafka
sudo cp -rv ~/kafka-0.7.2 /usr/local/apache-kafka/
vim ~/.bash_profile

Add the following line to your .bash_profile:

export KAFKA_HOME=/usr/local/apache-kafka/kafka-0.7.2

Now source your .bash_profile so the new environment variable is available:

source ~/.bash_profile
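
To double-check, echoing the new variable should print the Kafka path:

echo $KAFKA_HOME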

Find a Kafka S3 Consumer

There seems to be a small handful of Kafka S3 consumers, which is a good thing. Just take a look at this GitHub search: https://github.com/search?q=kafka+s3

I ended up writing a custom one, because I couldn't get the others to work very easily.

It's located at https://github.com/crowdmob/kafka-s3-go-consumer and we'll be using it in production.

Install Go / Golang on EC2

Let's get Go!

cd ~
curl -O https://go.googlecode.com/files/go1.0.3.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf ~/go1.0.3.linux-amd64.tar.gz
rm ~/go1.0.3.linux-amd64.tar.gz
vim ~/.bash_profile

Of course, we have to add Go's bin directory to our PATH. Add this line to your .bash_profile:

export PATH=$PATH:/usr/local/go/bin

Once you source ~/.bash_profile, you should be able to run go at the command line to make sure everything was installed!
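
For example, go version should print the installed release (go1.0.3 here):

go version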

Install the Consumer

First we need git, so let's install it.

sudo yum install git

We can install the consumer simply with:

cd ~
git clone https://github.com/crowdmob/kafka-s3-go-consumer.git

Then configure its properties file:

cd ~
cp ~/kafka-s3-go-consumer/consumer.example.properties ~/consumer.properties
vim ~/consumer.properties

Edit the consumer properties file to suit your needs: in particular, your AWS credentials, the target S3 bucket, and the Kafka topic(s) to consume.
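
For reference, it will contain entries along these lines. These key names are illustrative only; use the actual keys from consumer.example.properties:

# Illustrative sketch only -- copy the real key names from consumer.example.properties
# AWS credentials and target bucket for the S3 uploads
accesskey=YOUR_AWS_ACCESS_KEY
secretkey=YOUR_AWS_SECRET_KEY
bucket=your-log-bucket
# Kafka topic(s) the consumer should read from
topics=events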

Optionally, you can put the consumer in /usr/local/, like so:

sudo mkdir /usr/local/crowdmob-kafka-consumer
sudo mv ~/kafka-s3-go-consumer /usr/local/crowdmob-kafka-consumer/

About Producing Kafka Events from Go

The most up-to-date Kafka library for Go seemed to be https://github.com/jedsmith/kafka, so we ended up forking it and updating it a bit further at https://github.com/crowdmob/kafka

In your Go file, simply add it to your imports, like this:

package main

import (
  "github.com/crowdmob/kafka"
)

func main() {
  // Connect to the broker at localhost:9092, publishing to partition 0 of the "events" topic
  broker := kafka.NewBrokerPublisher("localhost:9092", "events", 0)
  broker.Publish(kafka.NewMessage([]byte("testing 1 2 3")))
}

I put that in a file called ~/gotest/producer.go.

You'll also need to go get the Kafka Go library:

sudo /usr/local/go/bin/go get github.com/crowdmob/kafka

Okay! Now we're all set up, and all that's left is to turn on the services we just installed :)

Run Zookeeper, Kafka Server, and S3 Consumer

Let's run a simple Kafka server before we daemonize it. Of course, that requires first starting a ZooKeeper server:

~/kafka-0.7.2-incubating-src/bin/zookeeper-server-start.sh ~/kafka-0.7.2-incubating-src/config/zookeeper.properties

In another shell, you'll need to run the Kafka server as well:

~/kafka-0.7.2-incubating-src/bin/kafka-server-start.sh ~/kafka-0.7.2-incubating-src/config/server.properties
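
As a quick sanity check that the broker is listening, you can probe its default port (9092, the same address the Go producer uses), assuming you have netcat installed:

nc -z localhost 9092 && echo kafka is up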

In another shell, we can now start up the S3 consumer, using the properties file we created earlier:

go run ~/kafka-s3-go-consumer/consumer.go -c ~/consumer.properties

Actually Produce from the Go Script

Now it's time for the fun part! Let's write our first message to Kafka, and see if it ends up in our S3 bucket. Make sure you specify your AWS credentials and S3 bucket in ~/consumer.properties.

Also make sure ~/consumer.properties specifies which Kafka topic(s) to listen to, and that the topic matches the one you'll be producing to in Go ("events" in our example).

If you've verified the buckets, credentials, and Kafka topics, then you're ready to give the Go producer a whirl.

Simply:

cd ~/gotest
go run producer.go
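
Since the end goal is a Go webserver that logs through Kafka, here's a minimal sketch of how the same publisher can be wired into an HTTP handler. This is an illustration, not a tested part of this setup: the port, the request-line log format, and the response body are assumptions, and it only reuses the kafka calls from producer.go above.

package main

import (
  "fmt"
  "net/http"

  "github.com/crowdmob/kafka"
)

// Same broker address, topic, and partition as producer.go above.
var broker = kafka.NewBrokerPublisher("localhost:9092", "events", 0)

func handler(w http.ResponseWriter, r *http.Request) {
  // Publish one log line per request; the S3 consumer picks these up from Kafka.
  broker.Publish(kafka.NewMessage([]byte(r.Method + " " + r.URL.Path)))
  fmt.Fprintln(w, "logged")
}

func main() {
  http.HandleFunc("/", handler)
  http.ListenAndServe(":8080", nil) // port 8080 is an arbitrary choice
}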