
Christopher Matta (cjmatta)

@cjmatta
cjmatta / ansible-benchmarks.md
Last active June 26, 2023 19:35
A method for orchestrating distributed Kafka benchmarks using Ansible


Kafka benchmarks are typically run with a single producer and a single consumer against one topic, each pushed to close to its maximum write/read speed. In the real world, a Kafka cluster more often serves many lower-throughput producers and consumers. Ansible enables a benchmarking method that sets up any number of topics and runs many producers and consumers against them.

Ansible playbooks allow us to run a number of tasks against a distributed set of clients both synchronously and asynchronously.
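As a sketch of the asynchronous pattern, a play along these lines starts a producer on every load-generation host and returns immediately (the loadgen group, perf-test flags, and bootstrap_servers variable are illustrative assumptions, not taken from the gist):

- name: Run producers
  hosts: loadgen            # hypothetical inventory group of client machines
  tasks:
    - name: start kafka-producer-perf-test in the background
      command: >
        kafka-producer-perf-test --topic bench-topic
        --num-records 1000000 --record-size 100 --throughput 1000
        --producer-props bootstrap.servers={{ bootstrap_servers }}
      async: 3600           # allow up to an hour of runtime
      poll: 0               # fire and forget
      register: producer_job

Because poll is 0, Ansible launches the command and moves on, so many producers can be started in parallel and checked later with the async_status module.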

Topic setup

Before we can run tests we need topics to test against. This play sets up a number of topics with various partition configurations:

- name: Setup
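  # (Preview truncated above; a hedged sketch of how the play might
  # continue. The hosts pattern, topic names, partition counts, and
  # kafka-topics flags are assumptions, not the gist's actual tasks.)
  hosts: broker[0]
  tasks:
    - name: create test topics with varying partition counts
      command: >
        kafka-topics --create --topic bench-{{ item }}
        --partitions {{ item }} --replication-factor 3
        --bootstrap-server localhost:9092
      with_items: [1, 8, 16]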
@cjmatta
cjmatta / shellinabox-playbook.yml
Created August 9, 2018 14:05
Ansible playbook to install shellinabox and secure it with a letsencrypt certificate
---
- hosts: all
  vars:
    certificate_email: your@email.com
    domain_name: my.domain.com
  become: yes
  tasks:
    - name: install certbot prereq
      yum:
        name: epel-release
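    # (Preview truncated above; the remaining tasks presumably install
    # certbot and shellinabox and request the certificate. A sketch under
    # that assumption, not the gist's actual tasks.)
    - name: install certbot and shellinabox
      yum:
        name:
          - certbot
          - shellinabox
        state: present
    - name: obtain a letsencrypt certificate
      command: >
        certbot certonly --standalone
        -d {{ domain_name }} -m {{ certificate_email }}
        --agree-tos --non-interactive
      args:
        creates: /etc/letsencrypt/live/{{ domain_name }}/fullchain.pem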
@cjmatta
cjmatta / provision_storage.yml
Created June 26, 2018 19:16
Ansible playbook to provision /var/lib/kafka
---
- hosts: broker
  tasks:
    - name: create filesystem
      filesystem:
        fstype: ext4
        dev: /dev/xvdf
    - name: create /var/lib/kafka directory
      file:
        path: /var/lib/kafka
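        state: directory
    # (Preview truncated above; the play presumably finishes by mounting
    # the device at the new directory. A sketch using the standard mount
    # module, not the gist's actual task.)
    - name: mount /dev/xvdf at /var/lib/kafka
      mount:
        path: /var/lib/kafka
        src: /dev/xvdf
        fstype: ext4
        state: mounted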
@cjmatta
cjmatta / create_ansible_inventory.py
Created June 26, 2018 18:13
Small jinja2-based template script for creating an inventory file for https://github.com/confluentinc/cp-ansible
#!/usr/bin/env python
# This script is meant to be used in conjunction with the JSON-formatted
# output of https://github.com/cjmatta/cp-poc-terraform
# Usage: terraform output -json | ./create_ansible_inventory.py
import json
from jinja2 import Template
import sys
template = Template("""all:
  vars:
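The template's opening lines suggest it renders a YAML inventory for cp-ansible; the rendered output presumably looks something like this (group names, hostnames, and vars are illustrative, based on cp-ansible conventions rather than the gist):

all:
  vars:
    ansible_user: centos
    ansible_ssh_private_key_file: ~/.ssh/id_rsa
broker:
  hosts:
    broker-0.example.com:
    broker-1.example.com: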
@cjmatta
cjmatta / README.md
Last active April 12, 2024 11:53
Secure Kafka Connect (SASL_SSL)
@cjmatta
cjmatta / sample_config.json
Last active July 28, 2017 16:39
Script to submit JSON connector config
{
  "name": "wikipedia-irc",
  "config": {
    "producer.interceptor.classes": "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor",
    "connector.class": "org.cmatta.kafka.connect.irc.IrcSourceConnector",
    "irc.server": "irc.wikimedia.org",
    "kafka.topic": "wikipedia.raw",
    "irc.channels": "#en.wikipedia,#en.wiktionary,#en.wikibooks,#en.wikinews,#es.wikipedia,#fr.wikipedia",
    "tasks.max": "2"
  }
}
@cjmatta
cjmatta / README.md
Last active August 22, 2019 12:43
Confluent Platform Docker

Complete Confluent Platform docker-compose.yml file. Includes an nginx configuration to load-balance between the rest-proxy and schema-registry components.

To run, make sure both docker-compose.yml and nginx_kafka.conf are in the same directory:

$ docker-compose up
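As an illustration of the layout (image tags, environment variables, and the nginx wiring here are assumptions; the gist contains the full file):

version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  schema-registry:
    image: confluentinc/cp-schema-registry
    depends_on: [kafka]
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:2181
  rest-proxy:
    image: confluentinc/cp-kafka-rest
    depends_on: [kafka]
    environment:
      KAFKA_REST_HOST_NAME: rest-proxy
      KAFKA_REST_ZOOKEEPER_CONNECT: zookeeper:2181
  nginx:
    image: nginx
    ports:
      - "8080:8080"
    volumes:
      # the load-balancing config shipped alongside the compose file
      - ./nginx_kafka.conf:/etc/nginx/conf.d/nginx_kafka.conf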
@cjmatta
cjmatta / schema.json
Created April 29, 2016 20:55
user log-synth schema
[
  {"name": "id", "class": "id"},
  {"name": "name", "class": "name", "type": "first_last"},
  {"name": "address", "class": "address"},
  {"name": "gender", "class": "string", "dist": {"MALE": 0.5, "FEMALE": 0.5, "OTHER": 0.02}},
  {"name": "zip", "class": "zip"},
  {"name": "ssn", "class": "ssn"}
]
@cjmatta
cjmatta / drillwrapper.sh
Last active February 4, 2016 15:22
A wrapper script for Drill's sqlline that asks for user/pass to avoid the password showing up in a process list.
#!/bin/bash
USERNAME=
PASSWORD=
DRILL_VER=drill-1.4.0
DRILL_LOC=/opt/mapr/drill
URL=jdbc:drill:
# per-invocation properties file, suffixed with this shell's PID
DPROP=~/prop$$
@cjmatta
cjmatta / TupleGenerator.java
Created November 17, 2015 18:33
TupleGenerator
package streamflow.spout.core;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BoltDeclarer;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import backtype.storm.utils.Utils;