@mumrah
mumrah / reviewers.py
Last active July 7, 2021 20:45
Simple Python3 script to help with the "Reviewers" line in Apache Kafka PRs. Must be run from within the Git repo.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from collections import defaultdict
import operator
import os
import re
def prompt_for_user():
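The preview stops at prompt_for_user(), so the script's actual logic isn't visible here. Purely as a hypothetical sketch of the kind of helper the description suggests (every name and behaviour below is an assumption, not the gist's code): harvest names from past "Reviewers:" trailers in the git log, then print a fresh trailer line in the same format.

# Rough sketch only -- not the gist's code.
import re
import subprocess

def collect_past_reviewers():
    # Scan commit messages for "Reviewers: Jane Doe <jane@example.org>, ..." trailers
    log = subprocess.run(["git", "log", "--format=%B"],
                         capture_output=True, text=True, check=True).stdout
    reviewers = set()
    for line in log.splitlines():
        m = re.match(r"\s*Reviewers:\s*(.+)", line)
        if m:
            reviewers.update(part.strip() for part in m.group(1).split(","))
    return sorted(reviewers)

if __name__ == "__main__":
    print("Previously seen reviewers:")
    for name in collect_past_reviewers():
        print("  " + name)
    chosen = input("Reviewers for this PR (comma separated): ")
    print("Reviewers: " + chosen)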
@thelabdude
thelabdude / solr_alluxio.md
Last active September 21, 2017 15:09
Notes for running Solr on Alluxio

Here are some tips for getting started with Alluxio as the filesystem for Solr indexes. I've tested with Alluxio 1.5.0 and Solr 6.6.0, but these instructions should work for other versions.

SOLR_TIP=<root directory where Solr is installed on your server>
ALLUXIO_HOME=<root directory where Alluxio is installed on your server>

Create an alluxio config directory to load into Solr's ZK with the following settings in solrconfig.xml:

   <directoryFactory name="DirectoryFactory"
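The preview cuts off after the opening tag. A sketch of how the full element might look, assuming the usual approach of pointing Solr's HdfsDirectoryFactory at an alluxio:// URI (the host placeholder, path, and cache setting below are assumptions, not values from the original notes; 19998 is Alluxio's default master port):

   <directoryFactory name="DirectoryFactory"
                     class="solr.HdfsDirectoryFactory">
     <str name="solr.hdfs.home">alluxio://<alluxio-master-host>:19998/solr</str>
     <bool name="solr.hdfs.blockcache.enabled">true</bool>
   </directoryFactory>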
@mumrah
mumrah / AddToMap.java
Created June 11, 2013 02:39
A Pig UDF that allows you to modify a map by adding additional key/value pairs
import java.io.IOException;
import java.util.Map;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
/**
* Simple UDF to allow modifying an existing map[] datum
*
* Usage:
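The Usage section of the Javadoc is cut off above. As a guess at how such a UDF would typically be invoked from Pig Latin (the jar name, package, argument order, and output alias are assumptions, not taken from the gist):

REGISTER my-udfs.jar;
DEFINE AddToMap com.example.pig.AddToMap();
-- add one extra key/value pair to an existing map field
B = FOREACH A GENERATE AddToMap(my_map, 'new_key', 'new_value') AS my_map_plus;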
@piscisaureus
piscisaureus / pr.md
Created August 13, 2012 16:12
Checkout github pull requests locally

Locate the section for your GitHub remote in the .git/config file. It looks like this:

[remote "origin"]
	fetch = +refs/heads/*:refs/remotes/origin/*
	url = git@github.com:joyent/node.git

Now add the line fetch = +refs/pull/*/head:refs/remotes/origin/pr/* to this section. Obviously, change the GitHub URL to match your project's URL. It ends up looking like this:
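(The preview cuts off at this point; reconstructing from the snippet above, with the joyent/node remote simply reused from the earlier example, the section would read:)

[remote "origin"]
	fetch = +refs/heads/*:refs/remotes/origin/*
	url = git@github.com:joyent/node.git
	fetch = +refs/pull/*/head:refs/remotes/origin/pr/*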

@mumrah
mumrah / kafka-rest.md
Created August 3, 2012 14:35
Kafka REST proposal

REST interface for Kafka

Taking inspiration from the projects page...

I think it would be really useful and pretty simple to add a REST interface to Kafka. I could see two possible routes (not mutually exclusive): using HTTP as a dumb transport layer, and using it with different media types for application-friendly consumption of messages (JSON, XML, etc.). The dumb transport would be useful for languages without first-class clients, and the content-type extension would be useful for writing web apps that are Kafka-enabled.
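Purely to illustrate the two routes (none of these paths or media types come from the proposal itself), the distinction might look like:

# Hypothetical shapes only -- not from the proposal:
GET /topics/my-topic/messages?offset=0&count=50
  Accept: application/octet-stream   -> raw message payloads ("dumb transport")
  Accept: application/json           -> messages decoded into a JSON array for web clients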

HTTP as a transport

Consuming data

@stonegao
stonegao / AkkaKafkaMailboxTest.scala
Created May 1, 2012 07:24 — forked from mardambey/AkkaKafkaMailboxTest.scala
Akka 2.0 actors with Kafka backed durable mailboxes.
import akka.actor.Actor
import akka.actor.ActorSystem
import akka.agent.Agent
import com.typesafe.config.ConfigFactory
import akka.event.Logging
import akka.actor.Props
import kafka.utils.Utils
import java.nio.ByteBuffer
@mumrah
mumrah / websocketserver.py
Created August 7, 2010 17:01
Simple WebSockets in Python
import time
import struct
import socket
import hashlib
import sys
from select import select
import re
import logging
from threading import Thread
import signal
from json import JSONEncoder
from pymongo.objectid import ObjectId
class MongoEncoder(JSONEncoder):
    # Render ObjectId values as ObjectId("...") strings; everything else
    # falls through to the stock encoder.
    def _iterencode(self, o, markers=None):
        if isinstance(o, ObjectId):
            return """ObjectId("%s")""" % str(o)
        else:
            return JSONEncoder._iterencode(self, o, markers)
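A note and usage sketch: overriding _iterencode hooks a private method of the older json module this 2010 gist targets; newer Pythons no longer call it, where the supported hook is JSONEncoder.default(), and recent pymongo releases expose ObjectId from bson.objectid. A rough modern equivalent (names mirror the gist but this is my sketch, not its code):

from json import JSONEncoder, dumps
from bson.objectid import ObjectId  # newer pymongo location

class MongoEncoder(JSONEncoder):
    def default(self, o):
        # Serialize ObjectId values as ObjectId("...") strings
        if isinstance(o, ObjectId):
            return 'ObjectId("%s")' % str(o)
        return JSONEncoder.default(self, o)

print(dumps({"_id": ObjectId()}, cls=MongoEncoder))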
@mumrah
mumrah / gist:477121
Created July 15, 2010 15:37
ObjectId dereferencing in JavaScript for MongoDB
var _deref = function (doc, field, col) {
  // Replace the stored ObjectId in doc[field] with the referenced document from col
  var oid = ObjectId(doc[field]);
  delete doc[field];
  doc[field] = db[col].findOne({_id:oid});
  return doc;
}
var deref = function(field, collection){
  // C-C-C-Closure!!
  return function(doc){
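The preview cuts off inside deref, but the comment suggests it simply closes over field and collection and applies _deref to each document passed in. A hypothetical mongo-shell usage, with made-up posts and users collections:

// dereference the "author" ObjectId on every post against the users collection
db.posts.find().forEach(function(doc){
  printjson(deref("author", "users")(doc));
});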